Author here. I kept writing the same glue code every time I needed to get
markdown, MkDocs output, or a JSON file into Meilisearch for search on a
docs or blog site. content-mill is what I extracted after the third time.
You write a YAML config that describes your sources and the shape of the
Meilisearch documents you want, and it handles extraction, templating
(handlebars-style with filters like truncate/slugify/strip_md), heading-level
chunking, and an atomic, zero-downtime index swap.
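
To give a rough idea of what heading-level chunking means, here is a minimal TypeScript sketch. It illustrates the general technique only and is not content-mill's actual code: the `Chunk` shape and the `chunkByHeading` name are made up for this example, and it only handles ATX (`#`) headings.

```typescript
// Hypothetical chunk shape for illustration; not content-mill's real schema.
interface Chunk {
  heading: string; // heading text, "" for text before the first heading
  level: number;   // 1-6 for # .. ######, 0 for the preamble
  body: string;    // text under the heading, trimmed
}

// Split a markdown string into one chunk per ATX heading.
// Fenced code blocks are tracked so a "# comment" inside a fence
// is not mistaken for a heading.
function chunkByHeading(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  const lines: string[] = [];
  let current: Chunk = { heading: "", level: 0, body: "" };
  let inFence = false;

  // Close the chunk being built and keep it unless it is completely empty.
  const flush = (): void => {
    current.body = lines.join("\n").trim();
    if (current.heading !== "" || current.body !== "") chunks.push(current);
    lines.length = 0;
  };

  for (const line of markdown.split(/\r?\n/)) {
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const m = inFence ? null : /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (m) {
      flush();
      current = { heading: m[2], level: m[1].length, body: "" };
    } else {
      lines.push(line);
    }
  }
  flush();
  return chunks;
}
```

Each chunk can then become its own search document, so a hit can point at a specific section instead of a whole page.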
Four source types right now: mkdocs, markdown-dir, json, html. The
templating layer means you're not locked into any particular schema —
your search index looks exactly the way your frontend expects.
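
As an illustration of the templating idea, here is a small TypeScript sketch of a `{{ field | filter arg }}` renderer. Everything in it is an assumption for the sake of the example: the pipe syntax, the `render` function, and these filter implementations are not taken from content-mill; only the filter names (truncate, slugify, strip_md) come from the description above.

```typescript
type Filter = (value: string, arg?: string) => string;

// Rough stand-ins for the three filters named above (hypothetical behavior).
const filters: Record<string, Filter> = {
  // Cut to at most `arg` characters (default 80) and mark the cut.
  truncate: (v, arg) => {
    const max = Number(arg ?? "80");
    return v.length > max ? v.slice(0, max).trimEnd() + "..." : v;
  },
  // Lowercase, collapse non-alphanumerics to "-", trim leading/trailing "-".
  slugify: (v) =>
    v.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, ""),
  // Very rough markdown stripping: inline code, links/images,
  // asterisk emphasis, and heading markers.
  strip_md: (v) =>
    v
      .replace(/`([^`]*)`/g, "$1")
      .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1")
      .replace(/\*{1,2}([^*]+)\*{1,2}/g, "$1")
      .replace(/^#{1,6}\s+/gm, ""),
};

// Replace each {{ field | filter arg | ... }} with the filtered field value.
function render(template: string, doc: Record<string, string>): string {
  return template.replace(/\{\{\s*([^}]+?)\s*\}\}/g, (_match, expr: string) => {
    const [field, ...pipes] = expr.split("|").map((s) => s.trim());
    let value = doc[field] ?? "";
    for (const pipe of pipes) {
      const [name, arg] = pipe.split(/\s+/);
      const fn = filters[name];
      if (!fn) throw new Error(`unknown filter: ${name}`);
      value = fn(value, arg);
    }
    return value;
  });
}

// Map a source record onto whatever field names the frontend expects.
const source = { title: "Getting Started!", body: "**Bold** and `code` here" };
const searchDoc = {
  id: render("{{ title | slugify }}", source),
  excerpt: render("{{ body | strip_md | truncate 10 }}", source),
};
```

The point of the design is that the output field names (`id`, `excerpt` here) are chosen by the config, not by the tool.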
It's v0.1, MIT, used in production on my own site. Happy to add source
types (AsciiDoc, RST, Notion export, framework-specific formats) if
there's interest — drop a comment or open an issue.
Install: npm install @centrali-io/content-mill
Docs: https://github.com/blueinit/content-mill
Dev.to: https://dev.to/itsmarydan/stop-writing-custom-scrapers-index...