A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via the CLI, REST API, or MCP server.
Extract text, metadata, and code intelligence from 90+ file formats and 300+ programming languages at native speed, with no GPU required.
Key Features
Code intelligence: Extract functions, classes, imports, symbols, and docstrings from 300+ programming languages via tree-sitter.
The project is written primarily in Rust and has reached 8,384 GitHub stars.
Why now: Sustained developer attention keeps it in the tracked pool; GitHub activity is the current lead signal.
Considerations: Solid adoption (8,384 stars) but quiet cross-source signal right now — established utility more than a current breakout.
EARLY MOMENTUM · Research: Adoption is real but cross-source confirmation is thin — a short hands-on trial (Rust) will tell you more than the metrics.
Sources: kreuzberg-dev/kreuzberg on GitHub · Project homepage
Methodology: synthesized from this project's own documentation, live GitHub data, third-party coverage, and multi-platform signal convergence — by AISO.tools.
git clone https://github.com/kreuzberg-dev/kreuzberg.git
Then follow the README in the cloned directory.