Glossary

What is an embedding?

An embedding is a numeric vector that represents the meaning of text, code, or an image, so similar items sit close together in vector space and can be compared mathematically.

An embedding model produces the vector by mapping an input (a sentence, a document, a snippet of code) to a fixed-length list of numbers. Inputs with similar meaning land near each other, which turns 'is this similar?' into a fast distance calculation such as cosine similarity. Embeddings are the foundation of semantic search, recommendations, clustering, and retrieval-augmented generation.
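To make the "distance calculation" concrete, here is a minimal sketch of ranking texts by cosine similarity. The three-dimensional vectors are hand-written placeholders, not the output of any real embedding model, and the names `cosine_similarity`, `most_similar`, and `TOY_EMBEDDINGS` are hypothetical. Real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, written by hand for illustration only.
TOY_EMBEDDINGS = {
    "how to reset a password": [0.9, 0.1, 0.0],
    "forgot my login credentials": [0.8, 0.2, 0.1],
    "best pizza in town": [0.0, 0.1, 0.9],
}


def most_similar(query_vector, embeddings):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [
        (text, cosine_similarity(query_vector, vector))
        for text, vector in embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# A query vector assumed to come from the same (imaginary) model.
ranked = most_similar([0.85, 0.15, 0.05], TOY_EMBEDDINGS)
for text, score in ranked:
    print(f"{score:.3f}  {text}")
```

The two account-related texts score close to 1 and the unrelated one scores near 0. A vector database does the same comparison at scale, usually with an approximate index rather than the full scan shown here.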

Open-source embedding models trade off dimensionality, speed, language coverage, and domain fit. The right choice depends on your data: a general model works for prose, while code or a specialised domain often benefits from a model trained for it.
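One part of the dimensionality trade-off can be estimated with simple arithmetic. The sketch below assumes vectors are stored as raw 32-bit floats with no index overhead or compression. The function name `index_size_bytes` and the dimension values are illustrative choices, not taken from any specific model.

```python
def index_size_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone (assumes float32 by default)."""
    return num_vectors * dimensions * bytes_per_value


# Compare raw storage for one million vectors at a few illustrative sizes.
for dims in (384, 768, 1536):
    size_gb = index_size_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims: {size_gb:.2f} GB")
```

Storage grows linearly with dimension under these assumptions, as does the cost of each distance calculation, so a smaller model can be the better fit when its retrieval quality is sufficient for your data.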


Trending embedding projects

VectifyAI/PageIndex
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
★ 32.1K · +69 in 24h · momentum 9 · Python

Tencent/WeKnora
Open-source LLM knowledge platform: turn raw documents into a queryable RAG, an autonomous reasoning agent, and a self-maintaining Wiki.
★ 15.5K · +11 in 24h · momentum 9 · Go

qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
★ 31.6K · 0 in 24h · momentum 4 · Rust

thedotmack/claude-mem
Persistent Context Across Sessions for Every Agent – Captures everything your agent does during sessions, compresses it with AI, and injects relevant context back into future sessions. Works with Claude Code, OpenClaw, Codex, Gemini, Hermes, Copilot, OpenCode + More
★ 78.2K · 0 in 24h · momentum 4 · TypeScript

onyx-dot-app/onyx
Open Source AI Platform - AI Chat with advanced features that works with every LLM
★ 29.8K · 0 in 24h · momentum 4 · Python

yichuan-w/LEANN
[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
★ 11.7K · +25 in 24h · momentum 3 · Python

memvid/memvid
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.
★ 15.6K · +6 in 24h · momentum 1 · Rust
