An embedding is a list of numbers (typically 768 to 3,072 dimensions) that represents the meaning of a piece of text. Two pieces of text with similar meaning produce vectors that are close together in that high-dimensional space.
You generate embeddings by passing text through a model trained for that purpose — OpenAI's text-embedding-3-large, Cohere's Embed v3, or open-source models like BGE and E5. Once you have vectors, you can do similarity search (find the most similar paragraph in a million-document corpus in milliseconds), clustering (group similar support tickets), and semantic deduplication (flag near-duplicate content).
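The core operation behind all three is comparing vectors, usually with cosine similarity. Here is a minimal sketch: the 4-dimensional vectors and document names are made up for illustration (real embeddings would come from a model call and have hundreds of dimensions), but the math is the same.

```python
import numpy as np

# Hypothetical pre-computed embeddings for a tiny corpus. In practice these
# would come from an embedding model API call; the values here are invented.
corpus = {
    "refund policy":       np.array([0.9, 0.1, 0.0, 0.1]),
    "shipping times":      np.array([0.1, 0.9, 0.1, 0.0]),
    "returning a product": np.array([0.6, 0.3, 0.3, 0.2]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(query_vec: np.ndarray, corpus: dict) -> str:
    """Return the corpus entry whose embedding is closest to the query."""
    return max(corpus, key=lambda name: cosine_similarity(query_vec, corpus[name]))

# A query like "how do I get my money back" would embed near "refund policy".
query = np.array([0.85, 0.15, 0.05, 0.1])
print(nearest(query, corpus))  # → refund policy
```

At production scale you would not scan every vector like this; a vector index (HNSW, IVF) in a vector database does the same comparison approximately, in milliseconds.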
For most production use cases, the embedding model matters less than people think — the bigger wins come from chunking strategy, hybrid search (combining vector + keyword), and re-ranking the top candidates with a cross-encoder.
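One common way to combine the two signals is a weighted sum of normalized scores (reciprocal rank fusion is another popular choice). This is a toy sketch with invented document IDs and scores, just to show the shape of the idea; the weight and the scores are assumptions, not tuned values.

```python
def hybrid_score(vector_score: float, keyword_score: float, alpha: float = 0.5) -> float:
    """Blend a dense (vector) score with a sparse (keyword/BM25) score.

    Both scores are assumed pre-normalized to [0, 1]; alpha weights the
    vector side. alpha=0.5 here is an illustrative default, not a tuned value.
    """
    return alpha * vector_score + (1 - alpha) * keyword_score

# Candidates as (doc_id, vector_score, keyword_score) — made-up numbers.
candidates = [
    ("doc_a", 0.92, 0.10),  # semantically close, no keyword overlap
    ("doc_b", 0.55, 0.95),  # exact keyword match, weaker semantics
    ("doc_c", 0.80, 0.60),  # decent on both signals
]

ranked = sorted(candidates, key=lambda c: hybrid_score(c[1], c[2]), reverse=True)
print([doc for doc, _, _ in ranked])  # → ['doc_b', 'doc_c', 'doc_a']
```

In a full pipeline, the top candidates from this blended ranking would then go to a cross-encoder re-ranker, which scores each (query, document) pair jointly and produces the final ordering.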
Bring this to your business
Knowing the term is one thing. Shipping it is another.
We do two-week AI Sprints — one term, one workflow, into production by Day 10.