A way of representing text (or other data) as lists of numbers that capture meaning, enabling similarity search and semantic operations.
Embeddings turn text into long lists of numbers (vectors) that capture meaning. Words and sentences with similar meanings end up with similar number patterns. This lets computers compare meaning mathematically — finding related documents even when they don't share exact words. 'Dog' and 'puppy' have similar embeddings even though they're different words.
Imagine you're organizing books in a library where shelf position indicates topic. Two books on dogs would be right next to each other; a dog book and a car book would be far apart. Embeddings work similarly, but in thousands of dimensions — every piece of text gets a location in 'meaning space,' and nearby locations mean similar meanings.
Embeddings are dense vector representations of text (typically 384-3072 dimensions) produced by neural networks trained on large text corpora. Semantic similarity is measured with cosine similarity or dot product between vectors. Modern embedding models such as OpenAI's text-embedding-3, Cohere's embed-v3, or open-source sentence-transformers models are trained via contrastive learning on sentence pairs, which teaches them to produce similar vectors for semantically similar text. Embeddings enable semantic search, clustering, and classification, and are foundational to RAG systems.
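The similarity computation itself is just a few lines. In this sketch the vectors are tiny hand-made stand-ins for real model output (production embeddings have hundreds or thousands of dimensions and come from an embedding model):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: near 1.0 = same direction (similar meaning),
    near 0.0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional vectors standing in for real embeddings.
dog   = np.array([0.90, 0.80, 0.10, 0.00])
puppy = np.array([0.85, 0.90, 0.15, 0.05])
car   = np.array([0.05, 0.10, 0.90, 0.80])

print(cosine_similarity(dog, puppy))  # close to 1: similar meaning
print(cosine_similarity(dog, car))    # close to 0: unrelated
```

Dot product gives the same ranking as cosine similarity when the vectors are normalized to unit length, which is why many vector databases store normalized embeddings and use the cheaper dot product.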
Searching for 'cheap tropical vacation' returns results about 'affordable Caribbean trips' even without word overlap.
Netflix and Spotify use embeddings to find content similar to what you already like.
Your question's embedding is compared to document embeddings to find the most relevant sources.
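That retrieval step can be sketched with plain numpy. The document texts, their vectors, and the query vector below are all hypothetical toys; in a real RAG system both sides would be embedded by the same embedding model and the search would run in a vector database:

```python
import numpy as np

# Hypothetical documents with pre-computed embeddings (one row each),
# produced at indexing time. Real vectors would be 384-3072 dims.
docs = ["refund policy", "shipping times", "password reset"]
doc_vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.0, 0.1, 0.9],
])

# The user's question, embedded with the same model at query time
# (toy vector standing in for "how do I reset my login?").
query_vec = np.array([0.1, 0.2, 0.8])

# Normalize rows so dot product equals cosine similarity, then rank.
doc_norm = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
q_norm = query_vec / np.linalg.norm(query_vec)
scores = doc_norm @ q_norm
ranked = [docs[i] for i in np.argsort(scores)[::-1]]
print(ranked[0])  # "password reset" scores highest
```

The top-ranked documents are then passed to the LLM as context alongside the question.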
Group thousands of customer support tickets by topic without labels — embeddings reveal natural groupings.
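A minimal sketch of that idea, using a tiny hand-rolled k-means over toy 2-dimensional "ticket embeddings" (hand-made vectors standing in for real model output; a production system would use a library clusterer and real embeddings):

```python
import numpy as np

def kmeans(X: np.ndarray, k: int, iters: int = 50) -> np.ndarray:
    """Minimal k-means: assign each vector to its nearest center,
    recompute centers, repeat. Returns a cluster label per row."""
    # Naive deterministic init: evenly spaced sample rows, copied so
    # updating centers does not overwrite X.
    centers = X[::len(X) // k][:k].copy()
    for _ in range(iters):
        dists = ((X[:, None] - centers[None]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy embeddings: first three tickets about billing, last three about
# logins (hypothetical 2-D vectors for illustration).
tickets = np.array([
    [0.90, 0.10], [0.85, 0.15], [0.95, 0.05],
    [0.10, 0.90], [0.15, 0.85], [0.05, 0.95],
])
labels = kmeans(tickets, k=2)
print(labels)  # two clear groups emerge, no manual labels needed
```

The clusters fall out of vector geometry alone: tickets about the same topic sit near each other in embedding space, so no labeled training data is required.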
Embedding models are specialized neural networks trained specifically to produce vector representations; they don't generate text. LLMs like GPT or Claude generate text. Many LLM providers also offer embedding APIs that derive vectors from a model's internal representations. For most applications, dedicated embedding models like text-embedding-3 or Cohere's embed models outperform embeddings extracted from general-purpose LLMs.
Modern embeddings range from 384 dimensions (small, fast models) to 3072 dimensions (OpenAI's text-embedding-3-large). Bigger isn't always better — 1024-1536 dimensions is a common sweet spot. The right size depends on your use case: search quality, storage costs, and inference speed all matter.
Yes. Modern embedding models like text-embedding-3 support 100+ languages and can even find similarities across languages. A Spanish sentence and its English translation will have very similar embeddings, enabling cross-lingual search. Quality varies — English still leads, but major languages are well-supported.
A technique that lets AI models look up information before answering, improving accuracy and reducing hallucinations.
📚 A neural network trained on massive text data to understand and generate human-like language.
✍️ The skill of writing instructions to AI models to get the best possible output.
⚙️ The neural network architecture behind modern AI, introduced by Google in 2017 and now powering ChatGPT, Claude, and most other LLMs.
Our free AI course teaches you to use these ideas in real projects.
Start Free AI Course →