
Embedding

A dense numerical vector that represents a token, sentence, or document in a continuous semantic space.

Full Definition

An embedding is a fixed-length numerical vector that encodes semantic meaning, such that similar concepts sit geometrically close together in the vector space. Word embeddings (Word2Vec, GloVe) capture word-level semantics; sentence and document embeddings (from models like text-embedding-ada-002 or E5) represent longer spans of text. LLMs also use embeddings internally: each token's input embedding is transformed layer by layer from a surface-level lexical representation into a rich, context-dependent one. Embeddings are the foundation of semantic search, clustering, classification, and RAG systems: documents are encoded as vectors and retrieved by computing vector similarity (cosine or dot product) rather than by keyword matching.
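To make vector-similarity retrieval concrete, here is a minimal sketch of cosine-similarity ranking over a handful of toy embeddings. The vectors, dimensionality, and document names are invented purely for illustration:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings"; real models use hundreds or thousands of dims.
documents = {
    "cat grooming guide": np.array([0.9, 0.1, 0.0, 0.2]),
    "dog training tips":  np.array([0.8, 0.3, 0.1, 0.3]),
    "car engine manual":  np.array([0.1, 0.9, 0.8, 0.0]),
}
query = np.array([0.85, 0.15, 0.05, 0.25])  # pretend this embeds "pet care"

# Rank documents by geometric closeness to the query, not keyword overlap.
for name, vec in sorted(documents.items(),
                        key=lambda kv: cosine_similarity(query, kv[1]),
                        reverse=True):
    print(f"{cosine_similarity(query, vec):.3f}  {name}")
```

At scale, the same ranking is usually delegated to a vector database (see Related Terms) rather than computed with a linear scan.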

Examples

1. Using OpenAI's text-embedding-ada-002 to convert 10,000 product descriptions into 1536-dimensional vectors, then finding the 5 most similar products to a user query via cosine similarity.
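A rough sketch of how that pipeline might look, assuming the official OpenAI Python SDK (v1+) with OPENAI_API_KEY set in the environment; the product strings below are placeholders, not a real catalogue:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str]) -> np.ndarray:
    """Return one 1536-dimensional vector per input string."""
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([item.embedding for item in resp.data])

# In practice this list would hold all 10,000 descriptions, embedded once
# in batches and stored; these three stand in for the full catalogue.
descriptions = ["wireless ergonomic mouse", "mechanical keyboard", "usb-c hub"]
doc_matrix = embed(descriptions)                    # shape: (n_docs, 1536)
query_vec = embed(["comfortable pointing device"])[0]

# Cosine similarity reduces to a dot product after L2-normalising both sides.
doc_unit = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
scores = doc_unit @ (query_vec / np.linalg.norm(query_vec))

for i in np.argsort(scores)[::-1][:5]:              # top 5 matches
    print(f"{scores[i]:.3f}  {descriptions[i]}")
```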

2. Word embeddings capturing that 'Paris' − 'France' + 'Germany' ≈ 'Berlin' in vector arithmetic.
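One way to reproduce this analogy, assuming gensim and its downloadable pretrained GloVe vectors; whether 'berlin' lands exactly first can vary with the vector set chosen:

```python
import gensim.downloader as api

# Downloads pretrained GloVe vectors (tens of MB) on first use, then caches them.
vectors = api.load("glove-wiki-gigaword-50")

# vector("paris") - vector("france") + vector("germany") -> nearest words.
for word, score in vectors.most_similar(
        positive=["paris", "germany"], negative=["france"], topn=3):
    print(f"{score:.3f}  {word}")
# "berlin" should appear at or near the top of the list.
```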


Related Terms

Vector Database

A database optimised for storing and querying high-dimensional embedding vectors.


RAG (Retrieval-Augmented Generation)

Augmenting model responses by retrieving relevant documents from an external knowledge source.


Tokenisation

The process of splitting text into tokens using a vocabulary and encoding algorithm.
