Top-P (Nucleus Sampling)
A sampling strategy that restricts token selection to the smallest set of highest-probability tokens whose cumulative probability reaches a threshold p.
Full Definition
Top-p sampling, also called nucleus sampling, dynamically selects the vocabulary to sample from at each step: only the highest-probability tokens whose cumulative probability mass reaches the threshold p are considered. At top-p=0.9, the model samples exclusively from whichever tokens collectively account for 90% of the probability mass — a small set for high-confidence predictions, a larger set when the model is uncertain. This adapts the effective vocabulary size to context, unlike top-k, which always keeps a fixed number of tokens. Top-p and temperature address the same trade-off (diversity vs. focus) from different angles; it is generally recommended to tune one and leave the other at its default.
Examples
With top-p=0.95 on a factual prompt, the sampler might consider only 3–5 tokens; on an open-ended creative prompt it might consider 500+ tokens.
With top-p=0.1 when writing boilerplate code, the model is forced to pick only its highest-confidence continuations.
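The selection procedure described above can be sketched in a few lines. This is a minimal illustration using NumPy, not any particular library's implementation: sort tokens by probability, keep the smallest prefix whose cumulative mass reaches p, renormalize, and sample from that nucleus.

```python
import numpy as np

def top_p_sample(logits, p=0.9, rng=None):
    """Sample a token index using nucleus (top-p) sampling.

    Keeps the smallest set of highest-probability tokens whose
    cumulative mass reaches p, renormalizes, and samples from it.
    """
    rng = rng or np.random.default_rng()
    # Softmax (shifted by the max for numerical stability).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Sort token indices by descending probability.
    order = np.argsort(probs)[::-1]
    # Smallest prefix whose cumulative mass reaches p.
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1
    nucleus = order[:cutoff]
    # Renormalize within the nucleus and sample.
    return rng.choice(nucleus, p=probs[nucleus] / probs[nucleus].sum())
```

Note how the nucleus size adapts: a sharply peaked distribution yields a one-token nucleus, while a flat distribution keeps most of the vocabulary, matching the behavior described in the examples above.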
Apply this in your prompts
PromptITIN automatically uses techniques like Top-P (Nucleus Sampling) to build better prompts for you.