VERBALIZED SAMPLING: HOW TO MITIGATE MODE COLLAPSE AND UNLOCK LLM DIVERSITY
Source: 00_Raw/2510.01171v3.pdf
Overview
The paper identifies typicality bias in human preference data as a primary cause of mode collapse in aligned LLMs. To mitigate this, it introduces Verbalized Sampling (VS), a prompting strategy that asks the model to explicitly verbalize a probability distribution over a set of candidate responses, rather than emit a single answer.
Key Concepts
Typicality Bias & Mode Collapse
- Typicality Bias: Human annotators tend to favor responses that are familiar, fluent, and predictable (typical), even if they are not the most useful or creative.
- Mathematical Model: The reward function $r(x, y)$ is modeled as a combination of true task utility $r_{true}$ and typicality bias:
$$r(x, y) = r_{true}(x, y) + \alpha \log \pi_{ref}(y | x) + \epsilon(x)$$ where $\alpha$ is the typicality weight.
- Mode Collapse: RLHF optimization sharpens the reference distribution $\pi_{ref}$ by a factor $\gamma = 1 + \alpha/\beta$, where $\beta$ is the KL coefficient. This compresses probability mass toward typical completions.
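A one-step derivation makes the sharpening explicit (standard KL-regularized RLHF algebra; the response-independent $\epsilon(x)$ term is absorbed by the normalizer). Substituting the biased reward into the closed-form optimal policy $\pi^*(y|x) \propto \pi_{ref}(y|x)\exp(r(x,y)/\beta)$ gives:
$$\pi^*(y|x) \propto \pi_{ref}(y|x)\, e^{\left(r_{true}(x,y) + \alpha \log \pi_{ref}(y|x)\right)/\beta} = \pi_{ref}(y|x)^{1 + \alpha/\beta}\, e^{r_{true}(x,y)/\beta}$$
Any $\alpha > 0$ therefore exponentiates $\pi_{ref}$ by $\gamma = 1 + \alpha/\beta > 1$, concentrating probability mass on typical completions even when $r_{true}$ is indifferent among them.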
Verbalized Sampling (VS) Primitives
Instead of asking for a single answer, VS reformulates the prompt to request $k$ responses together with their corresponding probabilities (a minimal code sketch follows the list):
- VS-Standard: Single-turn request for $k$ responses and probabilities.
- VS-CoT: Combines VS with Chain-of-Thought (e.g., "Think step-by-step, then tell 5 jokes with probabilities"). This variant often sits on the best quality-diversity Pareto frontier.
- VS-Multi: Elicits responses across multiple turns (e.g., "Tell 5 jokes... then tell 5 more"). This reduces cognitive burden and improves diversity in larger models.
- Mechanism: Verbalizing probabilities helps the model bypass the collapsed mode of the policy and access the broader distribution learned during pre-training.
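A minimal VS-Standard sketch, assuming an OpenAI-compatible Python client; the model name and prompt wording are illustrative and paraphrase the paper's template rather than quoting it:

```python
# VS-Standard sketch: one call returns k candidate responses with verbalized
# probabilities. Assumes the openai>=1.0 client; prompt wording and model
# name are illustrative, not the paper's verbatim template.
import json
from openai import OpenAI

client = OpenAI()

VS_STANDARD = (
    "Generate 5 responses to the input below. Return a JSON array of objects, "
    'each with a "text" field and a "probability" field giving your estimate '
    "of how likely that response is under your full distribution.\n\n"
    "Input: {task}"
)

def verbalized_sample(task: str, model: str = "gpt-4.1") -> list[dict]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": VS_STANDARD.format(task=task)}],
    )
    return json.loads(resp.choices[0].message.content)

# Example: each item looks like {"text": "...", "probability": 0.12}; keep all
# k responses, or sample one according to the verbalized probabilities.
# candidates = verbalized_sample("Tell me a joke about coffee.")
```

VS-CoT and VS-Multi reuse the same pattern: prepend a think-step-by-step instruction, or spread the request across turns ("now give 5 more"), respectively.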
Specific Treatments
P% Verbalization (Diversity Tuning)
The paper details a method for Diversity Tuning using probability thresholds:
- P% Constraint: Prompting the model to "sample from the tail distribution, where each response/word should be < p%" (see the sketch after this list).
- Findings: Lowering $p$ (e.g., from 1.0 down to 0.001) significantly increases output diversity. Lower thresholds (e.g., $p < 0.01$) can lead to empty outputs in constrained answer spaces (like Open-Ended QA) but are highly effective for creative writing.
- Significance: This provides a practical, inference-time mechanism for fine-grained diversity control without altering decoding parameters like temperature or top-p.
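A sketch of the threshold prompt, reusing the client and call pattern from the VS-Standard sketch above; the constraint wording paraphrases the paper's prompt, and the threshold values mirror the findings:

```python
# Diversity tuning via a verbalized probability threshold. Reuses the client
# and call pattern from the VS-Standard sketch; wording is illustrative.
VS_THRESHOLD = (
    "Generate 5 responses to the input below, sampling from the tails of the "
    "distribution: each response must have probability below {p}. Return a "
    'JSON array of objects with "text" and "probability" fields.\n\n'
    "Input: {task}"
)

# Lower p -> higher diversity; very low thresholds (e.g., p < 0.01) may yield
# empty outputs on constrained answer spaces such as open-ended QA.
for p in (1.0, 0.1, 0.01, 0.001):
    prompt = VS_THRESHOLD.format(p=p, task="Name a US state.")
    # ...issue the call as in verbalized_sample() and compare output diversity.
```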
Empirical Findings & Ablations
- Scaling: More capable models (e.g., GPT-4.1, Gemini 2.5 Pro, Claude 4) benefit significantly more from VS than smaller models.
- Orthogonality: VS performance gains are orthogonal to temperature scaling and decoding strategies like top-p and min-p sampling, allowing them to be combined for further improvement.
- Benchmarks: Verified across Creative Writing (PoemHunter/BookMIA), Dialogue Simulation (PersuasionForGood), and Open-Ended QA (CoverageQA).
- Synthetic Data: VS improves downstream math task performance when used for synthetic data generation, yielding a mix of correct and diverse incorrect reasoning paths (a dataset-construction sketch follows this list).
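A sketch of how VS samples could be assembled into synthetic training data; the pipeline below is an assumed construction for illustration, not the paper's exact recipe:

```python
# Assumed synthetic-data pipeline: collect all k verbalized reasoning paths
# per problem. sample_fn is any VS caller, e.g. the verbalized_sample sketch.
from typing import Callable

def make_synthetic_examples(
    problems: list[str], sample_fn: Callable[[str], list[dict]]
) -> list[dict]:
    dataset = []
    for problem in problems:
        for cand in sample_fn(f"Solve step by step: {problem}"):
            # Keep every path: per the findings, correct and diverse incorrect
            # reasoning traces both contribute training signal.
            dataset.append({
                "prompt": problem,
                "completion": cand["text"],
                "verbalized_p": cand["probability"],
            })
    return dataset
```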
Mode-Anchored Departure (Approach B)
*Note: This specific terminology appears to be an implementation-level abstraction (e.g., in 02_System/verbalized-sampling.ps1) derived from the paper's "Distribution-level prompt" and its treatment of reference distributions.*
- Definition: A two-call pipeline: the first call establishes a Modal Anchor (the most probable default response); the second enumerates departures from that anchor.
- Approach B vs. A: In this context, Approach B uses the modal anchor as an explicit reference point for calculating "departure distance," ensuring that tail responses are substantively different from the collapsed default.
- Canonical Setting: TailStart=7 (focusing on the most distant ranks 7-9) is the preferred configuration for maximizing the extraction of latent, non-obvious knowledge; a two-call sketch follows.
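A minimal two-call sketch of Approach B, mirroring this note's description of 02_System/verbalized-sampling.ps1; the helper names, prompt wording, and JSON schema are illustrative assumptions, not the actual script:

```python
# Mode-Anchored Departure (Approach B) sketch. Assumes the openai>=1.0 client;
# prompt wording, schema, and helper names are illustrative, not the actual
# 02_System/verbalized-sampling.ps1 implementation.
import json
from openai import OpenAI

client = OpenAI()
TAIL_START = 7  # canonical setting: keep the most distant ranks 7-9

def complete(prompt: str, model: str = "gpt-4.1") -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def mode_anchored_departure(task: str) -> list[dict]:
    # Call 1: establish the Modal Anchor (the collapsed default response).
    anchor = complete(f"Give the single most typical answer to: {task}")

    # Call 2: enumerate 9 departures ranked by distance from that anchor.
    departures = json.loads(complete(
        f"The most typical answer to the task below is:\n{anchor}\n\n"
        "List 9 alternative answers ranked by how far each departs from that "
        "anchor (rank 1 = closest, rank 9 = most distant). Return a JSON array "
        'of objects with "rank" and "text" fields.\n\n'
        f"Task: {task}"
    ))

    # Keep only the tail: ranks >= TailStart are the non-obvious responses.
    return [d for d in departures if d["rank"] >= TAIL_START]
```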
Empirical Gains
- Creative Writing: Increases diversity by 1.6–2.1× over direct prompting.
- Open-Ended QA: Improves pre-training distribution alignment (lower KL divergence) and increases answer coverage.
- Safety & Factuality: VS maintains safety and factual accuracy comparable to baseline methods while unlocking diversity.
Connections
- agentic-rag: VS can be used to generate diverse search queries or synthetic data for RAG.
- knowledge-gardening-principles: Unlocking tail knowledge is essential for deep synthesis and avoiding "echo chamber" effects in automated wikis.
*Created during ingestion of 2510.01171v3.pdf.*