VERBALIZED SAMPLING: HOW TO MITIGATE MODE COLLAPSE AND UNLOCK LLM DIVERSITY

author: gemini-cli
aliases: verbalized-sampling-paper
title: VERBALIZED SAMPLING: HOW TO MITIGATE MODE COLLAPSE AND UNLOCK LLM DIVERSITY
status: active
date: 2026-04-28
type: literature

Source: 00_Raw/2510.01171v3.pdf

Overview

The paper identifies typicality bias in human preference data as a primary cause of mode collapse in aligned LLMs. To mitigate this, it introduces Verbalized Sampling (VS), a prompting strategy where the model explicitly verbalizes a probability distribution over a set of possible responses before sampling.

Key Concepts

Typicality Bias & Mode Collapse

  • Typicality Bias: Human annotators tend to favor responses that are familiar, fluent, and predictable (typical), even if they are not the most useful or creative.
  • Mathematical Model: The reward function $r(x, y)$ is modeled as a combination of true task utility $r_{true}$ and typicality bias:

    $$r(x, y) = r_{true}(x, y) + \alpha \log \pi_{ref}(y \mid x) + \epsilon(x)$$ where $\alpha > 0$ is the typicality weight and $\epsilon(x)$ is a response-independent noise term.

  • Mode Collapse: Under this reward, the KL-regularized RLHF optimum is $\pi^*(y \mid x) \propto \pi_{ref}(y \mid x)^{\gamma} \exp(r_{true}(x, y)/\beta)$; that is, it sharpens the reference distribution $\pi_{ref}$ by an exponent $\gamma = 1 + \alpha/\beta$, where $\beta$ is the KL coefficient. This compresses probability mass toward typical completions.
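As a quick numerical illustration of this sharpening effect (a toy sketch, not a result from the paper): when all candidates have equal true utility, raising a reference distribution to the exponent $\gamma$ and renormalizing concentrates mass on the mode.

```python
import numpy as np

# Toy reference distribution over 5 candidate completions (made-up numbers)
pi_ref = np.array([0.40, 0.25, 0.15, 0.12, 0.08])

alpha, beta = 1.0, 0.5    # typicality weight, KL coefficient (illustrative)
gamma = 1 + alpha / beta  # sharpening exponent from the paper

# With equal true utility, the RLHF optimum reduces to pi_ref**gamma, renormalized
pi_rlhf = pi_ref**gamma / np.sum(pi_ref**gamma)

print(gamma)                 # 3.0
print(round(pi_rlhf[0], 3))  # the mode's mass grows from 0.40 to about 0.75
```

Even a modest typicality weight ($\alpha = 1$) with a typical KL coefficient ($\beta = 0.5$) nearly doubles the mode's probability mass, which is the collapse mechanism the paper formalizes.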

Verbalized Sampling (VS) Primitives

Instead of a direct instance-level prompt, VS reformulates the request at the distribution level, asking the model for $k$ responses together with their corresponding probabilities:

  • VS-Standard: Single-turn request for $k$ responses and probabilities.
  • VS-CoT: Combines VS with Chain-of-Thought (e.g., "Think step-by-step, then tell 5 jokes with probabilities"). This often achieves the highest quality/diversity Pareto front.
  • VS-Multi: Elicits responses across multiple turns (e.g., "Tell 5 jokes... then tell 5 more"). This reduces cognitive burden and improves diversity in larger models.
  • Mechanism: Verbalizing probabilities helps the model bypass the collapsed mode of the policy and access the broader distribution learned during pre-training.
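The VS-Standard primitive can be sketched as a prompt-construction and parsing helper. This is an illustrative paraphrase, not the paper's exact prompt wording, and the JSON reply below is hand-written rather than produced by a model.

```python
import json
import random

def vs_standard_prompt(task: str, k: int = 5) -> str:
    """Reformulate a direct prompt into a VS-Standard prompt: ask for k
    responses, each tagged with the model's verbalized probability.
    (Illustrative wording, not the paper's template.)"""
    return (
        f"Generate {k} responses to the task below. For each, include the "
        f"probability that you would produce it, as JSON: "
        f'[{{"response": ..., "probability": ...}}]\n\nTask: {task}'
    )

def sample_from_verbalized(reply_json: str, rng=random) -> str:
    """Parse the verbalized distribution and sample one response from it."""
    items = json.loads(reply_json)
    total = sum(it["probability"] for it in items)
    return rng.choices(
        [it["response"] for it in items],
        weights=[it["probability"] / total for it in items],
        k=1,
    )[0]

# Hand-written example reply (no API call is made here)
reply = ('[{"response": "joke A", "probability": 0.6}, '
         '{"response": "joke B", "probability": 0.4}]')
print(sample_from_verbalized(reply))
```

VS-CoT and VS-Multi follow the same pattern, prepending a reasoning instruction or splitting the $k$ requests across turns.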

Specific Treatments

P% Verbalization (Diversity Tuning)

The paper details a method for Diversity Tuning using probability thresholds:

  • P% Constraint: Prompting the model to "sample from the tail distribution, where each response/word should be < p%".
  • Findings: Lowering $p$ (e.g., from 1.0 down to 0.001) significantly increases output diversity. Lower thresholds (e.g., $p < 0.01$) can lead to empty outputs in constrained answer spaces (like Open-Ended QA) but are highly effective for creative writing.
  • Significance: This provides a practical, inference-time mechanism for fine-grained diversity control without altering decoding parameters like temperature or top-p.
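A minimal sketch of the P% constraint, assuming responses arrive with self-reported probabilities as in VS. The prompt wording and the `enforce_threshold` post-filter are illustrative assumptions, not part of the paper.

```python
def tail_prompt(task: str, k: int, p: float) -> str:
    """P% constraint: ask the model to sample from the tail of its
    distribution, keeping each response below probability p.
    (Illustrative paraphrase of the paper's instruction.)"""
    return (
        f"Generate {k} responses to: {task}. Sample only from the tail of "
        f"your distribution: each response's probability must be below {p}."
    )

def enforce_threshold(items: list[dict], p: float) -> list[dict]:
    """Defensive post-filter: drop responses whose self-reported probability
    violates the threshold. May return an empty list, mirroring the paper's
    finding for constrained answer spaces at very low p."""
    return [it for it in items if it["probability"] < p]

items = [{"response": "common answer", "probability": 0.30},
         {"response": "rare answer", "probability": 0.005}]
print(enforce_threshold(items, p=0.01))  # only the rare answer survives
```

Sweeping `p` from 1.0 down to 0.001 is then a pure prompt-side diversity knob, leaving temperature and top-p untouched.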

Empirical Findings & Ablations

  • Scaling: More capable models (e.g., GPT-4.1, Gemini 2.5 Pro, Claude 4) benefit significantly more from VS than smaller models.
  • Orthogonality: VS performance gains are orthogonal to temperature scaling and decoding strategies like top-p and min-p sampling, allowing them to be combined for further improvement.
  • Benchmarks: Verified across Creative Writing (PoemHunter/BookMIA), Dialogue Simulation (PersuasionForGood), and Open-Ended QA (CoverageQA).
  • Synthetic Data: VS improves downstream math task performance when used for synthetic data generation (mix of correct and diverse incorrect reasoning paths).

Mode-Anchored Departure (Approach B)

*Note: This specific terminology appears to be an implementation-level abstraction (e.g., in 02_System/verbalized-sampling.ps1) derived from the paper's "Distribution-level prompt" and its treatment of reference distributions.*

  • Definition: A two-call pipeline where the first call establishes a Modal Anchor (the most probable default response) and then enumerates departures from that anchor.
  • Approach B vs. A: In this context, Approach B uses the modal anchor as an explicit reference point for calculating "departure distance," ensuring that tail responses are substantively different from the collapsed default.
  • Canonical Setting: TailStart=7 (focusing on the most distant ranks 7-9) is the preferred configuration for maximizing the extraction of latent, non-obvious knowledge.
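Since this terminology is an implementation-level abstraction, the following is only a hypothetical sketch of the two-call pipeline. The `llm` callable, the prompt wording, and the stub model are all assumptions for illustration.

```python
def mode_anchored_departure(llm, task: str, k: int = 9, tail_start: int = 7):
    """Sketch of 'Approach B' (hypothetical API: llm(prompt) -> str).
    Call 1 elicits the Modal Anchor (the most probable default response);
    call 2 enumerates k departures ranked by distance from that anchor,
    keeping only ranks >= tail_start (canonical TailStart=7)."""
    anchor = llm(f"Give the single most typical answer to: {task}")
    listing = llm(
        f"The default answer to '{task}' is: {anchor}. "
        f"List {k} alternative answers ranked 1..{k} by increasing "
        f"departure distance from that default, one per line."
    )
    ranked = [line.strip() for line in listing.splitlines() if line.strip()]
    return anchor, ranked[tail_start - 1:]  # keep the most distant ranks

# Stub model, for demonstration only
def stub_llm(prompt: str) -> str:
    if "most typical" in prompt:
        return "the obvious answer"
    return "\n".join(f"alt {i}" for i in range(1, 10))

anchor, tail = mode_anchored_departure(stub_llm, "name a bird")
print(anchor, tail)  # anchor plus ranks 7-9 only
```

The design choice here is that the anchor makes the collapsed mode explicit, so "departure distance" has a concrete reference point instead of an implicit one.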

Empirical Gains

  • Creative Writing: Increases diversity by 1.6–2.1× over direct prompting.
  • Open-Ended QA: Improves alignment with the pre-training answer distribution (lower KL divergence) and increases answer coverage.
  • Safety & Factuality: VS maintains safety and factual accuracy comparable to baseline methods while unlocking diversity.

Connections

  • agentic-rag: VS can be used to generate diverse search queries or synthetic data for RAG.
  • knowledge-gardening-principles: Unlocking tail knowledge is essential for deep synthesis and avoiding "echo chamber" effects in automated wikis.

*Created during ingestion of 2510.01171v3.pdf.*

Related