Min-p Sampling Approaches Comparison
This repository contains test implementations comparing two approaches for probability sampling with minimum thresholds, as discussed in FlashInfer Issue #710.
Background
The testing was motivated by a discussion about different sampling approaches in the context of FlashInfer's implementation. The key considerations are:
-
Rejection Sampling (original approach):
- Samples first, then checks against threshold
- May require multiple attempts
- Similar to truncated normal distribution sampling
- More efficient when threshold region contains most probability mass
-
Direct Filtering (Aikitora's approach):
- Filters probabilities first based on threshold
- Samples once from remaining probabilities
- More efficient when threshold region is small
- Avoids multiple sampling attempts
Implementation
The repository contains two versions of the comparison:
-
Version 1:
- Basic implementation of both approaches
- Simple distribution comparison
-
Version 2:
- Enhanced with numpy optimizations
- Added timing measurements
- Includes KL divergence calculation
- More detailed output formatting
Key Findings
Based on the testing:
- Direct filtering approach shows better performance for cases where the threshold region represents a small portion of the probability mass
- Rejection sampling may be more suitable when the threshold region contains most of the probability mass
- The choice of approach depends on the specific use case and probability distribution characteristics
Related Discussion
See the original GitHub issue for more context:
FlashInfer Issue #710
Usage
To run the tests: