Choosing an index

| If you... | Pick | Why |
| --- | --- | --- |
| Just want a dependency-minimal ANN with no training | SnapIndex | Works on any distribution, no fit call. The 4-bit default is a good balance. |
| Need recall above 0.95 without a training pass | ResidualSnapIndex | Two-stage scalar quantization plus optional rerank, still training-free. |
| Have a corpus sample and want aggressive compression | PQSnapIndex | Learned codebooks reach 16-32 B/vec, with recall far above scalar quantization at matched bytes. |
| Need sub-linear search above ~100k vectors | IVFPQSnapIndex | Partitioned search visits only about nprobe / nlist of the corpus per query. With rerank_candidates it breaks the PQ recall ceiling. |
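
The table can be read as a small decision procedure. The sketch below is a hypothetical helper, not part of the library: it returns the class name as a string and never imports or constructs an index. It assumes that IVFPQSnapIndex, like PQSnapIndex, needs a corpus sample to train on, which the table does not state explicitly.

```python
def pick_index(has_training_sample: bool,
               n_vectors: int,
               need_high_recall: bool = False) -> str:
    """Return the index class name suggested by the table, as a string.

    Hypothetical helper for illustration only; the thresholds mirror the
    table and are rules of thumb, not hard limits.
    """
    if has_training_sample:
        # Assumption: both PQ-based indexes need a training sample.
        # Above ~100k vectors, partitioned (IVF) search pays off.
        return "IVFPQSnapIndex" if n_vectors >= 100_000 else "PQSnapIndex"
    # Training-free options: residual quantization when recall > 0.95 matters.
    return "ResidualSnapIndex" if need_high_recall else "SnapIndex"


print(pick_index(has_training_sample=False, n_vectors=5_000))
print(pick_index(has_training_sample=True, n_vectors=1_000_000))
```

Treat the result as a starting point and confirm it against measured recall on your own data.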

Sizing rules of thumb

| Parameter | Required? | Guidance |
| --- | --- | --- |
| bits (SnapIndex) | no, defaults to 4 | 4 gives ~0.95 recall at ~5.9x compression. 3 is a sweet spot at ~7.8x. Use 2 only for huge corpora where scale beats precision. |
| M (PQ) | yes | Higher M = higher recall, more disk. Starting point: dim // 4 (e.g. M=96 for dim=384); many users ship with M=16-32 for aggressive compression. |
| K (PQ) | no, defaults to 256 | Leave at 256 (one byte per sub-index). |
| nlist (IVF) | yes | Target roughly 4 * sqrt(N). E.g. N=57k -> nlist=512, N=1M -> nlist=4096. |
| nprobe (IVF) | no, defaults to nlist // 16 | Trades recall for latency: higher values scan more partitions. Tune per query. |
| rerank_candidates (IVF) | no, defaults to None | Pass a value such as 100 to rerank the PQ candidates with the stored fp16 vectors. Raises recall toward the float32 ceiling. Requires keep_full_precision=True at construction. |
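
The IVF-PQ rules of thumb reduce to a few lines of arithmetic. The sketch below is a hypothetical helper that only computes numbers; it does not call the library. Rounding nlist to the nearest power of two is an assumption made here so that N=1M lands on 4096. For N=57k the same formula gives about 955 (1024 after rounding), which is larger than the 512 quoted in the table, so read the rule as a rough target rather than an exact prescription.

```python
import math


def suggest_ivfpq_params(n_vectors: int, dim: int) -> dict:
    """Apply the sizing rules of thumb; purely illustrative arithmetic."""
    raw_nlist = 4 * math.sqrt(n_vectors)        # rule of thumb: 4 * sqrt(N)
    # Assumption: round to the nearest power of two.
    nlist = 2 ** round(math.log2(raw_nlist))
    nprobe = max(1, nlist // 16)                # documented default
    return {
        "M": dim // 4,                          # documented starting point
        "K": 256,                               # one byte per sub-index
        "nlist": nlist,
        "nprobe": nprobe,
        # Approximate share of the corpus scanned per query.
        "scan_fraction": nprobe / nlist,
    }


print(suggest_ivfpq_params(n_vectors=1_000_000, dim=384))
```

With the default nprobe = nlist // 16, each query scans roughly 1/16 of the corpus, which is where the sub-linear behaviour comes from.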

Bits vs recall (SnapIndex)

| bits | Compression vs float32 | Recall@10 on real embeddings |
| --- | --- | --- |
| 2 | 11.6x | ~0.83 (synthetic), higher on clustered real data |
| 3 | 7.8x | ~0.92 |
| 4 | 5.9x | ~0.95 |
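
The ratios are lower than the naive 32 / bits (which would give 8x at 4 bits), so there is some per-vector overhead. The sketch below shows one set of assumptions that happens to reproduce the table: dim=384 padded to the next power of two (512) and 4 extra bytes stored per vector. Neither assumption is confirmed here; the code is only arithmetic to help reason about footprint.

```python
def compression_ratio(bits: int, dim: int = 384, overhead_bytes: int = 4) -> float:
    """Estimate compression vs float32 under the assumptions stated above."""
    # Assumption: codes are stored for dim padded to the next power of two.
    padded_dim = 1 << (dim - 1).bit_length()
    # Assumption: a fixed number of extra bytes is stored per vector.
    code_bytes = padded_dim * bits // 8 + overhead_bytes
    float32_bytes = dim * 4
    return float32_bytes / code_bytes


for bits in (2, 3, 4):
    print(bits, round(compression_ratio(bits), 1))
```

Under these assumptions the ratios depend on dim, so expect different figures for other embedding sizes.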

When does IVF-PQ pay off?

  • N < 10k: a full scan with SnapIndex or PQSnapIndex is fine.
  • N = 50k-100k: a PQSnapIndex full scan and IVFPQSnapIndex are both feasible.
  • N >= 100k: IVFPQSnapIndex is the clear winner.
  • N >= 500k: IVFPQSnapIndex is effectively required, because float32 brute-force search starts hitting RAM and latency walls.
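
To get a feel for the RAM side of these thresholds, the following sketch compares raw float32 storage with PQ code storage. It is illustrative arithmetic only: dim=384 is borrowed from the example elsewhere on this page, the function names are hypothetical, and the PQ figure counts codes only (no codebooks, IVF lists, or fp16 copies kept for reranking).

```python
def float32_bytes(n_vectors: int, dim: int) -> int:
    """Raw storage for brute-force search: 4 bytes per dimension."""
    return n_vectors * dim * 4


def pq_code_bytes(n_vectors: int, M: int) -> int:
    """PQ codes with K=256: one byte per sub-index, M bytes per vector."""
    return n_vectors * M


n, dim = 500_000, 384
print(f"float32: {float32_bytes(n, dim) / 1e6:.0f} MB")
for M in (16, 32):
    print(f"PQ M={M}: {pq_code_bytes(n, M) / 1e6:.0f} MB")
```

Brute-force search also reads every float32 vector on each query, so latency grows with the same byte count.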

See the benchmarks page for measured numbers.