SnapIndex

Training-free scalar-quantized index. Implements TurboQuant: a randomized Hadamard transform followed by Lloyd-Max scalar quantization. It works out of the box on any vector distribution; no calibration or corpus sample is required.
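
The two stages named above can be illustrated with a minimal NumPy sketch. It is not snapvec's implementation: the helper names (fwht, lloyd_max_levels, encode, decode) are hypothetical, the dimension is assumed to be a power of two, and the quantization levels are fitted to synthetic standard-normal samples. The idea it illustrates is that a random sign flip followed by a Hadamard transform makes the coordinates of a unit vector look roughly Gaussian, so one fixed set of levels can be reused without looking at the corpus.

```python
import numpy as np

def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform over the last axis.

    The length of the last axis must be a power of two.
    """
    n = x.shape[-1]
    out = np.asarray(x, dtype=np.float64).reshape(-1, n)
    h = 1
    while h < n:
        # Butterfly step: combine neighbouring blocks of width h.
        blocks = out.reshape(-1, n // (2 * h), 2, h)
        top, bottom = blocks[:, :, 0, :], blocks[:, :, 1, :]
        out = np.stack([top + bottom, top - bottom], axis=2).reshape(-1, n)
        h *= 2
    return out.reshape(np.shape(x)) / np.sqrt(n)

def lloyd_max_levels(samples, bits, iters=30):
    """Fit 2**bits reconstruction levels to 1-D samples with Lloyd's algorithm."""
    n_levels = 2 ** bits
    levels = np.quantile(samples, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(iters):
        edges = (levels[:-1] + levels[1:]) / 2      # nearest-level boundaries
        cells = np.searchsorted(edges, samples)     # cell index of every sample
        levels = np.array([samples[cells == k].mean() for k in range(n_levels)])
    return levels

rng = np.random.default_rng(0)
dim, bits = 256, 4                                  # assumed: dim is a power of two
signs = rng.choice([-1.0, 1.0], size=dim)           # random diagonal of +/-1
levels = lloyd_max_levels(rng.standard_normal(100_000), bits)  # no corpus involved
edges = (levels[:-1] + levels[1:]) / 2

def encode(v):
    """Normalize, rotate, and quantize one vector to per-coordinate codes."""
    unit = v / np.linalg.norm(v)
    rotated = fwht(unit * signs) * np.sqrt(dim)     # coordinates are roughly N(0, 1)
    return np.searchsorted(edges, rotated).astype(np.uint8)

def decode(codes):
    """Map codes back to an approximate unit vector."""
    rotated = levels[codes] / np.sqrt(dim)
    return fwht(rotated) * signs                    # undo the rotation

v = rng.standard_normal(dim)
recon = decode(encode(v))
cosine = recon @ v / (np.linalg.norm(recon) * np.linalg.norm(v))
print(f"cosine between original and reconstruction: {cosine:.4f}")
```

Because the levels are fitted once to a fixed distribution, nothing in this sketch depends on the data being indexed.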

When to use

  • You don't want to run a one-off fit step.
  • The raw corpus (dim * N values, 4 bytes each at float32) fits comfortably in RAM.
  • Recall target is 0.92-0.95 at roughly 6-8x compression (3 or 4 bits; see Bits guidance).

Basic usage

import numpy as np
from snapvec import SnapIndex

corpus = np.random.randn(10_000, 384).astype(np.float32)
idx = SnapIndex(dim=384, bits=4, seed=0)
idx.add_batch(list(range(10_000)), corpus)

query = np.random.randn(384).astype(np.float32)
hits = idx.search(query, k=10)
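
To judge the approximate results on your own data, you can compare them against an exact brute-force search. The sketch below uses only NumPy; exact_top_k and recall_at_k are hypothetical helpers, not part of snapvec. It assumes cosine scoring and that ids equal row positions, as in the add_batch call above. The layout of hits is not described on this page, so extracting ids from it is left to the caller.

```python
import numpy as np

def exact_top_k(corpus, queries, k):
    """Brute-force cosine top-k; returns row indices, shape (n_queries, k)."""
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    scores = q @ c.T
    return np.argsort(-scores, axis=1)[:, :k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of exact neighbours that also appear in the approximate lists."""
    found = sum(
        len(set(map(int, a)) & set(map(int, e)))
        for a, e in zip(approx_ids, exact_ids)
    )
    return found / (len(exact_ids) * len(exact_ids[0]))

rng = np.random.default_rng(0)
corpus = rng.standard_normal((2_000, 64)).astype(np.float32)
queries = rng.standard_normal((5, 64)).astype(np.float32)
truth = exact_top_k(corpus, queries, k=10)

# Replace `truth` on the left with the ids returned by the index under test.
print(recall_at_k(truth, truth))
```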

normalized is an optimization, not a default

If your embeddings are already unit-length (for example, cosine-space outputs from most modern sentence encoders), pass normalized=True to skip the internal L2 normalization step. With raw vectors (like the example above), leave it at the default False. Passing normalized=True on non-unit inputs silently skips normalization and scores will not match cosine similarity.
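
Since the flag is not validated, a cheap check before setting it can save a debugging session. The helpers below are a hypothetical sketch in plain NumPy (they are not snapvec functions), and the tolerance is an arbitrary choice.

```python
import numpy as np

def l2_normalize(vectors, eps=1e-12):
    """Scale each row to unit L2 norm; eps guards against all-zero rows."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.maximum(norms, eps)

def is_unit_norm(vectors, atol=1e-3):
    """True if every row already has L2 norm close to 1."""
    return bool(np.allclose(np.linalg.norm(vectors, axis=-1), 1.0, atol=atol))

raw = np.random.default_rng(0).standard_normal((1_000, 384)).astype(np.float32)
unit = l2_normalize(raw)

print(is_unit_norm(raw))    # raw Gaussian vectors are far from unit length
print(is_unit_norm(unit))   # only inputs like this are safe with normalized=True
```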

Bits guidance

Pick 4-bit unless you have a specific reason:

bits  Compression  Recall@10 on real embeddings  Notes
2     11.6x        ~0.83                         Only for aggressive compression
3     7.8x         ~0.92                         Middle ground; tightly packed since v0.3
4     5.9x         ~0.95                         Default, recommended
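
The compression column is lower than the naive 32 / bits. One set of assumptions that reproduces the figures for 384-dimensional float32 vectors is padding to the next power of two (for the Hadamard transform) plus 4 bytes of per-vector overhead, such as a stored norm. Both assumptions are inferred from the numbers, not taken from the format specification, and estimated_compression is a hypothetical helper.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    return 1 << (n - 1).bit_length()

def estimated_compression(dim, bits, overhead_bytes=4):
    """Ratio of raw float32 size to packed size under the assumptions above."""
    raw_bytes = dim * 4                              # float32 input
    padded_dim = next_power_of_two(dim)              # assumed padding
    packed_bytes = padded_dim * bits / 8 + overhead_bytes
    return raw_bytes / packed_bytes

for bits in (2, 3, 4):
    print(bits, round(estimated_compression(384, bits), 1))
# prints 11.6, 7.8 and 5.9 for 2, 3 and 4 bits
```

For other dimensions the ratio will differ, especially when dim is just above a power of two.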

Unbiased-estimator mode

For use cases that need unbiased inner-product estimates (KV-cache, attention), pass use_prod=True. This applies the QJL correction at the cost of roughly 2x search latency:

idx = SnapIndex(dim=dim, bits=3, use_prod=True)
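
To see why a correction is needed at all, note that a quantizer tuned for reconstruction error shrinks vectors, so inner products computed from the codes are systematically too small. The sketch below illustrates the general QJL (quantized Johnson-Lindenstrauss) idea on a deliberately coarse 1-bit quantizer: store the signs of a random Gaussian projection of the residual, plus the residual norm, and use them to estimate the missing part of the inner product. It is an assumption-laden illustration in plain NumPy, not snapvec's code path, and every name in it is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512

x = rng.standard_normal(d)
x /= np.linalg.norm(x)            # stored vector, unit length
y = x.copy()                      # query; the true inner product is 1.0

# Stage 1: coarse 1-bit quantizer (sign times the 1-bit level for N(0, 1/d)).
x_hat = np.sign(x) * np.sqrt(2.0 / np.pi) / np.sqrt(d)

# Stage 2: keep only the norm of the residual and the signs of its projection.
residual = x - x_hat
S = rng.standard_normal((d, d))   # random Gaussian projection
residual_signs = np.sign(S @ residual)
residual_norm = np.linalg.norm(residual)

def plain_estimate(query):
    """Inner product against the quantized vector only (biased low)."""
    return float(query @ x_hat)

def corrected_estimate(query):
    """Add an unbiased estimate of <query, residual> from the sign sketch."""
    scale = np.sqrt(np.pi / 2.0) * residual_norm / S.shape[0]
    return plain_estimate(query) + scale * float((S @ query) @ residual_signs)

print(f"plain:     {plain_estimate(y):.3f}")
print(f"corrected: {corrected_estimate(y):.3f}")
```

The extra projection of the query in this sketch hints at where additional search cost can come from, though the actual source of the 2x latency in SnapIndex is not specified here.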

File format

Indexes are stored on disk with the .snpv extension. Files are CRC32-checksummed and written atomically (temp file + rename).
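
The general pattern behind a checksummed, atomic write can be sketched with the Python standard library. This is not the .snpv layout: the trailing 4-byte little-endian CRC and the function names are assumptions made for illustration.

```python
import os
import struct
import tempfile
import zlib

def atomic_write_with_crc(path, payload):
    """Write payload plus a CRC32 trailer to a temp file, then rename into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.write(struct.pack("<I", zlib.crc32(payload)))
            f.flush()
            os.fsync(f.fileno())          # make sure bytes reach the disk first
        os.replace(tmp_path, path)        # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

def read_with_crc(path):
    """Return the payload, raising ValueError if the checksum does not match."""
    with open(path, "rb") as f:
        data = f.read()
    payload, (stored,) = data[:-4], struct.unpack("<I", data[-4:])
    if zlib.crc32(payload) != stored:
        raise ValueError(f"checksum mismatch in {path}")
    return payload
```

Readers either see the old file or the complete new one, never a partial write, and corruption after the fact is caught on load.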

API

See the SnapIndex API reference.