SnapIndex

SnapIndex(dim: int, bits: int = 4, seed: int = 0, use_prod: bool = False, chunk_size: int | None = None, normalized: bool = False)

Bases: FreezableIndex

Compressed in-memory ANN index using randomized Hadamard + Lloyd-Max.

Parameters:

dim : int, required
    Embedding dimension (e.g. 384 for BGE-small, 1536 for OpenAI ada-002).
bits : int, default 4
    Bits per coordinate: 2, 3, or 4. In use_prod mode, bits - 1 bits go to the MSE stage and 1 bit to QJL; bits >= 3 is required when use_prod=True.
seed : int, default 0
    Rotation seed; must be identical at index build time and query time.
use_prod : bool, default False
    Enable the TurboQuant_prod unbiased inner-product estimator. Roughly 2x slower, but with zero systematic bias.
chunk_size : int | None, default None
    If set, search processes chunk_size rows at a time without materialising the full float16 cache, trading compute for memory; useful for N > 500k. None (the default) uses the cached matmul.
normalized : bool, default False
    If True, input vectors are assumed to already be unit-length, which skips norm computation in add/add_batch.
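
The randomized Hadamard rotation mentioned above can be sketched as follows. This is a minimal illustration only, assuming the dimension is a power of two (real implementations typically pad or block-diagonalise); fwht and randomized_hadamard are hypothetical helpers, not part of the snapvec API.

```python
import numpy as np

def fwht(x: np.ndarray) -> np.ndarray:
    """Fast Walsh-Hadamard transform; len(x) must be a power of two."""
    x = x.copy()
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)  # orthonormal scaling

def randomized_hadamard(x: np.ndarray, seed: int = 0) -> np.ndarray:
    """Apply HD: seeded random sign flips D, then the Hadamard transform H.

    Mirrors the seed contract above: the same seed must be used at
    build and query time, or rotated vectors will not be comparable.
    """
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=len(x))
    return fwht(x * signs)
```

Because HD is orthonormal, norms and inner products are preserved exactly; the rotation only spreads energy evenly across coordinates so that per-coordinate scalar quantization (Lloyd-Max) loses less information.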

Examples:

>>> import numpy as np
>>> from snapvec import SnapIndex
>>> idx = SnapIndex(dim=384, bits=4)
>>> vecs = np.random.randn(1000, 384).astype(np.float32)
>>> idx.add_batch(list(range(1000)), vecs)
>>> results = idx.search(vecs[0], k=5)
>>> results[0][0]  # top match is the vector itself
0

freeze

freeze() -> None

Freeze + pre-warm the lazy centroid cache.

_search_cached materialises self._cache on its first call (the non-chunked default path). If the caller froze the index without first running a warm-up query, two concurrent searches could race on that assignment, breaking the thread-safety contract documented by FreezableIndex. Pre-warming here ensures every post-freeze search() only reads the cache.

Chunked mode (chunk_size is not None) never touches self._cache and the filtered-subset path builds per-query work into local arrays, so those paths are already safe without pre-warming.
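
The pre-warm-on-freeze pattern can be illustrated with a toy index. This is a hedged sketch: only the names _cache and freeze follow the docstring above; ToyIndex and everything else is invented for illustration.

```python
import numpy as np

class ToyIndex:
    """Illustrates pre-warming a lazily built cache at freeze() time."""

    def __init__(self, vectors: np.ndarray):
        self._vectors = vectors
        self._cache = None       # lazily materialised float16 matrix
        self._frozen = False

    def _materialise(self) -> None:
        if self._cache is None:  # lazy init: not thread-safe on its own
            self._cache = self._vectors.astype(np.float16)

    def freeze(self) -> None:
        # Build the cache *before* declaring the index frozen, so that
        # concurrent post-freeze searches only ever read self._cache
        # and never race on the lazy assignment above.
        self._materialise()
        self._frozen = True

    def search(self, query: np.ndarray, k: int = 5):
        if not self._frozen:
            self._materialise()  # single-threaded pre-freeze path
        scores = self._cache.astype(np.float32) @ query
        top = np.argsort(scores)[::-1][:k]
        return [(int(i), float(scores[i])) for i in top]
```

The design choice is the usual one for lazy initialisation plus immutability: doing the one racy write eagerly, inside freeze(), turns every later access into a pure read and avoids needing a lock on the hot search path.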

add

add(id: Any, vector: NDArray[float32]) -> None

Add a single vector. Prefer add_batch for bulk inserts.

add_batch

add_batch(ids: list[Any], vectors: NDArray[float32]) -> None

Add vectors in bulk; roughly 50x faster than repeated add.

Parameters:

ids : list[Any], required
    Identifier for each vector (int, str, …). Must be unique.
vectors : NDArray[float32] of shape (n, dim), required
    Raw embedding vectors (need not be normalized).

delete

delete(id: Any) -> bool

Remove a vector by id. O(1) lookup, O(1) delete via swap-with-last.

Returns True if the id was found and removed, False otherwise.
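
The swap-with-last strategy can be sketched on a dense row store. This is a minimal illustration of the technique, not snapvec's internal layout; SwapDeleteStore and its attributes are invented for the example.

```python
import numpy as np

class SwapDeleteStore:
    """Dense row store with O(1) delete via swap-with-last."""

    def __init__(self, dim: int, capacity: int = 1024):
        self._rows = np.zeros((capacity, dim), dtype=np.float32)
        self._ids: list = []   # row index -> id
        self._row_of: dict = {}  # id -> row index

    def add(self, id, vec) -> None:
        row = len(self._ids)
        self._rows[row] = vec
        self._ids.append(id)
        self._row_of[id] = row

    def delete(self, id) -> bool:
        row = self._row_of.pop(id, None)
        if row is None:
            return False  # unknown id
        last = len(self._ids) - 1
        if row != last:
            # Move the last row into the hole so storage stays dense;
            # only one id's row mapping needs updating.
            self._rows[row] = self._rows[last]
            moved_id = self._ids[last]
            self._ids[row] = moved_id
            self._row_of[moved_id] = row
        self._ids.pop()
        return True
```

Keeping rows dense this way means search can always scan a contiguous prefix of the matrix, at the cost of not preserving insertion order.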

search

search(query: NDArray[float32], k: int = 10, filter_ids: set[Any] | None = None) -> list[tuple[Any, float]]

Find k nearest neighbors by approximate cosine similarity.

Parameters:

query : NDArray[float32] of shape (dim,), required
    Query vector (raw, need not be normalized).
k : int, default 10
    Number of results to return.
filter_ids : set[Any] | None, default None
    If provided, restrict the search to this subset of ids. Uses O(1) dict lookups, so the cost is O(|filter_ids| · d) instead of O(N · d). Useful for collection/partition filtering.

Returns:

list of (id, score) tuples, sorted by descending similarity. The score is an approximation of cosine similarity in [-1, 1].

Notes:

When chunk_size is set, search processes rows in chunks without materialising the full float16 cache. This trades peak RAM for additional compute, which is useful when N exceeds roughly 500k vectors.
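
The chunked path can be sketched as a streaming top-k scan. This is an illustrative sketch only: chunked_topk is a hypothetical helper, and a plain float matmul stands in for snapvec's quantized scoring.

```python
import numpy as np

def chunked_topk(rows: np.ndarray, query: np.ndarray, k: int, chunk_size: int):
    """Scan rows chunk_size at a time, keeping a running top-k.

    Peak extra memory is O(chunk_size) scores instead of O(N), at the
    cost of a Python-level loop over chunks.
    """
    best_ids = np.empty(0, dtype=np.int64)
    best_scores = np.empty(0, dtype=np.float32)
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        scores = (chunk @ query).astype(np.float32)  # similarities, this chunk only
        ids = np.arange(start, start + len(chunk), dtype=np.int64)
        best_ids = np.concatenate([best_ids, ids])
        best_scores = np.concatenate([best_scores, scores])
        if len(best_scores) > k:  # prune the running set back down to k
            keep = np.argsort(best_scores)[::-1][:k]
            best_ids, best_scores = best_ids[keep], best_scores[keep]
    order = np.argsort(best_scores)[::-1]
    return list(zip(best_ids[order].tolist(), best_scores[order].tolist()))
```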

save

save(path: str | Path) -> None

Persist index to a binary .snpv file (atomic write).

Writes to <path>.tmp first, then renames atomically. Codes are bit-packed on disk (bits/8 bytes per coordinate). An 8-byte CRC32 trailer is appended so that silent disk or transport corruption is caught at load() time.
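
The write-then-rename and CRC32-trailer scheme described above can be sketched with the standard library. This is a hedged sketch: atomic_save and the exact trailer layout (CRC32 zero-padded to 8 little-endian bytes) are assumptions for illustration, not the real .snpv format.

```python
import os
import struct
import zlib
from pathlib import Path

def atomic_save(path, payload: bytes) -> None:
    """Write payload plus an 8-byte CRC32 trailer, then rename atomically.

    A crashed or interrupted write leaves only a stale .tmp file behind;
    the destination path is either the old file or the complete new one.
    """
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    with open(tmp, "wb") as f:
        f.write(payload)
        f.write(struct.pack("<Q", crc))  # 8-byte little-endian trailer
        f.flush()
        os.fsync(f.fileno())             # bytes hit disk before the rename
    os.replace(tmp, path)                # atomic on POSIX and Windows
```

os.replace is the key call: unlike a plain write, a reader can never observe a half-written file at the destination path.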

load classmethod

load(path: str | Path) -> 'SnapIndex'

Load index from a .snpv file.

Supports v1 (mse-only legacy) and v2 (prod/flags) formats. Verifies the CRC32 trailer when present (files saved with snapvec ≥ 0.7); legacy files without a trailer load without integrity checking for backward compat.
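
The load-time verification of such a trailer might look like the following. Again a hedged sketch continuing the same illustrative layout (an 8-byte little-endian CRC32 trailer), not the real .snpv reader, which per the note above must also distinguish v1/v2 formats and trailer-less legacy files via its header.

```python
import struct
import zlib
from pathlib import Path

def load_verified(path) -> bytes:
    """Read a file and verify its 8-byte CRC32 trailer, returning the payload."""
    data = Path(path).read_bytes()
    payload, trailer = data[:-8], data[-8:]
    stored = struct.unpack("<Q", trailer)[0]
    if (zlib.crc32(payload) & 0xFFFFFFFF) != stored:
        raise ValueError("CRC mismatch: file is corrupt")
    return payload
```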

stats

stats() -> dict[str, Any]

Return memory and compression statistics.