IVFPQSnapIndex ¶
IVFPQSnapIndex(dim: int, nlist: int, M: int, K: int = 256, seed: int = 0, normalized: bool = False, use_rht: bool = False, keep_full_precision: bool = False, use_opq: bool = False)
Bases: FreezableIndex
Inverted-file + residual Product Quantization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dim` | `int` | Embedding dimension. | *required* |
| `nlist` | `int` | Number of coarse clusters. | *required* |
| `M` | `int` | Number of PQ subspaces. Must divide `dim`. | *required* |
| `K` | `int` | Centroids per subspace. | `256` |
| `seed` | `int` | Random seed. | `0` |
| `normalized` | `bool` | When True, inputs are assumed unit-length and no per-vector norm is stored. | `False` |
| `use_rht` | `bool` | Apply a random Hadamard transform before quantization. Off by default. | `False` |
| `keep_full_precision` | `bool` | When True, keep the original float32 vectors alongside the PQ codes (enables exact reranking). | `False` |
| `use_opq` | `bool` | When True, learn an orthogonal OPQ-P rotation (Ge et al., 2013) during ``fit()``. | `False` |
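The divisibility constraint on ``M`` is easiest to see in code. A minimal numpy sketch (independent of this class) of how a PQ encoder views its input:

```python
import numpy as np

# A PQ encoder treats each dim-dimensional vector as M contiguous
# subvectors of length dim // M, one per subspace codebook, which is
# why M must divide dim exactly.
dim, M = 64, 8
assert dim % M == 0, "M must divide dim"

rng = np.random.default_rng(0)
x = rng.standard_normal((10, dim)).astype(np.float32)

# (10, dim) -> (10, M, dim // M): one row of subvectors per input.
subvectors = x.reshape(10, M, dim // M)
print(subvectors.shape)  # (10, 8, 8)
```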
close ¶
Release the lazy thread pool, if one was created.
Safe to call multiple times. Useful when an index is being torn down explicitly (e.g., long-lived workers cycling indices) — Python's GC will also reclaim the executor when the index goes out of scope, but explicit cleanup avoids worker threads lingering past their last useful query.
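The lazy-pool pattern being described can be sketched as follows (a hypothetical illustration, not the class's actual internals): the executor is created on first use, and ``close()`` is idempotent.

```python
from concurrent.futures import ThreadPoolExecutor

class LazyPool:
    """Hypothetical sketch: executor created lazily, close() idempotent."""

    def __init__(self) -> None:
        self._pool = None

    def submit(self, fn, *args):
        if self._pool is None:           # created lazily, on first submit
            self._pool = ThreadPoolExecutor(max_workers=4)
        return self._pool.submit(fn, *args)

    def close(self) -> None:
        if self._pool is not None:       # idempotent: later calls are no-ops
            self._pool.shutdown(wait=True)
            self._pool = None

pool = LazyPool()
assert pool.submit(lambda a: a + 1, 41).result() == 42
pool.close()
pool.close()  # second call is a harmless no-op
```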
fit ¶
Train coarse centroids and residual codebooks.
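For orientation, a compact numpy sketch of what IVF-PQ training typically involves, assuming ``fit()`` does roughly this (the actual implementation may differ): coarse k-means over the training set, then per-subspace k-means on the residuals.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    cent = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        assign = np.linalg.norm(x[:, None] - cent[None], axis=-1).argmin(1)
        for j in range(k):
            if (assign == j).any():
                cent[j] = x[assign == j].mean(0)
    return cent

dim, nlist, M, K = 16, 4, 4, 8
rng = np.random.default_rng(0)
train = rng.standard_normal((256, dim)).astype(np.float32)

# Step 1: coarse centroids over the raw vectors.
coarse = kmeans(train, nlist)                      # (nlist, dim)
assign = np.linalg.norm(train[:, None] - coarse[None], axis=-1).argmin(1)

# Step 2: quantize the residuals, one codebook per subspace.
resid = train - coarse[assign]
dsub = dim // M
codebooks = np.stack([                             # (M, K, dsub)
    kmeans(resid[:, m * dsub:(m + 1) * dsub], K) for m in range(M)
])
print(coarse.shape, codebooks.shape)  # (4, 16) (4, 8, 4)
```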
add_batch ¶
Append a batch. Re-sorts the whole corpus by cluster id to preserve the contiguous layout; this is O(N) per call, so bulk-ingest before search is the intended pattern.

Encoding is chunked so peak transient memory stays bounded regardless of batch size. ``_id_to_row`` is rebuilt once at the end via numpy bulk operations rather than a Python dict comprehension, which avoids an O(N) interpreter pass at the end of large batches.
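The bookkeeping described above can be sketched in numpy (a hypothetical illustration of the pattern, not the class's actual internals): sort rows by cluster id for a contiguous layout, then rebuild the id-to-row map in one vectorised pass.

```python
import numpy as np

ids      = np.array([10, 11, 12, 13, 14])
clusters = np.array([ 2,  0,  1,  0,  2])

# Stable sort by cluster id -> each cluster's rows become contiguous.
order = np.argsort(clusters, kind="stable")
ids, clusters = ids[order], clusters[order]

# Bulk rebuild: id_to_row[i] is the row now holding id i (-1 if absent).
# One vectorised scatter, no per-element Python loop.
id_to_row = np.full(ids.max() + 1, -1)
id_to_row[ids] = np.arange(len(ids))
print(id_to_row[10])  # 3: id 10 moved to row 3 after the re-sort
```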
search ¶
search(query: NDArray[float32], k: int = 10, nprobe: int | None = None, rerank_candidates: int | None = None, filter_ids: set[Any] | None = None) -> list[tuple[Any, float]]
Approximate top-k via IVF probing + residual PQ ADC.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `NDArray[float32]` | Query vector of shape ``(dim,)``. | *required* |
| `k` | `int` | Number of results to return. | `10` |
| `nprobe` | `int \| None` | Number of coarse clusters to visit. | ``max(1, nlist // 16)`` |
| `rerank_candidates` | `int \| None` | When set, the IVF-PQ pass returns a top-``rerank_candidates`` pool that is then re-scored exactly. Requires ``keep_full_precision=True``. | `None` |
| `filter_ids` | `set \| None` | When provided, restrict results to ids in the set. The implementation is cluster- and pool-aware: the probe ranking is restricted to clusters that contain at least one filter row (so sparse filters skip clusters entirely), and the row-level mask is applied before the top-k / rerank-pool selection (so the rerank candidate pool is drawn from the filtered subset, not from the unfiltered probe output). Unknown ids in ``filter_ids`` are ignored. | `None` |
|
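As a minimal sketch of the ADC step named above, assuming codebooks of shape ``(M, K, dsub)`` and per-vector codes of shape ``(N, M)`` (shapes inferred from the parameters; the actual internals may differ): build one squared-distance lookup table per subspace from the query residual, then score each stored vector by summing its M table entries.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, dsub, N = 4, 16, 2, 100
codebooks = rng.standard_normal((M, K, dsub)).astype(np.float32)
codes = rng.integers(0, K, size=(N, M))          # each row: M codebook ids

q = rng.standard_normal(M * dsub).astype(np.float32)   # query residual
qsub = q.reshape(M, 1, dsub)

# LUT[m, k] = squared distance from query subvector m to centroid k.
lut = ((qsub - codebooks) ** 2).sum(-1)                # (M, K)

# ADC score of each stored vector: sum its M table lookups.
dist = lut[np.arange(M), codes].sum(1)                 # (N,)
top5 = np.argsort(dist)[:5]
```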
search_batch ¶
search_batch(queries: NDArray[float32], k: int = 10, nprobe: int | None = None, num_threads: int = 1, filter_ids: set[Any] | None = None) -> list[list[tuple[Any, float]]]
Approximate top-k for a batch of queries.
Throughput-oriented sibling of ``search()``. Two things change:

- Coarse probe + LUT build run as one BLAS call each across the whole batch instead of B per-query matmuls. At B = 128, M = 192, K = 256 this alone is ~5× faster than looping ``search()``.
- Per-query gather + scoring can optionally fan out over ``num_threads`` worker threads. Unlike single-query threading (which competes with NumPy's internal BLAS pool), batch-level threading hands each thread a whole query's worth of work, so Python overhead is amortised over the query and the speedup actually shows up.
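The batched-LUT point can be sketched in numpy (shapes assumed from the class parameters; inner-product tables used for illustration): one contraction builds all B queries' ``(M, K)`` tables at once, replacing B separate per-query matmuls.

```python
import numpy as np

B, M, K, dsub = 8, 4, 16, 2
rng = np.random.default_rng(0)
codebooks = rng.standard_normal((M, K, dsub)).astype(np.float32)
queries = rng.standard_normal((B, M * dsub)).astype(np.float32)

qsub = queries.reshape(B, M, dsub)

# One batched contraction -> (B, M, K): luts[b, m, k] = <q_b^(m), c_mk>.
luts = np.einsum("bmd,mkd->bmk", qsub, codebooks)

# Equivalent per-query loop (B separate contractions), for comparison.
loop = np.stack([np.einsum("md,mkd->mk", qsub[b], codebooks)
                 for b in range(B)])
assert np.allclose(luts, loop)
```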
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `queries` | `NDArray[float32]` | Shape ``(B, dim)``. | *required* |
| `k` | `int` | As in ``search()``. | `10` |
| `nprobe` | `int \| None` | As in ``search()``. | `None` |
| `num_threads` | `int` | Worker threads for per-query scoring. | `1` |
| `filter_ids` | `set \| None` | Optional id whitelist shared by every query in the batch (typical use: tenant / partition scoping). Same cluster- and pool-aware semantics as in ``search()``. | `None` |
Returns:
| Type | Description |
|---|---|
| `list[list[tuple[Any, float]]]` | List of length B; each entry is the per-query top-k list of ``(id, score)`` pairs (same shape as ``search()``). Queries with zero norm return ``[]`` for that slot. |