ADR 0003: OPQ-P over OPQ-NP

Date: 2026-04-21
Status: accepted

Context

OPQ (Ge et al., 2013) learns an orthogonal rotation that balances per-subspace variance before PQ encoding. Orthogonality preserves inner products (<Rq, Rx> = q^T R^T R x = q^T x, since R^T R = I), so the rest of the search pipeline is unchanged.
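
A quick NumPy check of this identity, using the Q factor of a QR decomposition as an arbitrary orthogonal R (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 384
# The Q factor of the QR decomposition of a Gaussian matrix is orthogonal.
R, _ = np.linalg.qr(rng.standard_normal((d, d)))
q, x = rng.standard_normal(d), rng.standard_normal(d)
assert np.isclose(q @ x, (R @ q) @ (R @ x))  # <q, x> == <Rq, Rx>
```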

The original paper proposes two variants:

  • OPQ-NP (non-parametric). Alternating optimisation: freeze R and refit the PQ codebooks on X R; then freeze the codebooks and refit R as the orthogonal Procrustes solution (see the sketch after this list). Typically 5-10 outer iterations until convergence. Best reported recall.
  • OPQ-P (parametric). Assume a centred Gaussian source. The optimal R is then the eigenvector matrix of the covariance, sorted by eigenvalue descending, combined with a round-robin subspace allocation that spreads high-variance directions across subspaces. One eigendecomposition, no outer loop. Faster to fit, slightly lower recall.
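
For reference, the R half-step in OPQ-NP has a closed form as the orthogonal Procrustes problem. A minimal sketch, assuming Y holds the PQ reconstructions of the current rotated training data (procrustes_rotation is a hypothetical name, not part of this library):

```python
import numpy as np

def procrustes_rotation(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Orthogonal R minimising ||X @ R - Y||_F (one OPQ-NP half-step).

    Hypothetical helper for illustration: X is the (n, d) training data,
    Y the (n, d) PQ reconstructions of the current X @ R. The closed-form
    solution is R = U @ Vt for the SVD  U S Vt = X.T @ Y.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt
```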

On BEIR FIQA (BGE-small, dim=384), measured by us at three values of M:

M     d_sub   PQ baseline   OPQ-P   delta
48    8       0.553         0.656   +10.3 pp
96    4       0.767         0.812   +4.6 pp
192   2       0.932         0.931   0

Published OPQ-NP numbers on comparable setups report an additional 0.3-0.8 percentage points on top of OPQ-P, at the cost of the iterative fit loop: minutes instead of seconds at N = 100k-1M.

Decision

Ship OPQ-P. The flag is use_opq on both PQSnapIndex and IVFPQSnapIndex; it defaults to False.
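
A usage sketch; the use_opq flag and fit() come from this ADR, while the module path and the dim/M constructor arguments are assumptions for illustration:

```python
import numpy as np
from pqsnap import PQSnapIndex  # assumed module path

# dim and M are assumed constructor arguments; use_opq is the flag this ADR describes.
index = PQSnapIndex(dim=384, M=96, use_opq=True)

train = np.random.default_rng(0).standard_normal((100_000, 384)).astype(np.float32)
index.fit(train)  # with use_opq=True, fit learns the rotation before PQ training
```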

fit_opq_rotation does:

  1. Centre X (per-chunk, float32 arithmetic; ADR 0005).
  2. Accumulate (d, d) covariance in float64 chunks.
  3. np.linalg.eigh on the (d, d) covariance.
  4. Sort eigenvectors by eigenvalue descending.
  5. Round-robin to M subspaces of size d / M: eigenvector i goes to subspace i % M. Spreads high-variance directions across subspaces.
  6. Stack the allocated eigenvectors column-wise and return the result as a (d, d) float32 rotation.
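
A minimal NumPy sketch of these six steps (illustrative, not the shipped implementation; the function name, signature, and chunk size are assumptions):

```python
import numpy as np

def fit_opq_rotation_sketch(X: np.ndarray, M: int, chunk: int = 65_536) -> np.ndarray:
    """Illustrative sketch of the six steps above; not the shipped function.

    X: (n, d) float training vectors, M: number of PQ subspaces, d % M == 0.
    """
    n, d = X.shape
    assert d % M == 0, "d must be divisible by M"

    # Steps 1-2: centre chunk-by-chunk in float32 (ADR 0005), accumulate the
    # (d, d) covariance in float64.
    mean32 = X.mean(axis=0, dtype=np.float64).astype(np.float32)
    cov = np.zeros((d, d), dtype=np.float64)
    for start in range(0, n, chunk):
        Xc = X[start:start + chunk].astype(np.float32) - mean32
        Xc64 = Xc.astype(np.float64)
        cov += Xc64.T @ Xc64
    cov /= n

    # Steps 3-4: eigh returns eigenvalues ascending; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]

    # Step 5: eigenvector i goes to slot i // M of subspace i % M, so each
    # subspace of width d // M gets a mix of high- and low-variance directions.
    d_sub = d // M
    cols = np.empty(d, dtype=np.intp)
    for i in range(d):
        cols[(i % M) * d_sub + i // M] = i

    # Step 6: stack the reordered eigenvectors as columns of a (d, d) float32
    # rotation; downstream code applies it as X @ R before PQ training.
    return np.ascontiguousarray(eigvecs[:, cols], dtype=np.float32)
```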

Consequences

  • fit() stays in the seconds range for the sizes we target (10k-1M training rows). A library positioned as "laptop-local ANN with predictable latency" would be poorly served by an iterative outer loop that turns fit into a multi-minute job.
  • We leave 0.3-0.8 pp of recall on the table at configurations where OPQ already helps. On configurations where OPQ-P gives zero gain (d_sub < 4), OPQ-NP also gives zero gain; neither variant has structure to exploit.
  • The parametric assumption (centred Gaussian source) is violated on real embeddings, but in the regime where OPQ helps at all (d_sub >= 4), OPQ-P captures most of the available gain.
  • OPQ-NP remains a possible future addition. It would be a separate fit function, not a flag on fit_opq_rotation, because its outer loop interleaves with PQ codebook training. If added, it supersedes neither this ADR nor the existing use_opq semantics -- it is a strictly additional option.
  • Anyone tempted to "just add a few iterations" to the current parametric path should know that is not what makes OPQ-NP work. The non-parametric variant also refits the PQ codebooks at each iteration; a rotation-only loop converges to the OPQ-P solution.