FAISS Python API: IndexFlatL2, IndexIVFFlat, NumPy float32, add, search, remove

This guide is a practical FAISS Python API reference: you build NumPy float32 embedding matrices, pick an index type, add / search, optionally attach IDs and remove_ids, train IVF indexes, tune nprobe, and save or load indexes on disk. It is not a generic vector-database survey; it stays close to dense vectors, shapes, dtypes, and the calls that break in real projects.

FAISS (Facebook AI Similarity Search) is optimized for similarity search and clustering over dense vectors—typical inputs are embedding rows from models (image/text/audio). Think vectors in, (distances, neighbor indices) out; optional custom IDs when you must map neighbors back to rows in your database. For ML taxonomy context, see types of machine learning; for NumPy matrix shaping before indexing, see combine two column matrices in Python.

Tested on: Python 3.10+ with NumPy and faiss-cpu (PyPI); Linux x86_64.

What is FAISS in Python?

FAISS stores d-dimensional vectors and answers k-nearest-neighbor queries under a metric (L2 or inner product). The Python API wraps the C++ core: you work with numpy.ndarray objects, call faiss.index_factory or constructors such as IndexFlatL2, IndexIVFFlat, then add, train (when required), and search. For RAG or semantic search, embeddings from a model become rows of a matrix (num_vectors, dim); FAISS does not run the model—it only searches vectors you pass in.

Install FAISS for Python

For CPU-only machines, install faiss-cpu from PyPI (this is the supported CPU wheel name; the bare faiss package name on PyPI is not the one you want for routine installs):

bash

pip install faiss-cpu numpy

GPU builds depend on CUDA version, driver, and platform; Conda-forge or Facebook’s build instructions often fit GPU stacks better than a one-line pip. If pip install faiss-gpu fails on your machine, treat GPU setup as a separate environment task and stay on faiss-cpu until the stack matches upstream wheels.

Quick sanity check:

bash

python -c "import faiss, numpy as np; print(faiss.__version__, np.__version__)"

FAISS Python API basic workflow

End-to-end pattern:

Build database vectors xb as float32, shape (N, d).
Create an index for dimension d (e.g. IndexFlatL2(d)).
index.add(xb) (or add_with_ids on an IndexIDMap).
Prepare queries xq as float32, shape (Q, d).
D, I = index.search(xq, k) — distances D, neighbor positions or IDs I, both shape (Q, k).

python


import numpy as np
import faiss

d = 64
nb = 1000
nq = 10
np.random.seed(0)
xb = np.random.random((nb, d)).astype("float32")
xq = np.random.random((nq, d)).astype("float32")

index = faiss.IndexFlatL2(d)
index.add(xb)
k = 4
D, I = index.search(xq, k)
assert D.shape == (nq, k) and I.shape == (nq, k)

NumPy array requirements for FAISS

This section covers the IndexFlatL2 and NumPy float32 issues that most often cause add and search calls to fail.

2-D only: xb.shape == (N, d) and xq.shape == (Q, d). A single query must still be (1, d), not (d,).
d matches the index: constructor faiss.IndexFlatL2(d) fixes d; add / search last dimension must equal index.d.
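dtype: use numpy.float32 (astype("float32")). Depending on the FAISS version, float64 input is either rejected with a type error or silently copied to float32, so convert explicitly.
C-contiguous: slicing and transposing can produce non-contiguous views; call np.ascontiguousarray(x, dtype=np.float32) on them before add or search, especially if you see errors about non-contiguous buffers.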

python


import numpy as np

d = 8
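bad_row = np.random.rand(d)          # shape (d,), float64: wrong shape and dtype for add or search
good = np.random.rand(1, d).astype("float32")  # shape (1, d), float32: OK for one query
fixed = bad_row.reshape(1, -1).astype("float32")  # repair a single vector before passing it to FAISS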
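
The four requirements above can be folded into one coercion step. The helper below is a sketch only: as_faiss_matrix is a hypothetical name, not part of FAISS, and it uses nothing but NumPy.

```python
import numpy as np

def as_faiss_matrix(x, d):
    """Return x as a C-contiguous float32 array of shape (n, d). Hypothetical helper."""
    arr = np.asarray(x)
    if arr.ndim == 1:
        arr = arr.reshape(1, -1)  # a single vector becomes one row
    if arr.ndim != 2 or arr.shape[1] != d:
        raise ValueError(f"expected shape (n, {d}), got {arr.shape}")
    # Converts the dtype and copies only when the input is not already suitable.
    return np.ascontiguousarray(arr, dtype=np.float32)

d = 8
single = as_faiss_matrix(np.random.rand(d), d)                  # (d,) float64 -> (1, d) float32
sliced = as_faiss_matrix(np.random.rand(10, 2 * d)[:, ::2], d)  # strided view -> contiguous copy
```

Calling such a helper once at the boundary (right after the embedding model, right before FAISS) keeps shape and dtype checks out of the rest of the code.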

Create a simple IndexFlatL2 index

IndexFlatL2 is the exact L2 (Euclidean) index: at search time it compares the query to all stored vectors. It is ideal for learning the API and for small or medium N where brute force is acceptable.

python


import faiss
d = 128
index = faiss.IndexFlatL2(d)

Add vectors to a FAISS index

index.add(xb) requires xb shape (N, d), float32, same d as the index. Typical failures: wrong dtype, 1-D array, transposed shape (d, N), or d mismatch.

python


import numpy as np
import faiss

d = 32
index = faiss.IndexFlatL2(d)
xb = np.random.random((500, d)).astype("float32")
index.add(xb)
print(index.ntotal)  # 500

Search vectors with FAISS

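search(xq, k) returns D, I: for each of the Q queries, the k smallest distances (squared L2 for IndexFlatL2, sorted ascending) and the indices of the matching neighbors (row order for a flat index without ID mapping). k may exceed index.ntotal: FAISS then pads the missing slots in I with -1, so filter those out before using I to look up rows. For the same reason, an empty index returns only -1 labels rather than neighbors.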

python


import numpy as np
import faiss

d = 32
index = faiss.IndexFlatL2(d)
xb = np.random.random((500, d)).astype("float32")
index.add(xb)
k = 5
D, I = index.search(xb[:3], k)
assert D.shape == (3, k)

IndexFlatL2 vs IndexFlatIP

Index	Metric	Typical use
`IndexFlatL2`	Squared L2 distance	Euclidean nearest neighbors
`IndexFlatIP`	Inner product	Maximum dot-product; for cosine similarity, L2-normalize rows to unit length then use IP (or dedicated cosine preprocessing from the wiki)

python


import faiss
d = 16
index_l2 = faiss.IndexFlatL2(d)
index_ip = faiss.IndexFlatIP(d)

Use IDs with IndexIDMap

Plain IndexFlatL2 assigns implicit sequential IDs 0 .. ntotal-1. To attach your own int64 IDs (database primary keys, chunk ids), wrap the base index with faiss.IndexIDMap and use add_with_ids.

python


import numpy as np
import faiss

d = 8
nb = 100
xb = np.random.random((nb, d)).astype("float32")
ids = (np.arange(nb) + 1000).astype("int64")  # custom IDs

base = faiss.IndexFlatL2(d)
index = faiss.IndexIDMap(base)
index.add_with_ids(xb, ids)

k = 3
D, I = index.search(xb[:2], k)  # I holds the custom IDs (1000..1099), not row positions

add_with_ids only works on indexes that store IDs: the IndexIDMap / IndexIDMap2 wrappers and IVF indexes such as IndexIVFFlat. A bare IndexFlatL2 rejects it; there, use add and track the row-to-ID mapping yourself.

Remove vectors with remove_ids

remove_ids deletes the vectors whose IDs match the selector and returns how many were removed. On flat storage, removal scans and compacts the stored vectors, so its cost grows with the index size; the wiki notes that removal patterns and performance depend on the index family. Check the return value or ntotal after removal to confirm.

python


import numpy as np
import faiss

d = 4
xb = np.random.random((20, d)).astype("float32")
ids = np.arange(200, 220, dtype="int64")
index = faiss.IndexIDMap(faiss.IndexFlatL2(d))
index.add_with_ids(xb, ids)
index.remove_ids(np.array([210], dtype="int64"))
print(index.ntotal)  # 19

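If remove_ids appears to “do nothing,” confirm that the index is ID-mapped (IndexIDMap or similar), that the IDs are int64, and that they exist in the index; a return value of 0 means no stored ID matched.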

Create an IndexIVFFlat index

IndexIVFFlat is an IVF (inverted file) approximate index: vectors are partitioned into nlist lists for faster search at scale. Construction needs:

a quantizer (often IndexFlatL2(d)),
dimension d,
nlist (number of clusters / lists).

python


import numpy as np
import faiss

d = 16
nlist = 10
quantizer = faiss.IndexFlatL2(d)
index_ivf = faiss.IndexIVFFlat(quantizer, d, nlist)

Train IndexIVFFlat before adding vectors

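IVF indexes must learn the partition structure (the nlist centroids) before they can store vectors. Call train(x_train) on a representative float32 sample with the same d before add; index.is_trained reports whether that has happened. Skipping train is a common cause of runtime errors.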

python


import numpy as np
import faiss

d = 16
nlist = 8
xb = np.random.random((5000, d)).astype("float32")
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(xb)
index.add(xb)

Tune nprobe for IVF search

At query time, index.nprobe controls how many of the nlist IVF lists are visited; the default is 1. Higher nprobe improves recall but costs more distance computations and time, and nprobe = nlist scans every list. Start small (for example max(1, nlist // 32)) and increase until recall is acceptable on a held-out query set.

python


import numpy as np
import faiss

d, nlist = 16, 8
xb = np.random.random((2000, d)).astype("float32")
index = faiss.IndexIVFFlat(faiss.IndexFlatL2(d), d, nlist)
index.train(xb)
index.add(xb)
index.nprobe = 4
D, I = index.search(xb[:5], 10)

Save and load a FAISS index

Persist to disk with write_index / read_index (paths are local; encrypt or ACL-protect sensitive embedding stores).

python


import faiss
import tempfile
import os

d = 8
index = faiss.IndexFlatL2(d)
fd, path = tempfile.mkstemp(suffix=".index")
os.close(fd)
faiss.write_index(index, path)
loaded = faiss.read_index(path)
os.remove(path)
assert loaded.d == d

Common FAISS Python errors

Symptom	Likely cause	What to check
dtype / type error on `add`	`float64` or object dtype	`xb.astype("float32")`, `np.ascontiguousarray`
`add`: dimension mismatch	Wrong trailing dimension	`xb.shape[1] == index.d`
`search`: wrong shape	Query passed as 1-D	Reshape to `(1, d)` or `(Q, d)`
IVF assert / train error	`train` not called or too few points	Call `train` before `add`; train on at least `nlist` vectors (FAISS warns below roughly 39 per list)
`add_with_ids` unsupported	Base index not wrapped	Use `IndexIDMap` / supported stack
`remove_ids` surprising result	Wrong ID type or ID not present	`int64` IDs, verify membership
`pip install faiss` confusion	Wrong package name on PyPI	Prefer `faiss-cpu` for CPU wheels

python


# Before add or search (vectors shape (N, d), float32):
assert vectors.shape[1] == index.d, (vectors.shape[1], index.d)

Which FAISS index should you use?

Use case	Good starting index
Learning, exact L2, smaller `N`	`IndexFlatL2`
Cosine-like with unit vectors	Normalize rows, then `IndexFlatIP`
Larger `N`, approximate search	`IndexIVFFlat` (train + `nprobe`)
Custom vector IDs	`IndexIDMap` / `IndexIDMap2` + `add_with_ids`
Delete by custom ID	IDMap + `remove_ids` (check supported combinations)
Memory pressure at scale	PQ / IVFPQ and other compressed indexes (see wiki)

FAISS Python API cheat sheet

Task	Typical call
Install (CPU)	`pip install faiss-cpu numpy`
Flat L2 index	`faiss.IndexFlatL2(d)`
Add vectors	`index.add(xb)` with `xb` `(N,d)` `float32`
Search	`D, I = index.search(xq, k)`
Inner product	`faiss.IndexFlatIP(d)` (often with normalized vectors)
Custom IDs	`faiss.IndexIDMap(base)` + `add_with_ids`
Remove IDs	`index.remove_ids(...)`
IVF index	`faiss.IndexIVFFlat(quantizer, d, nlist)`
Train IVF	`index.train(x_train)` before `add`
IVF recall / speed	Tune `index.nprobe`
Save / load	`faiss.write_index`, `faiss.read_index`

Official references: FAISS GitHub, Wiki, Getting started.

Summary

This article positions FAISS as a Python API for dense vector search: float32 NumPy arrays shaped (n, d), IndexFlatL2 for exact L2 search, IndexFlatIP when inner product (often with normalized embeddings) matches your metric, IndexIVFFlat with train then add and nprobe for approximate IVF search, IndexIDMap with add_with_ids and remove_ids for application-level IDs, and write_index / read_index for persistence. The troubleshooting table maps common mistakes—dtype, shape, untrained IVF, and ID semantics—to quick checks. For broader ML context, see types of machine learning; for NumPy shaping habits, see NumPy column stacking.

Frequently Asked Questions

1. Why does FAISS expect float32 NumPy arrays?

The Python bindings pass vectors to the C++ layer as contiguous float32 buffers; depending on the FAISS version, float64 input is either rejected or copied to float32 on each call, so use .astype("float32") and C-contiguous 2-D arrays of shape (n, d).

2. Do I have to train IndexIVFFlat before add()?

Yes. IVF indexes cluster the vector space; call train() on a representative sample (same dimension as the index, typically float32) before add(), then set nprobe before search() for recall vs speed tradeoffs.

3. How do I delete vectors by ID in FAISS?

Wrap a base index in faiss.IndexIDMap (or use IndexIDMap2), add vectors with add_with_ids using int64 IDs, then call remove_ids with an IDSelector or an array of IDs to drop; not every index type supports removal the same way, so check the wiki for your stack.