FAISS Python API Tutorial: IndexFlatL2, IndexIVFFlat, add, search, and remove_ids

Practical FAISS Python API guide: install faiss-cpu, NumPy float32 shapes, IndexFlatL2 and IndexFlatIP, add and search, IndexIVFFlat training and nprobe, IndexIDMap and remove_ids, save/load indexes, common dtype and dimension errors, and an index-selection cheat sheet.


Reviewed by Deepak Prasad

This guide is a practical FAISS Python API reference: you build NumPy float32 embedding matrices, pick an index type, add / search, optionally attach IDs and remove_ids, train IVF indexes, tune nprobe, and save or load indexes on disk. It is not a generic vector-database survey; it stays close to dense vectors, shapes, dtypes, and the calls that break in real projects.

FAISS (Facebook AI Similarity Search) is optimized for similarity search and clustering over dense vectors—typical inputs are embedding rows from models (image/text/audio). Think vectors in, (distances, neighbor indices) out; optional custom IDs when you must map neighbors back to rows in your database. For ML taxonomy context, see types of machine learning; for NumPy matrix shaping before indexing, see combine two column matrices in Python.

Tested on: Python 3.10+ with NumPy and faiss-cpu (PyPI); Linux x86_64.


What is FAISS in Python?

FAISS stores d-dimensional vectors and answers k-nearest-neighbor queries under a metric (L2 or inner product). The Python API wraps the C++ core: you work with numpy.ndarray objects, call faiss.index_factory or constructors such as IndexFlatL2, IndexIVFFlat, then add, train (when required), and search. For RAG or semantic search, embeddings from a model become rows of a matrix (num_vectors, dim); FAISS does not run the model—it only searches vectors you pass in.


Install FAISS for Python

For CPU-only machines, install faiss-cpu from PyPI (this is the supported CPU wheel name; the bare faiss package name on PyPI is not the one you want for routine installs):

bash
pip install faiss-cpu numpy

GPU builds depend on CUDA version, driver, and platform; Conda-forge or Facebook’s build instructions often fit GPU stacks better than a one-line pip. If pip install faiss-gpu fails on your machine, treat GPU setup as a separate environment task and stay on faiss-cpu until the stack matches upstream wheels.

Quick sanity check:

bash
python -c "import faiss, numpy as np; print(faiss.__version__, np.__version__)"

FAISS Python API basic workflow

End-to-end pattern:

  1. Build database vectors xb as float32, shape (N, d).
  2. Create an index for dimension d (e.g. IndexFlatL2(d)).
  3. index.add(xb) (or add_with_ids on an IndexIDMap).
  4. Prepare queries xq as float32, shape (Q, d).
  5. D, I = index.search(xq, k) — distances D, neighbor positions or IDs I, both shape (Q, k).
python
import numpy as np
import faiss

d = 64
nb = 1000
nq = 10
np.random.seed(0)
xb = np.random.random((nb, d)).astype("float32")
xq = np.random.random((nq, d)).astype("float32")

index = faiss.IndexFlatL2(d)
index.add(xb)
k = 4
D, I = index.search(xq, k)
assert D.shape == (nq, k) and I.shape == (nq, k)

NumPy array requirements for FAISS

This section covers the NumPy shape and dtype problems that most often break IndexFlatL2 add and search calls.

  • 2-D only: xb.shape == (N, d) and xq.shape == (Q, d). A single query must still be (1, d), not (d,).
  • d matches the index: constructor faiss.IndexFlatL2(d) fixes d; add / search last dimension must equal index.d.
  • dtype: use numpy.float32 (astype("float32")). Depending on the FAISS version, float64 input either raises a type error or is copied to float32 on every call, so convert once up front.
  • C-contiguous: after slicing or transposing, call np.ascontiguousarray(x, dtype=np.float32) if you see errors about non-contiguous buffers.
python
import numpy as np

d = 8
bad_row = np.random.rand(d)          # shape (d,) — wrong for add
good = np.random.rand(1, d).astype("float32")  # shape (1, d) — OK for one query

Create a simple IndexFlatL2 index

IndexFlatL2 is the exact L2 (Euclidean) index: at search time it compares the query to all stored vectors. It is ideal for learning the API and for small or medium N where brute force is acceptable.

python
import faiss
d = 128
index = faiss.IndexFlatL2(d)

Add vectors to a FAISS index

index.add(xb) requires xb shape (N, d), float32, same d as the index. Typical failures: wrong dtype, 1-D array, transposed shape (d, N), or d mismatch.

python
import numpy as np
import faiss

d = 32
index = faiss.IndexFlatL2(d)
xb = np.random.random((500, d)).astype("float32")
index.add(xb)
print(index.ntotal)  # 500

Search vectors with FAISS

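search(xq, k) returns D, I: for each of the Q queries, the k smallest distances (for L2) and the indices of neighbors in the index (row order for a flat index without ID mapping). If k exceeds index.ntotal (including on an empty index), the search does not raise; the unfilled slots in I are set to -1 and the matching entries in D hold a large placeholder value, so filter on I >= 0 before using the results.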

python
import numpy as np
import faiss

d = 32
index = faiss.IndexFlatL2(d)
xb = np.random.random((500, d)).astype("float32")
index.add(xb)
k = 5
D, I = index.search(xb[:3], k)
assert D.shape == (3, k)

IndexFlatL2 vs IndexFlatIP

| Index | Metric | Typical use |
| --- | --- | --- |
| IndexFlatL2 | Squared L2 distance | Euclidean nearest neighbors |
| IndexFlatIP | Inner product | Maximum dot product; for cosine similarity, L2-normalize database and query rows to unit length (for example with faiss.normalize_L2), then use IP |
python
import faiss
d = 16
index_l2 = faiss.IndexFlatL2(d)
index_ip = faiss.IndexFlatIP(d)

Use IDs with IndexIDMap

Plain IndexFlatL2 assigns implicit sequential IDs 0 .. ntotal-1. To attach your own int64 IDs (database primary keys, chunk ids), wrap the base index with faiss.IndexIDMap and use add_with_ids.

python
import numpy as np
import faiss

d = 8
nb = 100
xb = np.random.random((nb, d)).astype("float32")
ids = (np.arange(nb) + 1000).astype("int64")  # custom IDs

base = faiss.IndexFlatL2(d)
index = faiss.IndexIDMap(base)
index.add_with_ids(xb, ids)

k = 3
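D, I = index.search(xb[:2], k)  # I holds the custom IDs (1000..1099), not row positions

Flat indexes such as IndexFlatL2 do not implement add_with_ids themselves, which is why the IndexIDMap wrapper is needed; IVF indexes such as IndexIVFFlat accept add_with_ids directly. Where neither applies, use add and track the mapping yourself.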


Remove vectors with remove_ids

remove_ids deletes vectors whose IDs match the selector and returns the number of vectors removed. On large flat structures, removal can scan storage; the wiki notes that removal patterns and performance depend on the index family. Check the return value and ntotal after removal to confirm.

python
import numpy as np
import faiss

d = 4
xb = np.random.random((20, d)).astype("float32")
ids = np.arange(200, 220, dtype="int64")
index = faiss.IndexIDMap(faiss.IndexFlatL2(d))
index.add_with_ids(xb, ids)
index.remove_ids(np.array([210], dtype="int64"))
print(index.ntotal)

If remove_ids appears to “do nothing,” confirm you used IndexIDMap, passed int64 IDs, and that IDs exist.


Create an IndexIVFFlat index

IndexIVFFlat is an IVF (inverted file) approximate index: vectors are partitioned into nlist lists for faster search at scale. Construction needs:

  • a quantizer (often IndexFlatL2(d)),
  • dimension d,
  • nlist (number of clusters / lists).
python
import numpy as np
import faiss

d = 16
nlist = 10
quantizer = faiss.IndexFlatL2(d)
index_ivf = faiss.IndexIVFFlat(quantizer, d, nlist)

Train IndexIVFFlat before adding vectors

IVF indexes must learn the partition structure. Call train(x_train) on a representative sample with the same d, usually float32, before add. Skipping train is a common runtime error.

python
import numpy as np
import faiss

d = 16
nlist = 8
xb = np.random.random((5000, d)).astype("float32")
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(xb)
index.add(xb)

At query time, index.nprobe controls how many IVF lists are visited (the default is 1). Higher nprobe improves recall but costs more distance computations and time. Start small (for example max(1, nlist // 32)) and increase until recall is acceptable on a held-out query set.

python
import numpy as np
import faiss

d, nlist = 16, 8
xb = np.random.random((2000, d)).astype("float32")
index = faiss.IndexIVFFlat(faiss.IndexFlatL2(d), d, nlist)
index.train(xb)
index.add(xb)
index.nprobe = 4
D, I = index.search(xb[:5], 10)

Save and load a FAISS index

Persist to disk with write_index / read_index (paths are local; encrypt or ACL-protect sensitive embedding stores).

python
import faiss
import tempfile
import os

d = 8
index = faiss.IndexFlatL2(d)
fd, path = tempfile.mkstemp(suffix=".index")
os.close(fd)
faiss.write_index(index, path)
loaded = faiss.read_index(path)
os.remove(path)
assert loaded.d == d

Common FAISS Python errors

| Symptom | Likely cause | What to check |
| --- | --- | --- |
| dtype / type error on add | float64 or object dtype | xb.astype("float32"), np.ascontiguousarray |
| add: dimension mismatch | Wrong trailing dimension | xb.shape[1] == index.d |
| search: wrong shape | Query passed as 1-D | Reshape to (1, d) or (Q, d) |
| IVF assert / train error | train not called or too few points | Call train before add; ensure enough vectors vs nlist |
| add_with_ids unsupported | Base index not wrapped | Use IndexIDMap / supported stack |
| remove_ids surprising result | Wrong ID type or ID not present | int64 IDs, verify membership |
| pip install faiss confusion | Wrong package name on PyPI | Prefer faiss-cpu for CPU wheels |
python
# Before add or search (vectors shape (N, d), float32):
assert vectors.shape[1] == index.d, (vectors.shape[1], index.d)

Which FAISS index should you use?

| Use case | Good starting index |
| --- | --- |
| Learning, exact L2, smaller N | IndexFlatL2 |
| Cosine-like with unit vectors | Normalize rows, then IndexFlatIP |
| Larger N, approximate search | IndexIVFFlat (train + nprobe) |
| Custom vector IDs | IndexIDMap / IndexIDMap2 + add_with_ids |
| Delete by custom ID | IDMap + remove_ids (check supported combinations) |
| Memory pressure at scale | PQ / IVFPQ and other compressed indexes (see wiki) |

FAISS Python API cheat sheet

| Task | Typical call |
| --- | --- |
| Install (CPU) | pip install faiss-cpu numpy |
| Flat L2 index | faiss.IndexFlatL2(d) |
| Add vectors | index.add(xb) with xb (N, d) float32 |
| Search | D, I = index.search(xq, k) |
| Inner product | faiss.IndexFlatIP(d) (often with normalized vectors) |
| Custom IDs | faiss.IndexIDMap(base) + add_with_ids |
| Remove IDs | index.remove_ids(...) |
| IVF index | faiss.IndexIVFFlat(quantizer, d, nlist) |
| Train IVF | index.train(x_train) before add |
| IVF recall / speed | Tune index.nprobe |
| Save / load | faiss.write_index, faiss.read_index |

Official references: FAISS GitHub, Wiki, Getting started.


Summary

This article positions FAISS as a Python API for dense vector search: float32 NumPy arrays shaped (n, d), IndexFlatL2 for exact L2 search, IndexFlatIP when inner product (often with normalized embeddings) matches your metric, IndexIVFFlat with train then add and nprobe for approximate IVF search, IndexIDMap with add_with_ids and remove_ids for application-level IDs, and write_index / read_index for persistence. The troubleshooting table maps common mistakes—dtype, shape, untrained IVF, and ID semantics—to quick checks. For broader ML context, see types of machine learning; for NumPy shaping habits, see NumPy column stacking.


Frequently Asked Questions

1. Why does FAISS expect float32 NumPy arrays?

The C++ core stores and compares vectors as contiguous float32, and the Python bindings hand NumPy buffers to it directly. Depending on the FAISS version, float64 input either raises an error or is copied to float32 on each call, so use .astype("float32") and C-contiguous 2-D arrays of shape (n, d).

2. Do I have to train IndexIVFFlat before add()?

Yes. IVF indexes cluster the vector space; call train() on a representative sample (same dimension as the index, typically float32) before add(), then set nprobe before search() for recall vs speed tradeoffs.

3. How do I delete vectors by ID in FAISS?

Wrap a base index in faiss.IndexIDMap (or use IndexIDMap2), add vectors with add_with_ids using int64 IDs, then call remove_ids with an IDSelector or an array of IDs to drop; not every index type supports removal the same way, so check the wiki for your stack.