Vector Search
TaiDB supports exact cosine vector search. Exact search is intentionally simple: it scans stored vectors and ranks them by cosine similarity. This keeps results deterministic and makes it a reliable correctness baseline.
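To make the scan-and-rank idea concrete, here is a minimal standalone sketch of exact cosine search over an in-memory slice. It does not use TaiDB; `cosine_similarity` and `exact_search` are hypothetical names chosen for illustration, and treating zero-length vectors as similarity 0.0 and breaking ties by key are assumptions of this sketch, not documented TaiDB behavior.

```rust
/// Cosine similarity of two equal-length vectors.
/// Assumption of this sketch: a zero-norm vector scores 0.0.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Score every stored vector against the query and keep the best `limit`.
/// (Hypothetical helper; an in-memory stand-in, not the TaiDB API.)
fn exact_search<'a>(
    store: &'a [(&'a str, Vec<f32>)],
    query: &[f32],
    limit: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = store
        .iter()
        .map(|(key, vector)| (*key, cosine_similarity(query, vector)))
        .collect();
    // Highest score first; ties fall back to key order so results stay deterministic.
    scored.sort_by(|l, r| r.1.total_cmp(&l.1).then_with(|| l.0.cmp(r.0)));
    scored.truncate(limit);
    scored
}
```

Because every vector is scored on every query, the cost grows linearly with the number of stored vectors and their dimension, which is the price paid for perfect recall.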
Store vectors with Rust
```rust
use taidb::EngineConfig;

fn main() -> taidb::Result<()> {
    // Open the database stored at ./vectors.taidb.
    let mut db = EngineConfig::new("./vectors.taidb").open()?;

    // Store two three-dimensional vectors under string keys.
    db.put_vector("doc:rust", &[0.2, 0.8, 0.1])?;
    db.put_vector("doc:ai", &[0.1, 0.9, 0.2])?;

    // Fetch the two best matches and print each key with its score.
    let hits = db.search_vector(&[0.0, 1.0, 0.2], 2)?;
    for hit in hits {
        println!("{}\t{:.6}", String::from_utf8_lossy(&hit.key), hit.score);
    }

    Ok(())
}
```
Store vectors with the CLI
```sh
taidb vector-put ./vectors.taidb doc:rust "0.2,0.8,0.1"
taidb vector-put ./vectors.taidb doc:ai "0.1,0.9,0.2"
taidb vector-search ./vectors.taidb "0.0,1.0,0.2" --limit 2
```
Embedding cache pattern
A common pattern is to store text metadata and vector records under related keys:
```
doc:123:title
doc:123:body
vec:doc:123
```
This keeps values and vectors easy to inspect, and easy to delete by prefix when the CLI or API supports that workflow.
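The key layout above can be sketched with a plain `BTreeMap` standing in for the store. Everything here is hypothetical and for illustration only: `record_keys`, `delete_by_prefix`, and `delete_document` are not TaiDB functions, and whether TaiDB offers prefix deletion is not assumed.

```rust
use std::collections::BTreeMap;

/// Build the title, body, and vector keys for one document id.
/// (Hypothetical helper mirroring the key layout in the text.)
fn record_keys(doc_id: &str) -> (String, String, String) {
    (
        format!("doc:{doc_id}:title"),
        format!("doc:{doc_id}:body"),
        format!("vec:doc:{doc_id}"),
    )
}

/// Remove every key that starts with `prefix`; returns how many were removed.
/// Keys sharing a prefix are contiguous in a sorted map, so the scan can stop early.
fn delete_by_prefix(store: &mut BTreeMap<String, Vec<u8>>, prefix: &str) -> usize {
    let doomed: Vec<String> = store
        .range(prefix.to_string()..)
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(key, _)| key.clone())
        .collect();
    for key in &doomed {
        store.remove(key);
    }
    doomed.len()
}

/// Delete the metadata and the vector for one document.
fn delete_document(store: &mut BTreeMap<String, Vec<u8>>, doc_id: &str) -> usize {
    // The trailing ':' keeps "doc:123:" from also matching "doc:1234:...".
    let removed = delete_by_prefix(store, &format!("doc:{doc_id}:"));
    // The vector lives under a different prefix, so it is removed by exact key.
    let (_, _, vector_key) = record_keys(doc_id);
    removed + usize::from(store.remove(&vector_key).is_some())
}
```

Note that the metadata keys and the vector key do not share a prefix in this layout, so a full cleanup takes one prefix delete plus one exact-key delete.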
Exact search tradeoffs
Exact search is useful when:
- the dataset is local and moderate in size
- deterministic ranking matters
- recall must be perfect
- you want simple correctness tests before adding an approximate index
Approximate nearest-neighbor search can be faster at larger scale, but it adds index maintenance, recall tradeoffs, and more operational complexity. TaiDB’s roadmap keeps exact search as the correctness baseline and only adds approximate indexing behind a feature flag after benchmarks justify it.
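One way to use exact search as a correctness baseline is to measure recall: the share of the exact top-k results that an approximate index also returns. The sketch below is a generic illustration under that definition; `recall_at_k` is a hypothetical helper, not part of TaiDB, and returning 1.0 for an empty baseline is a convention chosen here.

```rust
use std::collections::HashSet;

/// Fraction of the exact top-k keys that also appear in the approximate result.
/// (Hypothetical helper; an empty baseline counts as perfect recall by convention.)
fn recall_at_k(exact: &[&str], approximate: &[&str]) -> f64 {
    if exact.is_empty() {
        return 1.0;
    }
    // Order within the result lists is ignored; only membership counts.
    let found: HashSet<&str> = approximate.iter().copied().collect();
    let matched = exact.iter().filter(|key| found.contains(**key)).count();
    matched as f64 / exact.len() as f64
}
```

Exact search scores 1.0 against itself by definition, so any value below that quantifies what an approximate index gives up for speed.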
Practical guidance
- Keep vector dimensions consistent per collection.
- Store enough metadata to reconstruct or audit the source embedding.
- Measure query latency with your real vector count and dimension size.
- Use batch writes when importing many embeddings.
- Re-run vector search tests after compaction and encryption changes.
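As an illustration of the first and fourth points, a batch can be checked before it is written. This is a minimal sketch with a hypothetical `validate_batch` helper; it makes no assumption about how TaiDB itself handles mismatched dimensions or non-finite values.

```rust
/// Check that every vector in a batch has `expected_dim` components and only
/// finite values. On failure, returns the key of the first offending record.
/// (Hypothetical pre-import check, not a TaiDB function.)
fn validate_batch<'a>(
    batch: &'a [(String, Vec<f32>)],
    expected_dim: usize,
) -> Result<(), &'a str> {
    for (key, vector) in batch {
        let wrong_dim = vector.len() != expected_dim;
        // NaN or infinite components make cosine scores meaningless.
        let non_finite = vector.iter().any(|x| !x.is_finite());
        if wrong_dim || non_finite {
            return Err(key.as_str());
        }
    }
    Ok(())
}
```

Running a check like this before an import means a bad embedding is rejected with its key, rather than surfacing later as a confusing search result.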