Storage Model
TaiDB stores records in an append-only log. Writes add new records to the end of the file. Reads use an in-memory index to find the latest record for a key and then load the value from disk, using memory-mapped reads by default.
New databases start with a small file header that identifies the TaiDB storage
format version and feature flags. Headerless 0.1.x files still open, and
compaction or repack rewrites live records into the current headered format.
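As a rough illustration of how a reader could tell a headered file from a headerless 0.1.x file, the sketch below checks for a magic prefix before parsing a version and feature flags. The magic bytes, field widths, and the names `MAGIC`, `FileFormat`, `detect_format`, and `encode_header` are assumptions made for this example. They are not TaiDB's actual on-disk layout.

```rust
/// Hypothetical magic prefix. TaiDB's real header bytes are not documented here.
const MAGIC: &[u8; 4] = b"TAI\0";
/// Assumed layout: 4-byte magic, u16 version, u32 feature flags (little-endian).
const HEADER_LEN: usize = 10;

#[derive(Debug, PartialEq)]
pub enum FileFormat {
    /// Legacy file with no header: records start at offset 0.
    Headerless,
    /// Current format: records start right after the header.
    Headered { version: u16, flags: u32 },
}

/// Build a header in the assumed layout.
pub fn encode_header(version: u16, flags: u32) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN);
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(&version.to_le_bytes());
    out.extend_from_slice(&flags.to_le_bytes());
    out
}

/// Inspect the start of a file and return the detected format together with
/// the offset of the first record.
pub fn detect_format(bytes: &[u8]) -> (FileFormat, usize) {
    if bytes.len() >= HEADER_LEN && bytes.starts_with(MAGIC) {
        let version = u16::from_le_bytes([bytes[4], bytes[5]]);
        let flags = u32::from_le_bytes([bytes[6], bytes[7], bytes[8], bytes[9]]);
        (FileFormat::Headered { version, flags }, HEADER_LEN)
    } else {
        // No recognizable header: treat the file as a legacy headerless log.
        (FileFormat::Headerless, 0)
    }
}
```

In this toy layout, anything that does not begin with the magic prefix is treated as a legacy file. A real format would also need to ensure that a legacy record can never begin with the same bytes.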
Record lifecycle
- A write appends a record.
- The in-memory index points the key to the newest record.
- A delete appends a tombstone.
- Compaction rewrites only live records into a smaller file.
This model keeps writes straightforward and makes crash recovery inspectable. On open, TaiDB scans the file and rebuilds the in-memory index from the log.
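The lifecycle above can be sketched with a small in-memory model: a byte vector stands in for the file, and a hash map plays the role of the index. The record layout, the `ToyLog` type, and all of its method names are hypothetical and exist only for this example. TaiDB's real format and API may differ.

```rust
use std::collections::HashMap;

// Toy record layout (an assumption for this sketch, not TaiDB's format):
// [kind: u8][key_len: u32 LE][val_len: u32 LE][key bytes][value bytes]
const KIND_PUT: u8 = 1;
const KIND_TOMBSTONE: u8 = 2;
const RECORD_HEADER: usize = 9;

#[derive(Default)]
pub struct ToyLog {
    /// Stands in for the on-disk file.
    log: Vec<u8>,
    /// Key -> (offset, length) of the newest live value inside `log`.
    index: HashMap<Vec<u8>, (usize, usize)>,
}

fn read_u32(bytes: &[u8], at: usize) -> usize {
    u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]]) as usize
}

impl ToyLog {
    /// Append one record and return the offset where its value starts.
    fn append(&mut self, kind: u8, key: &[u8], value: &[u8]) -> usize {
        let value_offset = self.log.len() + RECORD_HEADER + key.len();
        self.log.push(kind);
        self.log.extend_from_slice(&(key.len() as u32).to_le_bytes());
        self.log.extend_from_slice(&(value.len() as u32).to_le_bytes());
        self.log.extend_from_slice(key);
        self.log.extend_from_slice(value);
        value_offset
    }

    /// A write appends a record and points the index at it.
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        let offset = self.append(KIND_PUT, key, value);
        self.index.insert(key.to_vec(), (offset, value.len()));
    }

    /// A delete appends a tombstone and drops the key from the index.
    pub fn delete(&mut self, key: &[u8]) {
        self.append(KIND_TOMBSTONE, key, &[]);
        self.index.remove(key);
    }

    /// A read consults the index, then slices the value out of the log.
    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        let &(offset, len) = self.index.get(key)?;
        Some(&self.log[offset..offset + len])
    }

    /// Rebuild the index by replaying every complete record in order.
    pub fn open(log: Vec<u8>) -> ToyLog {
        let mut index = HashMap::new();
        let mut pos = 0;
        while pos + RECORD_HEADER <= log.len() {
            let kind = log[pos];
            let val_start = pos + RECORD_HEADER + read_u32(&log, pos + 1);
            let val_len = read_u32(&log, pos + 5);
            let end = val_start + val_len;
            if end > log.len() {
                break; // incomplete trailing record: stop replaying
            }
            let key = log[pos + RECORD_HEADER..val_start].to_vec();
            if kind == KIND_PUT {
                index.insert(key, (val_start, val_len));
            } else {
                index.remove(&key);
            }
            pos = end;
        }
        ToyLog { log, index }
    }

    /// Hand back the raw log bytes, as if closing the file.
    pub fn into_bytes(self) -> Vec<u8> {
        self.log
    }
}
```

Because later records win during replay, reopening the log yields the same index that the original writer held in memory. Stale records stay in the log until compaction.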
Keys and values
Keys and values are byte-oriented at the low level. The high-level
TaiDbEngine exposes string-friendly helpers:
- put_text()
- get_text()
- put_bytes()
- get_bytes()
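To show how text helpers can sit on top of a byte-oriented store, here is a minimal sketch using a stand-in type. Only the four helper names come from this page. The `ToyEngine` type, the parameter types, and the return types are assumptions, so check the TaiDbEngine API reference for the real signatures.

```rust
use std::collections::HashMap;
use std::str::Utf8Error;

/// Hypothetical stand-in for an engine. It is backed by a plain map so the
/// example stays self-contained.
#[derive(Default)]
pub struct ToyEngine {
    store: HashMap<Vec<u8>, Vec<u8>>,
}

impl ToyEngine {
    /// Low-level write: keys and values are raw bytes.
    pub fn put_bytes(&mut self, key: &[u8], value: &[u8]) {
        self.store.insert(key.to_vec(), value.to_vec());
    }

    /// Low-level read: returns the stored bytes, if any.
    pub fn get_bytes(&self, key: &[u8]) -> Option<&[u8]> {
        self.store.get(key).map(|v| v.as_slice())
    }

    /// Text write: a thin wrapper that stores the UTF-8 bytes of both strings.
    pub fn put_text(&mut self, key: &str, value: &str) {
        self.put_bytes(key.as_bytes(), value.as_bytes());
    }

    /// Text read: `Ok(None)` for a missing key, `Err` if the stored bytes
    /// are not valid UTF-8.
    pub fn get_text(&self, key: &str) -> Result<Option<&str>, Utf8Error> {
        match self.get_bytes(key.as_bytes()) {
            Some(bytes) => std::str::from_utf8(bytes).map(Some),
            None => Ok(None),
        }
    }
}
```

In this sketch a value written with `put_text` can be read back with `get_bytes`, while reading arbitrary bytes as text can fail. That asymmetry is the main reason to keep both helper pairs.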
Vector records are stored separately from regular value records so vector search can operate on vector payloads without decoding unrelated values.
Memory-mapped reads
Memory-mapped reads are enabled by default. They reduce copying for read-heavy local workloads and work well for single-file embedded storage.
Disable mmap reads when you need direct file reads:
    let db = taidb::EngineConfig::new("./app.taidb")
        .mmap_reads(false)
        .open()?;
Or from the CLI:
    taidb --no-mmap-reads get ./app.taidb user:1
Compaction
Updates and deletes leave stale records in the log. Compaction rewrites live records into a compacted file.
    taidb compact ./app.taidb
Use compaction when:
- file_bytes is much larger than live_bytes.
- You deleted many keys.
- You updated many keys repeatedly.
- You want a smaller snapshot or backup.
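The idea behind compaction, and the file_bytes versus live_bytes check, can be sketched on a toy log of typed records. The `Record` enum, the size accounting, and the `should_compact` threshold are all assumptions for illustration. TaiDB does not necessarily expose or use anything like them.

```rust
use std::collections::HashMap;

/// Toy log entry. A real log would store encoded bytes, not an enum.
#[derive(Clone, Debug, PartialEq)]
pub enum Record {
    Put { key: String, value: Vec<u8> },
    Tombstone { key: String },
}

impl Record {
    /// Approximate size in this toy model: key bytes plus value bytes.
    fn size(&self) -> usize {
        match self {
            Record::Put { key, value } => key.len() + value.len(),
            Record::Tombstone { key } => key.len(),
        }
    }
}

/// Total bytes occupied by a log in this toy model.
pub fn total_bytes(log: &[Record]) -> usize {
    log.iter().map(Record::size).sum()
}

/// Keep only the newest `Put` for each key that has not been deleted,
/// preserving the original order of the surviving records.
pub fn compact(log: &[Record]) -> Vec<Record> {
    // First pass: remember the position of the newest record per key.
    let mut newest: HashMap<&str, usize> = HashMap::new();
    for (i, record) in log.iter().enumerate() {
        let key = match record {
            Record::Put { key, .. } | Record::Tombstone { key } => key.as_str(),
        };
        newest.insert(key, i);
    }
    // Second pass: a record is live if it is a `Put` and nothing newer
    // (another put or a tombstone) exists for the same key.
    log.iter()
        .enumerate()
        .filter(|(i, record)| match record {
            Record::Put { key, .. } => newest[key.as_str()] == *i,
            Record::Tombstone { .. } => false,
        })
        .map(|(_, record)| record.clone())
        .collect()
}

/// Hypothetical heuristic: compact once the file is more than
/// `max_amplification` times larger than its live data.
pub fn should_compact(file_bytes: usize, live_bytes: usize, max_amplification: f64) -> bool {
    (file_bytes as f64) > (live_bytes as f64) * max_amplification
}
```

A threshold such as 2.0 is only a placeholder. The right ratio depends on how much disk overhead you can tolerate compared with the cost of rewriting the file.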
Compaction is still being hardened ahead of 1.0.0. The roadmap calls for
interruption-safe compaction guarantees and documented space-amplification
behavior.
Crash recovery
TaiDB has tests for partial-tail recovery and corruption detection. Writable opens truncate recoverable partial trailing records. Read-only opens report a deterministic corruption error when recovery would require rewriting the file. Before relying on TaiDB for production data, read the roadmap items for batch atomicity and durability modes.
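The difference between writable and read-only opens can be sketched with a length-prefixed toy format. The framing, the `OpenError` type, and the function names below are assumptions for this example. The sketch only detects a truncated tail, whereas real corruption detection would typically also involve checksums.

```rust
#[derive(Debug, PartialEq)]
pub enum OpenError {
    /// A read-only open found an incomplete trailing record. `valid_len` is
    /// the number of bytes covered by complete records.
    PartialTail { valid_len: usize },
}

/// Toy framing (an assumption): each record is [len: u32 LE][payload].
/// Returns the length of the longest prefix made of complete records.
pub fn valid_prefix_len(file: &[u8]) -> usize {
    let mut pos = 0;
    while file.len() - pos >= 4 {
        let len =
            u32::from_le_bytes([file[pos], file[pos + 1], file[pos + 2], file[pos + 3]]) as usize;
        if file.len() - pos - 4 < len {
            break; // payload is cut short: this record is a partial tail
        }
        pos += 4 + len;
    }
    pos
}

/// Open the toy file. A writable open truncates a partial tail. A read-only
/// open leaves the bytes untouched and reports an error instead.
pub fn open_toy(file: &mut Vec<u8>, writable: bool) -> Result<usize, OpenError> {
    let valid_len = valid_prefix_len(file);
    if valid_len == file.len() {
        return Ok(valid_len); // clean file, nothing to recover
    }
    if writable {
        file.truncate(valid_len);
        Ok(valid_len)
    } else {
        Err(OpenError::PartialTail { valid_len })
    }
}
```

Returning the same error for the same bytes on every read-only open is what makes the failure deterministic in this sketch: the function never modifies its input unless `writable` is true.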
The 1.0.0 target is explicit:
- deterministic corruption errors
- format versioning
- migration tests
- documented crash behavior
- clear durability guarantees