Background pruner for the historical column families.
Pruning is a standalone Service rather than a framework
pipeline: it does not consume checkpoints from ingestion, it reads
the already-committed state and deletes data below a retention
floor. The shape mirrors the validator’s perpetual-store pruner
(a periodic background task) more than the indexer framework’s
per-pipeline prune hook — the deletions are data-driven (we walk
transaction effects to retract superseded object versions) and the
floor is a single value shared across every historical CF.
§What gets pruned
- Per-transaction CFs (transactions, effects, events, tx_metadata_by_seq): range-deleted over the pruned tx_seq range; the keys are contiguous big-endian tx_seq, so one range tombstone per CF clears the chunk.
- Per-checkpoint CFs (checkpoint_summary, checkpoint_contents): range-deleted over the pruned checkpoint range.
- Digest reverse indexes (tx_seq_by_digest, checkpoint_seq_by_digest): point-deleted; their keys are digests, so we collect them from the data being pruned (tx digests from each effects row, checkpoint digests from each summary) before deleting.
- objects history: point-deleted and effects-driven. Each pruned transaction’s modified_at_versions (superseded input versions) and all_tombstones (deleted / wrapped markers) are the exact (ObjectID, version) rows that are now dead. The latest live version is never an input to a pruned transaction, so it is preserved, along with the greatest object_version_by_checkpoint entry that resolves to it.
- object_version_by_checkpoint: retracted in lockstep with objects history. The same effects-driven walk issues a per-object range delete clearing every checkpoint-pinned entry below the superseding transaction’s checkpoint, plus a point delete of the tombstone entry when the object was removed. The retained set mirrors the objects versions kept, so the index never points at a pruned version.
- Ledger-history bitmaps (transaction_bitmap, event_bitmap): not deleted directly; advancing the shared tx_seq_floor lets their compaction filters drop fully-pruned buckets. We force a compaction once the floor advances so the eviction is prompt rather than waiting for a natural sweep.
The live-set-bounded indexes (object_by_owner, object_by_type,
balance, package_versions) and the tiny epochs CF are never
pruned.
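The per-transaction CFs can be cleared with a single range tombstone because big-endian encoding makes bytewise key order match numeric tx_seq order. The following is a minimal sketch of that property only; the helper names (tx_seq_key, prune_key_range, in_range) and the 8-byte key layout are assumptions for illustration, not the crate's actual API.

```rust
/// Encode a transaction sequence number as a big-endian key.
/// (Assumed layout: a bare 8-byte `tx_seq` with no prefix.)
fn tx_seq_key(tx_seq: u64) -> [u8; 8] {
    tx_seq.to_be_bytes()
}

/// Half-open key range `[start, end)` covering the pruned `tx_seq` values.
/// This is the pair of bounds one range tombstone would be issued with.
fn prune_key_range(start_tx_seq: u64, end_tx_seq: u64) -> ([u8; 8], [u8; 8]) {
    (tx_seq_key(start_tx_seq), tx_seq_key(end_tx_seq))
}

/// True when `key` falls inside the half-open range under bytewise
/// (lexicographic) comparison, which is how an ordered key-value store
/// compares keys by default.
fn in_range(key: &[u8; 8], range: &([u8; 8], [u8; 8])) -> bool {
    key >= &range.0 && key < &range.1
}
```

With little-endian keys the same range would not be contiguous: 255 encodes as a byte string that sorts after 256, so a single tombstone could not cover the chunk.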
§Floor, retention, and safety
Retention is epoch-based: the retention_epochs most-recent
epochs are retained, and the target floor is the start checkpoint
of the oldest retained epoch. The floor is then clamped so it
never advances past the oldest in-memory snapshot’s checkpoint:
point and range deletes are already invisible to a snapshot
(RocksDB pins the data a live snapshot references), but the bitmap
compaction filter physically removes buckets irrespective of
snapshots, so the clamp keeps every live snapshot’s advertised
available range valid even under an aggressively small retention.
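The target-floor rule described above can be sketched as a small pure function. This is an illustration under stated assumptions: the function name, the slice of per-epoch start checkpoints, and the Option-typed snapshot argument are hypothetical, and the behaviour for retention_epochs == 0 is a choice made for this sketch rather than something the pruner documents.

```rust
/// Compute the clamped target floor.
///
/// `epoch_start_checkpoints[i]` is the first checkpoint of epoch `i`, with
/// the most recent epoch last (hypothetical input shape).
/// `oldest_snapshot_checkpoint` is the checkpoint of the oldest live
/// in-memory snapshot, if any.
fn target_floor(
    epoch_start_checkpoints: &[u64],
    retention_epochs: usize,
    oldest_snapshot_checkpoint: Option<u64>,
) -> Option<u64> {
    // Index of the oldest epoch that must be retained. If fewer epochs exist
    // than the retention window, everything is retained (index 0).
    let oldest_retained = epoch_start_checkpoints
        .len()
        .saturating_sub(retention_epochs);

    // Returns None for an empty epoch list, and (in this sketch only) for
    // `retention_epochs == 0`, where the index is out of bounds.
    let unclamped = epoch_start_checkpoints.get(oldest_retained).copied()?;

    // Never advance past the oldest live snapshot, so the bitmap compaction
    // filter cannot drop buckets that a snapshot still advertises.
    Some(match oldest_snapshot_checkpoint {
        Some(snapshot) => unclamped.min(snapshot),
        None => unclamped,
    })
}
```

For example, with epochs starting at checkpoints 0, 100, 250, and 400 and a retention of two epochs, the unclamped target is 250; a live snapshot at checkpoint 180 lowers it to 180.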
Each tick advances the floor toward that target by at most
max_checkpoints_per_tick checkpoints (in max_chunk_checkpoints
atomic chunks), so a large backlog — for example when pruning is
first enabled on an old database — drains across many ticks rather
than one long blocking pass. The floor converges to the target
over subsequent ticks.
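One way to picture the per-tick budget is as a planner that splits the distance to the target into chunk-sized ranges. The sketch below is hypothetical: plan_tick is not a function in this module, and it assumes half-open checkpoint ranges and that the two limits are plain u64 values.

```rust
/// Plan the chunks for one tick as half-open checkpoint ranges `[start, end)`.
///
/// The tick advances at most `max_checkpoints_per_tick` checkpoints toward
/// `target_floor`, and each chunk spans at most `max_chunk_checkpoints`.
fn plan_tick(
    current_floor: u64,
    target_floor: u64,
    max_checkpoints_per_tick: u64,
    max_chunk_checkpoints: u64,
) -> Vec<(u64, u64)> {
    let mut chunks = Vec::new();
    // Nothing to do if the floor already reached the target, or if a zero
    // chunk size would make progress impossible.
    if max_chunk_checkpoints == 0 || target_floor <= current_floor {
        return chunks;
    }

    // Upper bound for this tick: the target, or the per-tick budget,
    // whichever comes first.
    let tick_end = target_floor.min(current_floor.saturating_add(max_checkpoints_per_tick));

    let mut start = current_floor;
    while start < tick_end {
        let end = tick_end.min(start.saturating_add(max_chunk_checkpoints));
        chunks.push((start, end));
        start = end;
    }
    chunks
}
```

With a floor at 0, a target of 1000, a per-tick budget of 250, and a chunk size of 100, one tick yields the ranges (0, 100), (100, 200), and (200, 250); later ticks pick up from 250.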
§Ordering and crash-safety
Each chunk stages all of its deletes plus the new
PruningWatermarks row into one atomic batch, commits, and only
then advances the in-memory bitmap floor. Because the watermark
row lives in the same batch as the deletes, a crash either loses
the whole chunk (re-pruned next run) or commits it wholesale;
there is no partial-delete-without-watermark state. Range and
point deletes are idempotent, so a re-run is harmless.
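The commit ordering can be modelled with a toy in-memory store. This is only a sketch of the invariant, not the pruner's implementation: Store, Batch, and prune_chunk are hypothetical names, a BTreeMap stands in for a column family, and a boolean flag simulates a crash before the commit.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the database: one keyed column family plus the
/// persisted watermark row.
#[derive(Default)]
struct Store {
    rows: BTreeMap<u64, String>,
    watermark: u64,
}

/// Everything a chunk stages: its deletes and the new watermark together.
struct Batch {
    delete_range: (u64, u64), // half-open `[start, end)`
    new_watermark: u64,
}

impl Store {
    /// Apply the whole batch at once; deletes and watermark never diverge.
    fn commit(&mut self, batch: Batch) {
        let (start, end) = batch.delete_range;
        self.rows.retain(|key, _| *key < start || *key >= end);
        self.watermark = batch.new_watermark;
    }
}

/// Prune one chunk up to `end`, then advance the in-memory floor.
/// `crash_before_commit` simulates losing the staged batch.
fn prune_chunk(store: &mut Store, in_memory_floor: &mut u64, end: u64, crash_before_commit: bool) {
    // Stage the deletes and the watermark row in the same batch.
    let batch = Batch {
        delete_range: (store.watermark, end),
        new_watermark: end,
    };
    if crash_before_commit {
        return; // the whole chunk is lost; the next run stages it again
    }
    store.commit(batch);
    // Only after the durable commit does the in-memory floor move.
    *in_memory_floor = end;
}
```

In this model a simulated crash leaves rows, watermark, and in-memory floor all unchanged, and committing the same batch twice gives the same result as committing it once.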
Structs§
- PrunerMetrics - Prometheus metrics for the pruner.
Functions§
- prune_history_cohort - Prune the embedded fullnode’s history cohort up to a floor supplied by the validator’s perpetual-store pruner.
- start_pruner - Start the background pruner as a Service.