Content-hash boundary for bibliography inputs (manifest §7, §32; design
note docs/incremental-dependencies.md §4.1).
A future incremental build needs to answer “did this .bib change?” before
it can decide whether cached citation data is stale.
bibliography_content_hash supplies the boundary half of that answer:
it pins exactly which bytes feed a bibliography source’s content hash, so
two builds of the same file converge on the same [ContentHash] and any
edit diverges. The path-shaped identity half lives next door in
mos_cache::DependencyId::Bibliography, and the two are paired by
mos_cache::BibliographyDependency.
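As a rough illustration of the staleness question above, the sketch below compares a cached hash against a freshly computed one. `Hash128`, `is_stale`, and `toy_hash` are hypothetical names invented for this example; they stand in for `ContentHash` and `bibliography_content_hash` and are not part of `mos_core` or `mos_cache`.

```rust
/// Hypothetical stand-in for the real `ContentHash` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Hash128(pub u128);

/// Returns `true` when cached citation data should be treated as stale.
/// `hash` is whatever boundary function maps raw bytes to a content hash;
/// it is passed in so the sketch does not assume any real signature
/// beyond `&[u8] -> hash`.
pub fn is_stale(
    cached: Option<Hash128>,
    current_bytes: &[u8],
    hash: impl Fn(&[u8]) -> Hash128,
) -> bool {
    match cached {
        // A previous build recorded a hash: stale only if the bytes diverged.
        Some(previous) => previous != hash(current_bytes),
        // Nothing cached yet: there is no data to reuse.
        None => true,
    }
}

/// Toy hash used only to exercise `is_stale`; NOT the real boundary hasher.
pub fn toy_hash(bytes: &[u8]) -> Hash128 {
    let mut acc: u128 = 0;
    for &b in bytes {
        acc = acc.wrapping_mul(257).wrapping_add(u128::from(b) + 1);
    }
    Hash128(acc)
}
```

Only the file's bytes enter the comparison, so touching the file without changing its content would not invalidate the cache in this sketch.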
§Hash boundary (design note §4.1)
    BibliographyContentHash = H(
        engine_version,  // stamped by ContentHasher::new()
        domain_tag,      // distinguishes this boundary from other H(...)
        file_bytes       // raw bytes as read, byte-for-byte, no normalization
    )

The bytes are hashed raw: no NFC, no line-ending fold, no BOM strip. That mirrors §4.1; the parser does not normalize source today, so the content hash must reflect what the parser actually consumed, or the cache would “forget” cosmetic edits the parser is sensitive to.

Filesystem-derived data (mtime, inode, absolute path) is deliberately not an input.
H is [mos_core::ContentHasher], the shared, engine-version-stamped,
length-framed FNV-1a-128 boundary hasher (interim; swappable to BLAKE3 per
§9.4 without changing this &[u8] -> ContentHash signature). This boundary
just supplies the domain tag and the raw bytes.
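To make the shape of this boundary concrete, here is a minimal sketch of a length-framed, domain-tagged FNV-1a-128 hash over raw bytes. It is not the `mos_core::ContentHasher` implementation: the framing (a little-endian `u64` length before each field), the `ENGINE_VERSION` and `DOMAIN_TAG` values, the `u128` return type, and every function name are assumptions made for illustration. Only the FNV-1a-128 offset basis and prime are the standard published constants.

```rust
// Standard FNV-1a 128-bit parameters.
const FNV_OFFSET: u128 = 0x6c62272e07bb014262b821756295c58d;
const FNV_PRIME: u128 = 0x0000000001000000000000000000013b;

// Hypothetical values; the real engine version and domain tag are not shown here.
const ENGINE_VERSION: &str = "0.0.0-example";
const DOMAIN_TAG: &str = "example/bibliography/v1";

/// Plain FNV-1a step: XOR each byte into the state, then multiply by the prime.
fn fnv1a_update(mut state: u128, bytes: &[u8]) -> u128 {
    for &b in bytes {
        state ^= u128::from(b);
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Length framing (assumed layout): hash the field length first, so that
/// ("ab", "c") and ("a", "bc") feed different byte streams.
fn write_framed(state: u128, field: &[u8]) -> u128 {
    let len = (field.len() as u64).to_le_bytes();
    fnv1a_update(fnv1a_update(state, &len), field)
}

/// H(engine_version, domain_tag, bytes) for an arbitrary boundary.
pub fn boundary_hash(domain_tag: &str, bytes: &[u8]) -> u128 {
    let mut state = FNV_OFFSET;
    state = write_framed(state, ENGINE_VERSION.as_bytes());
    state = write_framed(state, domain_tag.as_bytes());
    // Raw bytes: no NFC, no line-ending fold, no BOM strip.
    write_framed(state, bytes)
}

/// Sketch of the bibliography boundary: supplies only the tag and the bytes.
pub fn bibliography_content_hash_sketch(file_bytes: &[u8]) -> u128 {
    boundary_hash(DOMAIN_TAG, file_bytes)
}
```

Because nothing is normalized, a file saved with CRLF line endings hashes differently from the same text saved with LF, which is the behavior described above. Swapping FNV-1a for another hash would only change `fnv1a_update` and the constants; the `&[u8]` input side stays the same.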
Constants§
- DOMAIN_TAG 🔒 - Domain separator: keeps this boundary’s hashes from colliding with any other H(...) boundary that happens to feed identical bytes. The trailing /v1 versions the framing, independently of engine_version.
Functions§
- bibliography_content_hash - Compute the content-hash boundary for one bibliography source’s raw bytes.