Module content


Content-hash boundary for bibliography inputs (manifest §7, §32; design note docs/incremental-dependencies.md §4.1).

A future incremental build needs to answer “did this .bib change?” before it can decide whether cached citation data is stale. bibliography_content_hash supplies the boundary half of that answer: it pins exactly which bytes feed a bibliography source’s content hash, so that two builds of the same file produce the same [ContentHash] and any edit to those bytes produces a different one. The path-shaped identity half lives next door in mos_cache::DependencyId::Bibliography, and the two are paired by mos_cache::BibliographyDependency.

§Hash boundary (design note §4.1)

BibliographyContentHash = H(
    engine_version,               // stamped by ContentHasher::new()
    domain_tag,                   // distinguishes this boundary from other H(...)
    file_bytes                    // raw bytes as read, byte-for-byte, no normalization
)

The bytes are hashed raw: no NFC, no line-ending fold, no BOM strip. That mirrors §4.1; the parser does not normalize source today, so the content hash must reflect what the parser actually consumed, or the cache would “forget” cosmetic edits the parser is sensitive to. Filesystem-derived data (mtime, inode, absolute path) is deliberately not an input.
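To make the "no normalization" rule concrete, here is a minimal sketch. It assumes a stand-in 64-bit FNV-1a function (`fnv1a_64`) and a hypothetical `fold_crlf` helper; neither is part of this module, and the real boundary uses mos_core::ContentHasher. It shows that hashing raw bytes distinguishes LF, CRLF, and BOM-prefixed variants of the same entry, while a line-ending fold would collapse two of them into one hash.

```rust
// Sketch only: `fnv1a_64` and `fold_crlf` are hypothetical helpers for illustration,
// not functions exported by this module or by mos_core.

/// Stand-in 64-bit FNV-1a over raw bytes: XOR each byte in, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut state: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for &byte in bytes {
        state ^= u64::from(byte);
        state = state.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
    }
    state
}

/// The kind of normalization the boundary deliberately does NOT apply:
/// rewrite every CRLF pair as a single LF.
fn fold_crlf(bytes: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(bytes.len());
    for (i, &byte) in bytes.iter().enumerate() {
        // Drop a CR only when it is immediately followed by LF.
        if byte == b'\r' && bytes.get(i + 1) == Some(&b'\n') {
            continue;
        }
        out.push(byte);
    }
    out
}
```

With raw hashing, `fnv1a_64(b"@misc{k,\n}\n")` and `fnv1a_64(b"@misc{k,\r\n}\r\n")` differ, so a line-ending change invalidates the cache entry. Hashing `fold_crlf(...)` of the second input instead would reproduce the first hash, which is exactly the "forgotten" edit the paragraph above warns about.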

H is [mos_core::ContentHasher], the shared boundary hasher: an engine-version-stamped, length-framed FNV-1a-128 (an interim choice, swappable for BLAKE3 per §9.4 without changing this &[u8] -> ContentHash signature). This boundary supplies only the domain tag and the raw bytes.
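The following is a rough sketch of what "engine-version-stamped, length-framed FNV-1a-128" could look like, not the actual mos_core::ContentHasher implementation. The function names, the little-endian u64 length prefix, the use of u128 as the output type, and any tag or version strings passed in are all assumptions made for illustration.

```rust
// Sketch only: every name and the exact framing layout here are hypothetical.
const FNV128_OFFSET_BASIS: u128 = 0x6c62272e07bb014262b821756295c58d;
const FNV128_PRIME: u128 = 0x0000000001000000000000000000013b;

/// Fold bytes into an FNV-1a-128 state: XOR each byte in, then multiply by the prime.
fn fnv1a_128_update(mut state: u128, bytes: &[u8]) -> u128 {
    for &byte in bytes {
        state ^= u128::from(byte);
        state = state.wrapping_mul(FNV128_PRIME);
    }
    state
}

/// Hash a sequence of fields. When `length_framed` is true, each field is
/// preceded by its length as a little-endian u64 (an assumed framing).
fn hash_fields(fields: &[&[u8]], length_framed: bool) -> u128 {
    let mut state = FNV128_OFFSET_BASIS;
    for field in fields {
        if length_framed {
            state = fnv1a_128_update(state, &(field.len() as u64).to_le_bytes());
        }
        state = fnv1a_128_update(state, field);
    }
    state
}

/// H(engine_version, domain_tag, file_bytes), in the order given by the boundary definition.
fn sketch_boundary_hash(engine_version: &str, domain_tag: &str, file_bytes: &[u8]) -> u128 {
    hash_fields(
        &[engine_version.as_bytes(), domain_tag.as_bytes(), file_bytes],
        true,
    )
}
```

Length framing matters because plain concatenation cannot tell where one field ends and the next begins: `hash_fields(&["ab".as_bytes(), "c".as_bytes()], false)` equals `hash_fields(&["a".as_bytes(), "bc".as_bytes()], false)`, since both feed the bytes `abc`. With framing enabled the two inputs feed different byte streams. The same sketch also shows why the domain tag and engine version are inputs: changing either one changes the result even when `file_bytes` is identical.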

Constants§

DOMAIN_TAG 🔒
Domain separator: keeps this boundary’s hashes from colliding with any other H(...) boundary that happens to feed identical bytes. The trailing /v1 versions the framing, independently of engine_version.

Functions§

bibliography_content_hash
Compute the content-hash boundary for one bibliography source’s raw bytes.