
Module parser


Hand-rolled recursive-descent parser for the minimal BibTeX subset.

The grammar is intentionally tiny:

bibtex := ws* (entry ws*)*
entry  := '@' type '{' key (',' fields)? '}'
fields := field (',' field)* ','?
field  := name '=' value
value  := '{' .. '}' | '"' .. '"' | bare

Entry types and field names are lowercased; citation keys are kept verbatim. Braced values balance nested {} by naive counting, so {The {LaTeX} Companion} is captured whole. Value contents are stored as raw text: there is no TeX decoding, no @string / @preamble macro expansion, no # concatenation, and no name parsing.
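
The naive brace counting described above can be sketched roughly as follows. This is an illustration only, not the module's actual code: `scan_braced` and its signature are hypothetical, and like the description it assumes that every `{` and `}` counts, with no handling of escapes such as `\{`.

```rust
/// Hypothetical helper (not part of this module's API): given `src` and the
/// byte offset `open` of a `{`, return the raw text between that brace and
/// its matching `}`, plus the offset just past the closing brace.
/// Returns `None` if `open` is not a `{` or the braces never balance.
fn scan_braced(src: &str, open: usize) -> Option<(&str, usize)> {
    let bytes = src.as_bytes();
    if bytes.get(open) != Some(&b'{') {
        return None;
    }
    let mut depth = 0usize;
    for (i, &b) in bytes.iter().enumerate().skip(open) {
        match b {
            b'{' => depth += 1,
            b'}' => {
                // `depth` is at least 1 here because the scan starts on `{`
                // and returns as soon as the count drops back to zero.
                depth -= 1;
                if depth == 0 {
                    // Both slice ends are adjacent to ASCII braces, so the
                    // slice cannot split a multi-byte UTF-8 sequence.
                    return Some((&src[open + 1..i], i + 1));
                }
            }
            _ => {}
        }
    }
    None // ran out of input before the braces balanced
}
```

Under these assumptions, the nested braces around `LaTeX` stay in the returned text untouched, which matches the "stored as raw text" behavior described above.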

Structs

ParsedEntry 🔒
ParsedKey 🔒
Parser 🔒
A byte cursor over the BibTeX source. All structural delimiters (@ { } " , =) and whitespace are ASCII, so scanning byte-by-byte never splits a multi-byte UTF-8 sequence and every recorded offset lands on a char boundary.
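
To illustrate why byte-wise scanning stays on char boundaries, here is a minimal sketch of such a cursor. The type `Cursor` and the methods `take_until` and `eat` are assumptions made for this example; the private `Parser` struct may be organized quite differently.

```rust
/// Hypothetical stand-in for the private `Parser`; names are illustrative.
struct Cursor<'a> {
    src: &'a str,
    pos: usize, // byte offset into `src`, always on a char boundary
}

impl<'a> Cursor<'a> {
    fn new(src: &'a str) -> Self {
        Cursor { src, pos: 0 }
    }

    /// Advance to the next occurrence of the ASCII byte `delim` (or to the
    /// end of input) and return the text that was skipped over.
    fn take_until(&mut self, delim: u8) -> &'a str {
        debug_assert!(delim.is_ascii());
        let bytes = self.src.as_bytes();
        let start = self.pos;
        while self.pos < bytes.len() && bytes[self.pos] != delim {
            self.pos += 1;
        }
        // Every byte of a multi-byte UTF-8 sequence is >= 0x80, so stopping
        // on an ASCII byte (or at the end) always lands on a char boundary.
        &self.src[start..self.pos]
    }

    /// Consume the current byte if it equals the ASCII byte `expected`.
    fn eat(&mut self, expected: u8) -> bool {
        debug_assert!(expected.is_ascii());
        if self.src.as_bytes().get(self.pos) == Some(&expected) {
            self.pos += 1; // stepping past one ASCII byte keeps the boundary
            true
        } else {
            false
        }
    }
}
```

Because the cursor only ever stops on or steps past ASCII bytes, the `&self.src[start..self.pos]` slice cannot panic on a split character, even when the skipped text contains non-ASCII letters.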

Functions

is_bare_value_byte 🔒
Bytes allowed in a bare (unquoted, unbraced) value.
is_identifier_byte 🔒
Bytes allowed in an entry type or field name.
is_key_byte 🔒
Bytes allowed in a citation key: anything but a structural delimiter or whitespace.
parse_bibtex
Parse input as a minimal BibTeX database.
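
The three byte predicates listed above might look something like the sketch below. Only `is_key_byte` follows directly from its one-line description, using the delimiter set `@ { } " , =` named in the `Parser` summary. The exact byte sets for `is_identifier_byte` and `is_bare_value_byte` are not documented on this page, so the ones shown are assumptions, and `prefix_len` is a hypothetical helper added for the example.

```rust
/// Any byte except a structural delimiter or ASCII whitespace.
/// Non-ASCII bytes pass, so keys may contain multi-byte UTF-8 characters.
fn is_key_byte(b: u8) -> bool {
    !matches!(b, b'@' | b'{' | b'}' | b'"' | b',' | b'=') && !b.is_ascii_whitespace()
}

/// ASSUMPTION: entry types and field names are ASCII alphanumerics plus
/// `_` and `-`. The real set is not stated in the documentation.
fn is_identifier_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'_' | b'-')
}

/// ASSUMPTION: bare values are runs of ASCII alphanumerics, e.g. `1984`
/// or `jan`. The real set is not stated in the documentation.
fn is_bare_value_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric()
}

/// Hypothetical helper: byte length of the longest prefix of `src` whose
/// bytes all satisfy `pred`.
fn prefix_len(src: &str, pred: fn(u8) -> bool) -> usize {
    src.bytes().take_while(|&b| pred(b)).count()
}
```

With predicates of this shape, a scanner can take a token by advancing while the predicate holds, for instance reading the key `knuth:1984` up to the comma that follows it.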