Hyphae · preprint published · DOI 10.5281/zenodo.20436643

Verifiable provenance
for grounded retrieval.

A provenance layer, not a brain. Hyphae answers by emitting byte-identical quotations of stored fragments over a SHA-256 hash-chained journal — so every span is auditable back to a named, unaltered source. An external Ed25519 anchor closes the tampering gap. No large language model sits in the cognition path.

Lang. Rust · CPU only · single binary
Lic. Apache-2.0 (code) · CC-BY-4.0 (docs)
Lead Mario Gutiérrez · Celiums AI
Status preprint + code public
Hyphae is
  • A verifiable provenance layer for grounded retrieval.
  • Byte-identical quotation — every answer span is a verbatim copy of a stored fragment.
  • Independently auditable: each span traces back to a named, unaltered source.
  • Tamper-evident over a SHA-256 hash chain, with an external Ed25519 head anchor.
  • Realizer-independent — an addable layer any extractive retriever can adopt.
Hyphae is not
  • A better answer-quality engine than LLMs — a trivial echo baseline ties it.
  • A claim that hash chaining is novel; the chain is classical (Haber–Stornetta, Merkle, CT).
  • "Cognition without transformers" as a value proposition.
  • An LLM replacement for open-ended generation.
  • Dependent on GPU clusters, accelerators or inference-time hardware.
01 — The problem & the honest core

Paraphrase breaks the binding to its source.

When a language model paraphrases a retrieved passage, the output can no longer be bound to its source byte-for-byte. The wording drifts; the citation becomes a gesture, not a guarantee. For grounded retrieval — where the whole value is that an answer is traceable to a real, unaltered source — that drift is the failure mode.

Hyphae's answer is deliberately narrow: emit the source verbatim, write it over a hash-chained journal, and let anyone audit the binding independently. The hash chain itself is classical — Haber–Stornetta (1991), Merkle (1988), Certificate Transparency, git. The contribution is the application to grounded retrieval, and the measurement.

The echo control · the paper's spine

A trivial echo baseline — a few lines that just print the retrieved fragment back — ties Hyphae on every correctness and grounding metric. So does echo + journal. That is the point, not a weakness: correctness and grounding are properties of verbatim quotation, not of any one system. Provenance is an addable layer any extractive retriever can adopt. Hyphae is a reference realization of it.
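To make the echo control concrete, here is a minimal sketch of what such a baseline could look like. The word-overlap retriever, the in-memory store, and the names `FRAGMENTS`, `retrieve` and `echo_answer` are illustrative assumptions, not Hyphae's retriever or the paper's baseline code.

```python
# Hypothetical in-memory fragment store: fragment id -> stored bytes.
FRAGMENTS = {
    "doc-a#1": b"The chain is classical; the application is the point.",
    "doc-b#4": b"An external anchor closes the tampering gap.",
}


def retrieve(query: str) -> str:
    """Toy retriever (an assumption): pick the fragment sharing the most words with the query."""
    query_words = set(query.lower().split())

    def overlap(fragment_id: str) -> int:
        words = set(FRAGMENTS[fragment_id].decode("utf-8").lower().split())
        return len(query_words & words)

    # sorted() makes tie-breaking deterministic.
    return max(sorted(FRAGMENTS), key=overlap)


def echo_answer(query: str) -> tuple[str, bytes]:
    """Echo baseline: return the retrieved fragment unchanged, together with its id."""
    fragment_id = retrieve(query)
    return fragment_id, FRAGMENTS[fragment_id]


source_id, span = echo_answer("what closes the tampering gap")
```

Because the answer is the stored bytes themselves, any correctness or grounding score computed on `span` is inherited from the retriever and from verbatim quotation, not from the surrounding system.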

A paraphrase can't be bound to its source byte-for-byte. A verbatim quotation can, and a hash chain makes that binding independently auditable.
Hash-Chained Verbatim Quotation · the contribution in one line
02 — The provenance layer

Five mechanisms, each independently falsifiable.

Verifiable provenance is built from five layered mechanisms. Each can be tested on its own, and each can fail without invalidating the others. Together they make every answer span auditable back to a named, unaltered source — and they hold against an attacker who knows the chain is there, rewrites it, rolls it back, withholds entries, or steals a retired key.

Mechanism 01 · Emission
Byte-identical quotation.

Every answer span is a verbatim copy of a stored fragment — no paraphrase, no rewording. The output is bindable to its source byte-for-byte, which is exactly what an LLM's paraphrastic generation destroys.

Falsifiable — any non-verbatim span breaks it
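The falsification test for emission can be sketched as a byte-level containment check. This sketch assumes a span may be a contiguous sub-range of its cited fragment (swap `in` for `==` if spans must be whole fragments); the function names and the sample store are hypothetical.

```python
def is_verbatim(span: bytes, fragment: bytes) -> bool:
    """True when `span` occurs inside `fragment` byte-for-byte."""
    return len(span) > 0 and span in fragment


def non_verbatim_spans(spans: list[tuple[str, bytes]], store: dict[str, bytes]) -> list[int]:
    """Indices of spans that are not verbatim copies of the fragment they cite."""
    bad = []
    for index, (fragment_id, span) in enumerate(spans):
        fragment = store.get(fragment_id)
        if fragment is None or not is_verbatim(span, fragment):
            bad.append(index)
    return bad


store = {"src-1": "Gutiérrez wrote the caf\u00e9 note.".encode("utf-8")}
spans = [
    ("src-1", "the caf\u00e9 note".encode("utf-8")),   # same bytes as the source
    ("src-1", "the cafe\u0301 note".encode("utf-8")),  # looks identical, different bytes
    ("src-9", b"anything"),                            # cites an unknown fragment
]
failures = non_verbatim_spans(spans, store)
```

The second span renders the same on screen but uses a decomposed accent, so its bytes differ from the stored fragment. A byte-level check rejects it, which is the strictness the guarantee relies on.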
Mechanism 02 · Journal
SHA-256 hash-chained log.

Fragments and emissions are written to an append-only journal where each entry commits to the previous entry's hash. Store-only tampering across ten modes is detected and localized. The chain is classical; the application is the point.

Falsifiable — a missed store-only edit breaks it
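A hash-chained journal of this kind can be sketched in a few lines. The entry layout, the all-zero genesis value and the function names below are assumptions for illustration; Hyphae's actual on-disk format is not described here.

```python
import hashlib
from typing import Optional

GENESIS = "0" * 64  # assumed sentinel standing in for "no previous entry"


def entry_hash(prev_hash: str, payload: bytes) -> str:
    """SHA-256 over the previous entry's hash followed by this entry's payload."""
    return hashlib.sha256(bytes.fromhex(prev_hash) + payload).hexdigest()


def append(journal: list[dict], payload: bytes) -> None:
    """Append an entry that commits to the hash of the entry before it."""
    prev = journal[-1]["hash"] if journal else GENESIS
    journal.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def first_broken(journal: list[dict]) -> Optional[int]:
    """Index of the first entry that fails verification, or None if the chain is intact."""
    prev = GENESIS
    for index, entry in enumerate(journal):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return index
        prev = entry["hash"]
    return None


texts = (b"fragment: alpha", b"emission: alpha", b"fragment: beta")

journal: list[dict] = []
for text in texts:
    append(journal, text)
intact = first_broken(journal)

journal[1]["payload"] = b"emission: ALPHA"  # in-place edit by a store-only adversary
edited = first_broken(journal)

shorter: list[dict] = []
for text in texts:
    append(shorter, text)
del shorter[1]                              # deletion from the middle
deleted = first_broken(shorter)
```

Returning the index rather than a boolean is what "localized" means in practice: the verifier reports where the chain first stops agreeing with itself.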
Mechanism 03 · Anchor
External Ed25519 head anchor.

A bare chain falls to a chain-aware adversary who rewrites history and re-hashes. An Ed25519 signature over the chain head, with the key held outside the store process, closes that gap: the rewritten head no longer matches the signed one, so the rewrite is caught when it is attempted.

Falsifiable — an undetected chain rewrite breaks it
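The sketch below shows why an external anchor catches a rewrite that a bare chain cannot. Python's standard library has no Ed25519, so HMAC-SHA256 is swapped in as a stand-in. That is a real difference: with HMAC the verifier must hold the secret, whereas Ed25519 lets anyone verify with a public key. The key value and function names are hypothetical.

```python
import hashlib
import hmac

ZERO = b"\x00" * 32  # assumed starting value for the chain


def chain_head(payloads: list[bytes]) -> bytes:
    """Fold the payloads into a single head hash, each step committing to the last."""
    head = ZERO
    for payload in payloads:
        head = hashlib.sha256(head + payload).digest()
    return head


def sign_head(anchor_key: bytes, head: bytes) -> bytes:
    """Stand-in for an Ed25519 signature over the head (HMAC-SHA256 here)."""
    return hmac.new(anchor_key, head, hashlib.sha256).digest()


def verify_head(anchor_key: bytes, head: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the recomputed tag with the stored anchor."""
    return hmac.compare_digest(sign_head(anchor_key, head), tag)


anchor_key = b"held-outside-the-store-process"  # hypothetical key material
honest = [b"fragment: alpha", b"emission: alpha"]
anchor = sign_head(anchor_key, chain_head(honest))

# A chain-aware adversary replaces the payloads and recomputes every hash.
# The rewritten chain is internally consistent, but its head is different.
rewritten = [b"fragment: ALTERED", b"emission: ALTERED"]

honest_ok = verify_head(anchor_key, chain_head(honest), anchor)
rewrite_ok = verify_head(anchor_key, chain_head(rewritten), anchor)
```

The adversary can re-hash everything inside the store but cannot produce a valid tag for the new head without the key, which is why the key has to live outside the store process.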
Mechanism 04 · Ledger + witness
Append-only, witnessed.

Heads are published to an append-only, hash-chained ledger — so a rollback replaying a stale anchor is rejected (freshness) and forked histories are caught (non-equivocation). An independent witness of the ledger tail catches a store that withholds entries.

Falsifiable — an accepted rollback or withheld entry breaks it
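The three checks named above (freshness, non-equivocation, withholding) can be sketched over a toy ledger. Signatures on each head are omitted to keep the example short, and the entry layout, function names and return strings are assumptions rather than Hyphae's ledger format.

```python
import hashlib


def ledger_append(ledger: list[dict], head_hex: str) -> None:
    """Publish a journal head as the next entry of a hash-chained ledger."""
    prev = ledger[-1]["entry_hash"] if ledger else "0" * 64
    seq = len(ledger)
    digest = hashlib.sha256(f"{seq}:{prev}:{head_hex}".encode("utf-8")).hexdigest()
    ledger.append({"seq": seq, "prev": prev, "head": head_hex, "entry_hash": digest})


def is_fresh(ledger: list[dict], presented_head: str) -> bool:
    """Freshness: only the latest published head is accepted."""
    return bool(ledger) and ledger[-1]["head"] == presented_head


def equivocates(view_a: list[dict], view_b: list[dict]) -> bool:
    """Non-equivocation: two views must agree on every sequence number they share."""
    return any(a["entry_hash"] != b["entry_hash"] for a, b in zip(view_a, view_b))


def witness_check(presented: list[dict], witnessed_tail: tuple[int, str]) -> str:
    """Compare a presented ledger with a tail the witness recorded independently."""
    seq, entry_hash = witnessed_tail
    if seq >= len(presented):
        return "withheld"
    if presented[seq]["entry_hash"] != entry_hash:
        return "fork"
    return "ok"


ledger: list[dict] = []
for head in ("aa", "bb", "cc"):
    ledger_append(ledger, head)
witnessed = (2, ledger[2]["entry_hash"])

stale_accepted = is_fresh(ledger, "bb")             # rollback replaying an old head
withheld = witness_check(ledger[:2], witnessed)     # store hides the last entry

forked = list(ledger[:2])
ledger_append(forked, "dd")                         # a second history for seq 2
fork_seen = equivocates(ledger, forked)
fork_at_witness = witness_check(forked, witnessed)
```

The witness matters because a store that simply stops showing recent entries still presents a valid prefix. Only a party holding an independent record of the tail can tell that something is missing.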
Mechanism 05 · Keyring
Signed key rotation.

A keyring rotates the anchor key — each successor authorized by its predecessor from a root trusted out-of-band. A ledger spanning rotations verifies under the per-epoch key, and a retired key can sign no new history, so key compromise is recoverable, not fatal.

Falsifiable — a forged successor or retired-key forgery breaks it
03 — Architecture (the substrate)

The retrieval & runtime substrate.

Hyphae runs on a coordinated set of subsystems modeled on mammalian brain regions, communicating through typed pathways. They handle retrieval, state and runtime coordination — fetching candidate fragments, tracking conversation, managing the journal.

They are not what makes a quoted answer correct — verbatim emission is. The echo control proves it: a few lines that print the retrieved fragment back score the same. Read this section as the substrate that retrieves and persists, not as the source of the answer's quality.

S01 · Thalamus
Input gating

First filter on incoming signal. Decides what reaches the substrate.

S02 · Hippocampus
Episodic memory

Stores fragments and runs pattern completion at initial retrieval.

S03 · Entorhinal Cortex
Binding & conductivity

Binds fragments and computes the conductivity weights on the retrieval network.

S04 · Amygdala
Valence

Assigns affective valence to fragments at write time and on recall.

S05 · Locus Coeruleus
Precision modulation

Modulates precision and gates consolidation alongside the BNST.

S06 · Frontal Cortex
Working set

Bounded working memory (7 fragments). Controlled inhibitory filtering of retrieval results.

S07 · ACC
Conflict & metacognition

Detects conflict, tracks threads, open questions and conversation chronology.

S08 · Cerebellum
Cascade activation

Predictive coding and spreading activation through the retrieval network.
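As a rough illustration of spreading activation over a weighted retrieval graph, the sketch below propagates energy from seed fragments along weighted edges, attenuating it at each hop. The decay factor, threshold, hop limit and the choice to keep the strongest incoming activation (rather than summing) are all assumptions; the source does not specify Hyphae's propagation rule.

```python
def spread(
    graph: dict[str, dict[str, float]],
    seeds: dict[str, float],
    decay: float = 0.5,
    threshold: float = 0.1,
    max_hops: int = 3,
) -> dict[str, float]:
    """Propagate activation from seeds; each hop multiplies by edge weight and decay."""
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_hops):
        next_frontier: dict[str, float] = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed < threshold:
                    continue  # too weak to activate anything further
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier[neighbor] = passed
        if not next_frontier:
            break
        frontier = next_frontier
    return activation


# Hypothetical conductivity graph: node -> {neighbor: weight}.
graph = {"a": {"b": 1.0, "c": 0.4}, "b": {"d": 0.8}}
activated = spread(graph, {"a": 1.0})
```

With these inputs the seed `a` activates `b` at 0.5 and `c` at 0.2, and `b` in turn activates `d` at 0.2. Whatever rule is used, it only decides which fragments become candidates; the quoted answer is still emitted verbatim.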

S09 · Mammillary
Temporal indexing

Indexes fragments and threads on the temporal axis.

S10 · BNST
Consolidation gating

Decides which fragments graduate from working memory to long-term storage.

S11 · Olfactory Bulb
Salience gating

Salience model for input streams. Future foundation for multimodal extension.

S12 · Dopaminergic Midbrain
Curiosity firing

Decides when to fire a curiosity operation against external grounding.

S13 · Insula
Interoception

Tracks substrate-internal signals — load, latency, integrity, drift.

S14 · Striatum
Action selection

Selects among candidate emissions and tool invocations under reward.

S15 · Lateral PFC
Schema selection

Picks the emission schema (DialogueReply, GroundedAssertion, etc.) for the intent at hand.

S16 · Realizer
Verbatim emission

Byte-identical fragment quotation + minimal connective tissue. Slot binding with depth-two backtracking.

S17 · Journal
Audit chain

SHA-256 hash chain over every significant event, with external Ed25519 head anchor. Verified during Recovery state.

04 — What makes it verifiable

Properties, not optimizations.

The properties below are what make a Hyphae answer independently auditable. They are claims about verifiability, not about answer quality — the echo control already settled the quality question.

P01
Byte-level bindability.

Every answer span is a byte-identical copy of a stored fragment, so the output binds to its source exactly. This is the property paraphrastic LLM generation destroys — once wording drifts, no span can be matched to a source byte-for-byte.

P02
Independent auditability.

Each emitted span carries provenance back to a named, unaltered source fragment. A third party can verify the binding without trusting the system — the audit does not depend on Hyphae attesting to its own honesty.

P03
Tamper-evidence (store-only).

The SHA-256 hash chain over the journal detects and localizes store-only tampering across the benchmark's ten modes — edit, delete, insert, reorder, bit-flip, truncate, duplicate, timestamp-skew, rollback, batch. The chain is classical (Haber–Stornetta, Merkle, Certificate Transparency); the contribution is applying it to grounded retrieval and measuring it.

P04
Tamper-evidence (chain-aware).

A bare chain falls to an adversary who rewrites history and re-hashes the whole chain. An external Ed25519 signature over the head — key held outside the store process — closes that gap and catches the rewrite.

P05
Defense in depth, witnessed.

Above the bare chain: an append-only Ed25519 ledger of signed heads gives freshness (a rolled-back head with a stale anchor is rejected) and non-equivocation; an independent witness of the ledger tail catches a store that withholds entries; and a signed keyring rotates the anchor key so compromise is recoverable, not fatal. Each layer is measured.

P06
No LLM in the path · general hardware.

The cognition path is deterministic over structured inputs — no model inference between query and answer. Rust, CPU-only, single binary; fits on commodity hardware. LLMs appear only as external comparators in the evaluation, never in the emission path.

05 — Current state

Published, and measured end to end.

The preprint is public and citable (Zenodo DOI). The provenance stack is built and verified in the open: a hash-chained journal, an external Ed25519 head anchor, an append-only ledger (freshness + non-equivocation), an external witness (against entry withholding), and a signed keyring (key rotation). A community-scale provenance benchmark measures the whole stack, including a defense-escalation experiment where each attack is caught by exactly the next layer up. All in the public repository, CI-gated.

05 — What the evaluation found

The echo control, in the table.

The evaluation spans 255 queries, twelve metrics and 18 LLM-based configurations — six models × three retrieval modes — across two corpora, plus a dedicated tamper-detection benchmark. The headline finding is below, and it is deliberately not a win: on correctness and grounding, a trivial echo baseline matches Hyphae. The metric that actually separates systems is provenance.

Scope · queries · 255
Two corpora

255 queries evaluated across two distinct corpora, twelve metrics each, with explicit failure thresholds per metric.

Scope · LLM configs · 18
6 models × 3 modes

Eighteen LLM-based comparator configurations: six models, three retrieval modes each. LLMs appear only as comparators — never in Hyphae's emission path.

Scope · controls · echo
Echo & echo + journal

Two trivial baselines that print the retrieved fragment back. They tie Hyphae on correctness and grounding — that is the contribution, stated honestly.

Scope · tampering · 10 × 3
Ten modes × three adversaries

The provenance benchmark (provbench v2) crosses ten tampering modes with three adversary profiles, plus a defense-escalation experiment: each attack — in-place edit, chain-aware rewrite, rollback-with-stale-anchor, withholding — is caught by exactly the next layer up (chain → anchor → ledger → witness).

Metric family            Hyphae   Echo   Echo + journal
Correctness              tie      tie    tie
Grounding                tie      tie    tie
Byte-level bindability   yes      yes    yes
Tamper-evidence          full     none   store-only

Correctness and grounding are properties of verbatim quotation — every quoting system ties. Provenance (the chain + the external anchor) is what Hyphae adds on top.

06 — Open questions

What the preprint does not settle.

The honest position includes its own limitations. These are the genuine open questions carried in the paper's future-work section — stated plainly rather than buried.

OPEN · 01
Generalization beyond extractive tasks.

Verbatim quotation fits questions whose answer is a contiguous span in some source; it cannot synthesize across sources. We have begun mapping this boundary with a multi-hop harness: a single-span system silently fails on multi-hop by default, and graceful degradation (abstaining) is achievable but requires an explicit abstention signal the realizer must implement. The live multi-hop column against an LLM comparator is the next measurement.

OPEN · 02
Source-ingestion boundary.

From the journal onward the threat model is now closed: the chain catches store-only edits, the anchor the chain-aware rewrite, the ledger adds freshness and non-equivocation, a witness catches withholding, and a signed keyring makes key compromise recoverable — measured across ten modes and three adversaries. What remains genuinely open is the ingestion boundary: attesting that fragments entered the journal faithfully in the first place. Provenance from the journal forward is closed; provenance into it is not.

OPEN · 03
arXiv endorsement pending.

The work is public now on Zenodo with a citable DOI. An arXiv version is forthcoming, pending category endorsement — we will link it here when it lands. Until then, the Zenodo record and the public repository are the authoritative artifacts.

07 — Deployment

One binary, three runtimes.

Hyphae compiles to a single Rust binary. The deployment configuration is what differentiates a portable offline-only runtime from a learning runtime that absorbs new fragments through curiosity — and from an edge variant tuned for constrained hardware.

Configuration · 01
Portable binary.

Operates entirely on the fragment store loaded at startup. Curiosity disabled, all budgets capped at zero, no Vertex credentials, no network access.

  • Use cases: offline operation, sensitive contexts, compliance-restricted environments
  • Network: disabled
  • Curiosity: off · learning off
  • RAM: 1–2 GB
Configuration · 02
Learning binary.

Same code, different runtime configuration. Curiosity continuously active, Vertex grounding online, learning loop refining weights from feedback, web agency enabled.

  • Use cases: project lead's personal cognitive infrastructure · VPS dogfooding · multi-user
  • Grounding: Vertex AI · gemini-3.5-flash + Playwright browser
  • Learning: on · auditable weight updates
  • RAM: 4–16 GB
Configuration · 03
Edge / IoT variant.

Minimal substrate for embedded devices and companion modes. Reduced subsystems, pre-trained fragment store, read-only runtime — the smallest shape of Hyphae that still preserves the substrate's contract.

  • Use cases: embedded devices · companion mode · constrained hardware
  • Mode: read-only · pre-trained store
  • Curiosity: off · learning off
  • RAM: 256 MB – 1 GB
08 — Validation & failure criteria

Twelve metrics. Explicit thresholds for failure.

The evaluation specifies twelve typed metrics with explicit floors — the points at which a provenance claim is empirically falsified. The provenance benchmark (provbench v2) runs ten tampering modes against three adversaries, plus a defense-escalation experiment across the full stack — chain → anchor → ledger → witness.

Failure is information, not termination. If a threshold trips, it is reported honestly — the echo control is itself an example of a result that contradicted the original framing and was published rather than hidden.

verbatim fidelity
any non-byte-identical span → emission falsified.

If an emitted span is not a byte-identical copy of its source fragment, the core guarantee — bindability to source — is broken.

provenance coverage
below 1.0 → an unattributed span shipped.

Every emitted span must resolve to a named source fragment. A single span without provenance is a coverage failure.
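One way to read the coverage metric is as the fraction of emitted spans whose cited source exists in the store. The sketch below assumes that definition, a simple dict-per-span layout, and that an empty answer counts as fully covered; all three are assumptions, not the paper's metric definition.

```python
from typing import Optional


def provenance_coverage(spans: list[dict], store: dict[str, bytes]) -> float:
    """Fraction of spans whose `source` names a fragment that exists in the store."""
    if not spans:
        return 1.0  # assumption: nothing emitted means nothing unattributed
    resolved = sum(1 for span in spans if span.get("source") in store)
    return resolved / len(spans)


def first_unattributed(spans: list[dict], store: dict[str, bytes]) -> Optional[int]:
    """Index of the first span that fails to resolve, or None when coverage is 1.0."""
    for index, span in enumerate(spans):
        if span.get("source") not in store:
            return index
    return None


store = {"doc-a#1": b"alpha", "doc-b#4": b"beta"}
answer = [
    {"text": b"alpha", "source": "doc-a#1"},
    {"text": b"beta", "source": "doc-b#4"},
    {"text": b"gamma", "source": None},  # shipped without provenance
]
coverage = provenance_coverage(answer, store)
threshold_tripped = coverage < 1.0
```

Coverage is deliberately separate from verbatim fidelity: a span can name a real source and still fail the byte-identity check, and the two failures are reported under different metrics.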

store-only tamper detection
below 1.0 across 4 modes → chain insufficient.

Edit, delete, insert and reorder against a store-only adversary must each be detected and localized by the hash chain.

chain-aware tamper detection
a missed rewrite → external anchor insufficient.

A chain-aware adversary who re-hashes the whole journal must be caught by the external Ed25519 head anchor; a miss falsifies the anchor.

echo parity
a quality gap vs echo → over-claim risk.

If Hyphae claimed to beat echo on correctness or grounding, that would be the over-claim the paper retracted. Parity with echo is the expected, honest result.

determinism
non-reproducible emission → audit invalid.

The cognition path must be deterministic over its inputs; a non-reproducible emission undermines the auditability the whole layer depends on.
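A reproducibility check can be as simple as running the same query several times and comparing digests of the emitted bytes. The `emit` callables here are hypothetical stand-ins for an emission path, not Hyphae's API.

```python
import hashlib
import itertools
from typing import Callable


def is_reproducible(emit: Callable[[str], bytes], query: str, runs: int = 3) -> bool:
    """True when every run of `emit(query)` produces byte-identical output."""
    digests = {hashlib.sha256(emit(query)).hexdigest() for _ in range(runs)}
    return len(digests) == 1


def stable_emit(query: str) -> bytes:
    """Deterministic stand-in: output depends only on the query."""
    return b"quoted: " + query.encode("utf-8")


_calls = itertools.count()


def drifting_emit(query: str) -> bytes:
    """Non-deterministic stand-in: hidden state leaks into the output."""
    return f"{query} #{next(_calls)}".encode("utf-8")


stable = is_reproducible(stable_emit, "what closes the gap")
drifting = is_reproducible(drifting_emit, "what closes the gap")
```

Comparing digests rather than raw output keeps the check cheap to log, and the digest itself can be journalled alongside the emission it describes.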

09 — Governance & method

Multi-model triangulation. Trunk-based. Open by license.

Architectural decisions and pattern-establishing implementations receive review from deepseek-v4-pro and gemini-3.5-flash before commitment. The triangulation has caught real defects pre-merge. Tests are required for every PR; cargo fmt, clippy with warnings treated as errors, and a workspace build and test must all pass before merge.

The honesty discipline is enforced structurally: the hash-chained journal at the data layer, provenance metadata on every fragment, and a published echo control that contradicted the project's own earlier framing — kept in the paper rather than cut.

Open source under Apache 2.0 for code and CC-BY-4.0 for docs, corpora and the preprint. Code, the LLM+RAG comparator, every result envelope, and the tamper-detection experiment are public — a provenance claim that can't be independently re-run isn't a provenance claim.

10 — Architecture flow

How a query becomes an audited answer.

The retrieval-and-emission path, end to end: input gating, fragment retrieval, working-set assembly, verbatim emission, and the journal write that makes the result auditable. No model inference sits between the query and the answer.

Runtime topology · provenance path

User: multilingual query + feedback

Perceptual Input Layer
  • in/1 · Text queries: canonical input
  • in/2 · Web navigation: Playwright agent
  • in/3 · Vision · audio: future multimodal

S1 · Thalamus
input gating · normalize · route · state-check

Initial activation · 3-way fan-out
  • S8b · ACC: thread tracking · topic switch · conversation state · open questions
  • S3 · Hippocampus: saliency-weighted retrieval · pattern completion
  • S4 · Amygdala: valence assignment · emotional context
↓ direct_fragments[]

Cascade activation · spreading through the causal graph
  • S3b · Entorhinal cortex: conductivity graph · boundary metadata · cross-fragment binding
↓ adjacency + weights
  • S9 · Cerebellum: cascade propagation (Collins-Loftus + spreadr) · predictive coding · multi-hop with decay
↓ activated_set[]

S8b + S4 · ACC + Amygdala · coherence evaluation
conflict detection (lexical + NLI) · valence-modulated filtering
↓ working_set[] + conflicts[]

S8 · Frontal Cortex · Executive Composition Orchestrator
working memory 7±2 cap · intent classification · schema selection · composite scoring (activation + relevance + saliency + recency) · conversation-state-aware composition

Hyphae-Surface · realization, no LLM in path
  • reg · Schema registry: DialogueReply · GroundedAssertion · Introspective · NarrativeArc · Comparative · SyntheticSummary
  • trig · Honest-limitation triggers: EmptyWorkingSet · HighConfabRisk · ShallowCascade · Contradictions · SourceLanguageMismatch
  • bind · Slot binding · fragment quotation · connective tissue: greedy + backtrack · verbatim from source · cross-lingual framing (lexicon · 9+ rules)
  • out · ComposedResponse: body · prelude · quoted_fragments[] · language_framings[] · realization_trace[] · provenance_chain[]

User: response + feedback signal
feedback → learning loop · see background

Synchronous path · single request · no LLM in cognition

Background · 01
Learning loop (RFC v0.4)

Feedback signal — explicit and implicit — refines the substrate's parameters, not its structure. Every weight update is journalled with rollback capability.

  • Conductivity weights — which fragments connect more strongly
  • Saliency scores — which fragments rise / fall by use
  • Schema selection thresholds
  • Cascade decay & threshold per context
  • Honest-limitation triggers per domain
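The idea of journalled weight updates with rollback can be sketched as a small hash-chained log. The class name, record fields and the choice to roll back by appending a compensating record (rather than deleting history) are assumptions for illustration.

```python
import hashlib
import json


class WeightLog:
    """Toy parameter store: every change is appended to a hash-chained log."""

    def __init__(self, weights: dict[str, float]) -> None:
        self.weights = dict(weights)
        self.entries: list[dict] = []
        self.head = "0" * 64  # assumed genesis value

    def update(self, name: str, new_value: float) -> None:
        """Change an existing weight; unknown names raise KeyError (structure is fixed)."""
        record = {"name": name, "old": self.weights[name], "new": new_value, "prev": self.head}
        encoded = json.dumps(record, sort_keys=True).encode("utf-8")
        self.head = hashlib.sha256(encoded).hexdigest()
        self.entries.append(record)
        self.weights[name] = new_value

    def rollback(self) -> None:
        """Undo the latest update by journalling a compensating update."""
        last = self.entries[-1]
        self.update(last["name"], last["old"])


log = WeightLog({"a->b": 0.5})
log.update("a->b", 0.7)
after_update = log.weights["a->b"]
log.rollback()
after_rollback = log.weights["a->b"]
```

Rolling back by appending keeps the log append-only, so the audit trail shows both the update and its reversal. Rejecting unknown weight names is one way to enforce that learning adjusts parameters without adding structure.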
Background · 02
Curiosity (depth-aware)

ACC detects a causal gap → the Dopaminergic Midbrain evaluates depth + relevance + recency → curiosity fires through one of three channels.

  • Web grounding (Vertex AI) — snippets
  • Active browsing (Playwright) — full pages
  • API integrations — structured data
  • Olfactory Bulb gates absorption (quality + dup filter)
  • New fragments encoded via Hippocampus
Background · 03
Consolidation (SHY + LC/BNST)

Triggered when LC arousal and BNST valence are both low — Hippocampus replays, episodic → semantic abstraction, SHY proportional decay, conductivity graph compaction.

  • Sharp-wave-ripple replay
  • Pattern completion / separation
  • Old-trace interleaved (anti-catastrophic-forgetting)
  • Tononi SHY depth-weighted pruning
  • Mammillary temporal index refresh
~/.hyphae/
├── journal/     fjall LSM-tree + SHA-256 hash chain (immutable history)
├── state/       redb (state machine + counters)
├── fragments/   postcard binary (sharded, conductivity-indexed)
├── lexicon/     postcard binary (multilingual, in-memory cached)
├── learning/    postcard binary (weight updates with rollback chain)
├── decisions/   ADRs in markdown (architectural decision records)
└── exports/     JSON on-demand (debugging, migration, audit)
Prop 01
No context window

Persistent memory accessed by retrieval, not by inclusion in a prompt.

Prop 02
No LLM in the cognition path

Composition is deterministic over structured inputs; LLMs are consulted only for external grounding.

Prop 03
Persistent conversational metacognition

Threads, open questions and pending follow-ups tracked as first-class system state.

Prop 04
Explicit causal grounding

Every fragment carries provenance with an explicit confabulation_risk.

Prop 05
Cryptographic audit

Journal hash chain: tampering is detectable, and the Recovery state verifies integrity.

Prop 06
General-purpose hardware

CPU + RAM. No GPU dependence at any tier — including edge / IoT.

Prop 07
Verbatim source quotation

Fragments preserve original content. The realizer only generates connective tissue.

Prop 08
Auditable learning

Every weight update is journalled with a rollback chain. Structure is never mutated.

Prop 09
Honest limitation

Explicit acknowledgment when working material is insufficient — typed triggers, not a generic apology.

11 — Read it. Run it. Cite it.

Rust substrate, the LLM+RAG comparator, every result envelope, the tamper-detection experiment, and the full preprint — all public, dual-licensed Apache-2.0 / CC-BY-4.0.

Gutiérrez, M. (2026). Hash-Chained Verbatim Quotation: A Verifiable Provenance Layer for Grounded Retrieval. Zenodo. https://doi.org/10.5281/zenodo.20436643

@misc{gutierrez2026hyphae,
  author       = {Guti{\'e}rrez, Mario},
  title        = {{Hash-Chained Verbatim Quotation: A Verifiable
                  Provenance Layer for Grounded Retrieval}},
  year         = {2026},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.20436643},
  url          = {https://doi.org/10.5281/zenodo.20436643},
  note         = {Hyphae v2. Code, corpora, result envelopes, and
                  the preprint. \url{https://github.com/terrizoaguimor/hyphae-v2}}
}
Apache-2.0 (code) · CC-BY-4.0 (docs) · arXiv version forthcoming