Generated text in a document is only useful if the reader can verify where it came from. Source-grounded citations bind a span of text in a DOCX to the record that justified it (a case, a statute, a precedent, an internal source), in a way that survives editing, DOCX export, and Word round-trips. SuperDoc implements this with metadata anchors: hidden content controls in the body that point to a JSON payload in a custom XML part, all inside the same DOCX file. For the AI / RAG worked example, see the demos/custom-ui reference workspace. For the smallest copy-paste form of the API, see the metadata-anchors example.
Storage contract: what the file contains
A metadata anchor is two halves of a DOCX, kept in sync by a stable id.
1. The anchor. A hidden inline content control wraps the cited text in the body. The w:tag carries the stable id; w15:appearance="hidden" keeps Word from drawing chrome around the span.
2. The payload. A custom XML part in the same file holds the JSON. The reader finds the entry whose id matches the w:tag, <ref id="..." encoding="json">, and JSON.parses it.
w:tag is a concrete pointer, not a fuzzy text search. The id is bound to the SDT element; if the SDT survives an edit (which it does for inline edits inside the anchored range), the link survives. If a user deletes the anchored text in the editor, the SDT goes with it.
The namespace on the <refs> element is the consumer’s payload-schema URI. Pick one you own. One namespace per payload schema version makes future migrations easier: urn:your-app:citations:v2 can coexist with v1 entries during a rollout.
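The two halves can be sketched as plain string builders. This is a minimal illustration, not SuperDoc's serializer: the helper names (buildAnchorXml, buildRefsPartXml, readPayload), the cit-001 id format, and the exact element layout are assumptions drawn from the description above, and the regex lookup stands in for a real XML parser.

```typescript
// Hypothetical helpers; XML shapes are assumed from the storage contract text.
type CitationPayload = { citationId: string; sourceId: string; displayText: string };

function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function unescapeXml(s: string): string {
  // &amp; is decoded last so that "&amp;lt;" becomes "&lt;", not "<".
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&");
}

// Half 1: a hidden inline content control; w:tag carries the stable id.
function buildAnchorXml(id: string, citedText: string): string {
  return (
    `<w:sdt><w:sdtPr>` +
    `<w:tag w:val="${escapeXml(id)}"/>` +
    `<w15:appearance w15:val="hidden"/>` +
    `</w:sdtPr><w:sdtContent>` +
    `<w:r><w:t xml:space="preserve">${escapeXml(citedText)}</w:t></w:r>` +
    `</w:sdtContent></w:sdt>`
  );
}

// Half 2: the custom XML part; one <ref> per anchor, JSON as escaped text.
function buildRefsPartXml(
  namespace: string,
  entries: { id: string; payload: CitationPayload }[]
): string {
  const refs = entries
    .map(
      (e) =>
        `<ref id="${escapeXml(e.id)}" encoding="json">` +
        `${escapeXml(JSON.stringify(e.payload))}</ref>`
    )
    .join("");
  return `<?xml version="1.0" encoding="UTF-8"?><refs xmlns="${escapeXml(namespace)}">${refs}</refs>`;
}

// Reader side: find the <ref> for an id and JSON.parse its text.
// Illustration only; production code should use an XML parser.
function readPayload(partXml: string, id: string): CitationPayload | null {
  const escapedId = escapeXml(id).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const match = new RegExp(
    `<ref id="${escapedId}" encoding="json">([^<]*)</ref>`
  ).exec(partXml);
  const raw = match?.[1];
  return raw === undefined ? null : (JSON.parse(unescapeXml(raw)) as CitationPayload);
}

const id = "cit-001"; // hypothetical id format
const payload: CitationPayload = {
  citationId: id,
  sourceId: "src-42",
  displayText: "Smith & Jones",
};
const anchorXml = buildAnchorXml(id, "the cited sentence");
const partXml = buildRefsPartXml("urn:your-app:citations:v1", [{ id, payload }]);
const roundTripped = readPayload(partXml, id);
```

Because the id is the only link between the two strings, any generator that writes one half must write the other with the same value.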
Persisted payload vs render-time signals
The DOCX is the source of truth for what was cited. Your provider is the source of truth for whether the citation still stands. Split the fields accordingly.
| Field | Persisted in DOCX | Render-time lookup |
|---|---|---|
| citationId | Yes | |
| sourceId | Yes | |
| sourceType ('statute', 'case', 'precedent', …) | Yes | |
| provider (which provider authoritatively backs this source) | Yes | |
| displayText (human-readable source name) | Yes | |
| locator (section, pin cite) | Yes | |
| excerpt (quote from the source) | Yes | |
| deepLink | Yes | |
| confidence | Yes (optional) | |
| createdAt (ISO-8601 string) | Yes (optional) | |
| Verification status (KeyCite-style signals, Shepard’s-style signals) | | Yes |
| Source freshness / last-validated-at | | Yes |
| Author profile, jurisdiction notes | | Yes |
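One way to keep the split honest is to model the two groups as separate types, so render-time signals cannot leak into what gets written to the file. The sketch below is an assumption-heavy illustration: the persisted field names come from the table, but their types (for example confidence as a number), the RenderTimeSignals field names, and the SignalLookup function are hypothetical.

```typescript
// Persisted fields follow the table; the value types are assumptions.
interface PersistedCitation {
  citationId: string;
  sourceId: string;
  sourceType: string; // 'statute', 'case', 'precedent', ...
  provider: string;
  displayText: string;
  locator: string;
  excerpt: string;
  deepLink: string;
  confidence?: number;
  createdAt?: string; // ISO-8601
}

// Hypothetical names for the render-time rows of the table.
interface RenderTimeSignals {
  verificationStatus: string;
  lastValidatedAt: string;
  jurisdictionNotes?: string;
}

type CitationView = PersistedCitation & Partial<RenderTimeSignals>;

// Hypothetical provider lookup, called at render time only.
type SignalLookup = (provider: string, sourceId: string) => RenderTimeSignals | undefined;

// Merge persisted data with whatever the provider says right now.
function toView(persisted: PersistedCitation, lookup: SignalLookup): CitationView {
  return { ...persisted, ...lookup(persisted.provider, persisted.sourceId) };
}

// Copy only the persisted fields, so signals never reach the DOCX.
function toPersisted(view: CitationView): PersistedCitation {
  const out: PersistedCitation = {
    citationId: view.citationId,
    sourceId: view.sourceId,
    sourceType: view.sourceType,
    provider: view.provider,
    displayText: view.displayText,
    locator: view.locator,
    excerpt: view.excerpt,
    deepLink: view.deepLink,
  };
  if (view.confidence !== undefined) out.confidence = view.confidence;
  if (view.createdAt !== undefined) out.createdAt = view.createdAt;
  return out;
}

const stored: PersistedCitation = {
  citationId: "cit-001",
  sourceId: "src-42",
  sourceType: "case",
  provider: "example-provider",
  displayText: "Example v. Example",
  locator: "at 12",
  excerpt: "quoted text",
  deepLink: "https://example.com/src-42",
};
const exampleLookup: SignalLookup = () => ({
  verificationStatus: "good-law",
  lastValidatedAt: "2024-01-01T00:00:00Z",
});
const view = toView(stored, exampleLookup);
const writable = toPersisted(view);
```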
Supported creation path: editor.doc.metadata.*
The ergonomic path is the Document API surface. The same operation IDs are available on the browser editor, the Node SDK, and the CLI. For the smallest copy-paste form, see examples/document-api/metadata-anchors.
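To show how the operations relate without guessing at the real signatures, here is a sketch against an in-memory stand-in. Only the operation names (attach, list, get, update, remove, resolve) come from this page; the MetadataApi interface, its parameter shapes, the TextRange type, and the InMemoryMetadata class are all hypothetical, and the list filter covers only namespace. Check the anchored metadata reference for the actual inputs and outputs.

```typescript
// Hypothetical shapes; not the real editor.doc.metadata.* signatures.
interface TextRange { from: number; to: number }
type Payload = Record<string, unknown>;

interface MetadataApi {
  attach(input: { id: string; namespace: string; range: TextRange; payload: Payload }): void;
  list(filter?: { namespace?: string }): string[];
  get(id: string): Payload | null;
  update(id: string, payload: Payload): void;
  remove(id: string): void;
  resolve(id: string): TextRange | null;
}

// In-memory stand-in that keeps payloads and anchors in separate maps,
// mirroring the two halves stored in the DOCX.
class InMemoryMetadata implements MetadataApi {
  private payloads = new Map<string, { namespace: string; payload: Payload }>();
  private anchors = new Map<string, TextRange>();

  attach(input: { id: string; namespace: string; range: TextRange; payload: Payload }): void {
    this.payloads.set(input.id, { namespace: input.namespace, payload: input.payload });
    this.anchors.set(input.id, input.range);
  }

  list(filter?: { namespace?: string }): string[] {
    const ids: string[] = [];
    this.payloads.forEach((entry, id) => {
      if (!filter?.namespace || entry.namespace === filter.namespace) ids.push(id);
    });
    return ids;
  }

  get(id: string): Payload | null {
    return this.payloads.get(id)?.payload ?? null;
  }

  update(id: string, payload: Payload): void {
    const entry = this.payloads.get(id);
    if (!entry) throw new Error(`unknown metadata id: ${id}`);
    entry.payload = payload;
  }

  remove(id: string): void {
    // Removes both halves: payload and anchor.
    this.payloads.delete(id);
    this.anchors.delete(id);
  }

  resolve(id: string): TextRange | null {
    return this.anchors.get(id) ?? null;
  }
}

const metadata: MetadataApi = new InMemoryMetadata();
metadata.attach({
  id: "cit-001",
  namespace: "urn:your-app:citations:v1",
  range: { from: 10, to: 42 },
  payload: { sourceId: "src-42", displayText: "Example source" },
});
metadata.update("cit-001", { sourceId: "src-42", displayText: "Example source", locator: "s. 3" });
const listed = metadata.list({ namespace: "urn:your-app:citations:v1" });
const current = metadata.get("cit-001");
const where = metadata.resolve("cit-001");
metadata.remove("cit-001");
const afterRemove = metadata.resolve("cit-001");
```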
For a composed runtime that inserts a draft, attaches per-citation payloads, paints highlights, and drives a sources panel, see demos/custom-ui.
Pure file-shape generation (storage contract, no SDK)
If you are generating DOCX entirely offline (no SuperDoc runtime, no Node SDK in the pipeline), produce the file shape from the Storage contract section directly. Both halves (SDT in the body, <ref> in the custom XML part) must agree on the id for the link to resolve when the file opens in SuperDoc.
This is the path RAG pipelines take when refs are written at generation time, before the file ever opens in SuperDoc. It works, and the DOCX it produces is interchangeable with one written by editor.doc.metadata.attach. A first-class offline / pure-server SDK with the same surface is separate work; until that lands, the file-shape path is the offline answer.
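Since the id agreement is the one invariant an offline generator has to uphold, a pre-flight check before zipping the DOCX can catch mismatches early. The sketch below is a hypothetical helper, not part of SuperDoc: it scans the two XML strings with regexes (a real pipeline should use an XML parser) and assumes every w:tag in the body belongs to a metadata anchor, which will not hold if the document contains other content controls.

```typescript
// Collect every w:tag value from the document body XML.
function extractTagIds(documentXml: string): string[] {
  const ids: string[] = [];
  const re = /<w:tag\s+w:val="([^"]*)"\s*\/>/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(documentXml)) !== null) {
    if (m[1] !== undefined) ids.push(m[1]);
  }
  return ids;
}

// Collect every <ref id="..."> from the custom XML part.
function extractRefIds(customXml: string): string[] {
  const ids: string[] = [];
  const re = /<ref\s+[^>]*\bid="([^"]*)"/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(customXml)) !== null) {
    if (m[1] !== undefined) ids.push(m[1]);
  }
  return ids;
}

interface HalvesReport {
  anchorsWithoutPayload: string[];
  payloadsWithoutAnchor: string[];
}

// Compare the two id sets and report anything present on one side only.
function checkHalves(documentXml: string, customXml: string): HalvesReport {
  const tagIds = extractTagIds(documentXml);
  const refIds = extractRefIds(customXml);
  const tagSet = new Set(tagIds);
  const refSet = new Set(refIds);
  return {
    anchorsWithoutPayload: tagIds.filter((id) => !refSet.has(id)),
    payloadsWithoutAnchor: refIds.filter((id) => !tagSet.has(id)),
  };
}

// Sample input: cit-002 has no payload, cit-003 has no anchor.
const sampleDocumentXml =
  '<w:sdt><w:sdtPr><w:tag w:val="cit-001"/></w:sdtPr></w:sdt>' +
  '<w:sdt><w:sdtPr><w:tag w:val="cit-002"/></w:sdtPr></w:sdt>';
const sampleCustomXml =
  '<refs xmlns="urn:your-app:citations:v1">' +
  '<ref id="cit-001" encoding="json">{}</ref>' +
  '<ref id="cit-003" encoding="json">{}</ref>' +
  "</refs>";
const report = checkHalves(sampleDocumentXml, sampleCustomXml);
```

A generator could refuse to emit the file when either list in the report is non-empty.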
Custom UI binding: ui.metadata.*
UI surfaces stay keyed on the metadata id, never on internal node ids. The ui.metadata.* handle hides the SDT-node-id lookup.
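A sources panel built this way only ever passes metadata ids around. The sketch below illustrates that pattern with a stand-in: getRect and scrollIntoView are named on this page, but the UiMetadata interface, the Rect shape, the return types, and the helper functions are assumptions made for the example.

```typescript
// Hypothetical shapes; the real ui.metadata.* signatures may differ.
interface Rect { left: number; top: number; width: number; height: number }

interface UiMetadata {
  getRect(id: string): Rect | null;
  scrollIntoView(id: string): void;
}

// Turn the painter rect for a metadata id into overlay CSS values.
// Returns null when the anchor has no rect (for example, it was deleted).
function highlightStyle(ui: UiMetadata, id: string): Record<string, string> | null {
  const rect = ui.getRect(id);
  if (rect === null) return null;
  return {
    position: "absolute",
    left: `${rect.left}px`,
    top: `${rect.top}px`,
    width: `${rect.width}px`,
    height: `${rect.height}px`,
  };
}

// Sources-panel click handler: scroll only if the anchor still has a rect.
function focusCitation(ui: UiMetadata, id: string): boolean {
  if (ui.getRect(id) === null) return false;
  ui.scrollIntoView(id);
  return true;
}

// Stand-in implementation for trying the helpers outside a browser.
const scrolled: string[] = [];
const fakeUi: UiMetadata = {
  getRect: (id) => (id === "cit-001" ? { left: 8, top: 120, width: 200, height: 18 } : null),
  scrollIntoView: (id) => {
    scrolled.push(id);
  },
};
const overlay = highlightStyle(fakeUi, "cit-001");
const focused = focusCitation(fakeUi, "cit-001");
const focusedMissing = focusCitation(fakeUi, "cit-999");
```

Nothing in the panel code needs to know which SDT node backs an id, which is the point of keying on the metadata id.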
Round-trip and lifecycle
| Scenario | Behavior |
|---|---|
| SuperDoc DOCX export and reopen | Anchors and payloads preserved. |
| Open in Word, save, reopen in SuperDoc | Anchors and payloads preserved. Validated by the deterministic fixtures at tests/doc-api-stories/tests/word-roundtrip/. |
| Edit inside an anchor | SDT expands to wrap inserted text. Payload unchanged. |
| Edit crossing an anchor boundary | Follows Word’s content control semantics: the anchor can split or absorb depending on the edit. |
| User deletes the anchored text | Payload survives in custom XML; metadata.resolve returns null. See the lifecycle note below. |
| Word Document Inspector with “Custom XML Data” selected for removal | Strips payloads. Intentional Word behavior; out of band for SuperDoc. |
metadata.list and metadata.resolve can disagree when an anchor has been deleted but the payload survives in custom XML. Lifecycle cleanup behavior is still evolving; in v1, apps should treat metadata.resolve === null as “anchor gone, hide from UI” and not trust list output as the canonical source of truth.
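The guidance in this note amounts to filtering list output through resolve before rendering. A minimal sketch, assuming a hypothetical MetadataReader interface with synchronous list and resolve methods (the real operations may take filters or return richer objects):

```typescript
// Hypothetical reader shape; only the operation names come from this page.
interface AnchorRange { from: number; to: number }

interface MetadataReader {
  list(): string[];
  resolve(id: string): AnchorRange | null;
}

// Keep only ids whose anchor still resolves; orphaned payloads are hidden.
function visibleCitationIds(reader: MetadataReader): string[] {
  return reader.list().filter((id) => reader.resolve(id) !== null);
}

// Stand-in: "cit-002" has a payload left in custom XML but no anchor.
const fakeReader: MetadataReader = {
  list: () => ["cit-001", "cit-002"],
  resolve: (id) => (id === "cit-001" ? { from: 10, to: 42 } : null),
};
const visible = visibleCitationIds(fakeReader);
```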
Cross-surface: same operations everywhere
Anchored metadata is not editor-specific. The same operation IDs are available on every surface that drives SuperDoc.
| Surface | Binding |
|---|---|
| Browser editor | editor.doc.metadata.* plus ui.metadata.* |
| Node SDK | bound document handle methods |
| CLI / MCP / agent tools | wrappers generated from the same operation IDs |
Operation reference at a glance
| Concept | Operation |
|---|---|
| Attach a payload to a range | editor.doc.metadata.attach |
| List entries (namespace and within filters) | editor.doc.metadata.list |
| Get one payload by id | editor.doc.metadata.get |
| Update payload | editor.doc.metadata.update |
| Remove payload and anchor | editor.doc.metadata.remove |
| Resolve id to its current range | editor.doc.metadata.resolve |
| Painter rect for highlight overlays (browser) | ui.metadata.getRect |
| Scroll viewport to anchored span (browser) | ui.metadata.scrollIntoView |
Next steps
Metadata anchors example
Smallest copy-paste form: attach, list, get, resolve, remove, with one button each.
Custom UI reference workspace
Composed workspace that uses these primitives behind a source-grounded citation flow: highlights, hover popovers, sources panel, DOCX round-trip.
Anchored metadata reference
Every metadata.* operation with inputs, outputs, and failure codes.
Word round-trip fixtures
Deterministic Word-in-the-loop validation for citation round-trips.

