

Citation discovery pipeline.

Fonteum automatically discovers papers and datasets that cite federal healthcare records published via this platform. The pipeline polls OpenAlex — the open scholarly graph — for works that reference Fonteum source URLs, then routes candidates through a human moderation queue before surfacing them on researcher-facing pages.

How it works

  1. A nightly Inngest function queries OpenAlex for works containing Fonteum dataset URLs in their reference lists.
  2. Each candidate is written to the discovered_citations table with the reference type apex_url (direct link to Fonteum).
  3. Candidates enter the moderation queue. A reviewer approves or rejects each citation. Approved citations appear on /for/researchers and on the relevant study pages.
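The three steps above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the host fonteum.example is a placeholder, the use of OpenAlex's fulltext.search filter is an assumption about how the query might be phrased, and the row fields other than the apex_url reference type are hypothetical. The sketch only builds the request URL and shapes rows; it makes no network call.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"
FONTEUM_HOST = "fonteum.example"  # placeholder host, not the real domain


def build_query_url(host: str = FONTEUM_HOST, per_page: int = 50) -> str:
    """Step 1: build a works query that searches for the host string.

    Assumption: the real job may use a different filter or endpoint.
    """
    params = {"filter": f"fulltext.search:{host}", "per-page": per_page}
    return f"{OPENALEX_WORKS}?{urlencode(params)}"


@dataclass
class DiscoveredCitation:
    """Step 2: one row of discovered_citations (column names are hypothetical)."""
    work_id: str
    title: str
    matched_url: str
    reference_type: str = "apex_url"  # the only type the docs describe
    status: str = "pending"           # new candidates wait for a reviewer


def to_candidate(work: dict, matched_url: str) -> DiscoveredCitation:
    """Map one OpenAlex work object to a pending candidate row."""
    return DiscoveredCitation(
        work_id=work["id"],
        title=work.get("title") or "",
        matched_url=matched_url,
    )


def visible_citations(rows: list) -> list:
    """Step 3: only approved rows would reach researcher-facing pages."""
    return [row for row in rows if row.status == "approved"]
```

Keeping the row-shaping logic separate from the HTTP call makes it easy to test the mapping without touching the OpenAlex API.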

Reference types

apex_url
A direct hyperlink to a Fonteum page or API endpoint found in the paper's body or references. This is the only reference type currently issued.
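As an illustration of what detecting an apex_url might involve, the sketch below scans a block of text for hyperlinks and keeps those whose host is the platform's domain or a subdomain of it. The host fonteum.example, the function name, and the URL pattern are all assumptions made for the example; the real matcher may work differently.

```python
import re
from urllib.parse import urlparse

FONTEUM_HOST = "fonteum.example"  # placeholder host, not the real domain

# Deliberately simple URL pattern: stops at whitespace, brackets, and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")


def find_apex_urls(text: str, host: str = FONTEUM_HOST) -> list:
    """Return de-duplicated links in `text` that point at `host` or a subdomain."""
    hits = []
    for raw in URL_PATTERN.findall(text):
        url = raw.rstrip(".,;")  # drop sentence punctuation stuck to the link
        hostname = urlparse(url).hostname or ""
        on_host = hostname == host or hostname.endswith("." + host)
        if on_host and url not in hits:
            hits.append(url)
    return hits
```

Comparing parsed hostnames rather than searching for a substring avoids false matches on look-alike domains such as notfonteum.example.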

Moderation policy

Only citations from peer-reviewed journals, preprint servers (arXiv, medRxiv, bioRxiv), or established data repositories pass the moderation filter. The moderation team rejects self-citations, spam, and works that reference Fonteum only incidentally. Approved citations are immutable; rejected citations are soft-deleted and never resurfaced.
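The policy above reads like a small state machine, which could be sketched as below. The venue labels, field names, and the choice to raise an error on a repeated decision are assumptions for illustration only; spam and incidental-mention checks are left to the human reviewer and are not modeled here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical labels for the venue kinds the policy accepts.
ACCEPTED_VENUES = {"journal", "preprint", "repository"}


@dataclass
class Citation:
    work_id: str
    venue_type: str
    is_self_citation: bool = False
    status: str = "pending"
    deleted_at: Optional[datetime] = None  # set when soft-deleted


def passes_filter(citation: Citation) -> bool:
    """Venue must be an accepted kind and the work must not be a self-citation."""
    return citation.venue_type in ACCEPTED_VENUES and not citation.is_self_citation


def decide(citation: Citation, approve: bool) -> Citation:
    """Record a reviewer decision; decided citations cannot be changed."""
    if citation.status != "pending":
        # Approved rows are immutable and rejected rows never re-enter the queue.
        raise ValueError(f"citation {citation.work_id} is already {citation.status}")
    if approve and passes_filter(citation):
        citation.status = "approved"
    else:
        citation.status = "rejected"
        citation.deleted_at = datetime.now(timezone.utc)  # soft delete, row is kept
    return citation
```

Soft deletion keeps the rejected row in place, so a later polling run can recognize the same work and skip it instead of queueing it again.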

  • /researchers: citation index →
  • /docs/provenance-contract: row-level provenance →
  • /methodology: data pipeline methodology →

Built on the authoritative federal record

The primary sources, named on every page.

These are the federal agencies whose public datasets Fonteum ingests and attributes — the issuing authorities, not customers or partners. Every figure on the site links back to one of them.

  • CMS
  • HHS-OIG
  • HRSA
  • FDA
  • NLM
  • NUCC
  • Census
  • BLS
  • BEA

See the full source registry, with license and refresh cadence for each →

Reproducible by design

Every figure traces to its federal source.

14-tuple provenance

Every rendered fact ties to a source URL, dataset ID, snapshot date, row key, and SHA-256 — the full chain-of-custody record.
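A minimal sketch of how a reader might check such a record is shown below. It models only the five fields named above; the remaining fields of the tuple are not listed here and are left out. Hashing a canonical JSON encoding of the row is an assumption, since the exact bytes that are hashed are defined elsewhere.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """The five provenance fields named in the text (the full tuple has more)."""
    source_url: str
    dataset_id: str
    snapshot_date: str
    row_key: str
    sha256: str


def hash_row(row: dict) -> str:
    """SHA-256 over a canonical JSON encoding (assumed canonicalization)."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(row: dict, provenance: Provenance) -> bool:
    """True when the row still matches the digest recorded for it."""
    return hash_row(row) == provenance.sha256
```

Sorting keys before hashing means two copies of the same row produce the same digest regardless of field order.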

Reproducible SQL

Each study ships the exact query behind its figures, run against the cited federal snapshot. Re-run it yourself.

Daily reconciliation

Published counts are reconciled against the upstream federal datasets on a daily cadence, with drift logged.
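A reconciliation pass of this kind could look roughly like the sketch below, which compares per-dataset counts and logs every mismatch. The dictionary shape, the logger name, and the drift record fields are hypothetical.

```python
import logging

logger = logging.getLogger("reconciliation")  # hypothetical logger name


def reconcile(published: dict, upstream: dict) -> list:
    """Compare row counts per dataset ID and return one record per mismatch."""
    drift = []
    for dataset_id in sorted(set(published) | set(upstream)):
        ours = published.get(dataset_id, 0)    # missing dataset counts as 0 rows
        theirs = upstream.get(dataset_id, 0)
        if ours != theirs:
            record = {
                "dataset_id": dataset_id,
                "published": ours,
                "upstream": theirs,
                "delta": ours - theirs,
            }
            logger.warning("count drift: %s", record)
            drift.append(record)
    return drift
```

Taking the union of both key sets also surfaces datasets that exist on only one side, which a loop over the published counts alone would miss.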

Named medical review

Reviewed by Jennifer Montecillo, MD, non-practicing medical reviewer.

Read the full provenance and attestation methodology →



© 2026 Fonteum LLC. All rights reserved.


The substrate, by the numbers

  • 44 federal source families: distinct CMS, OIG, HRSA, FDA and peer datasets
  • 35 dataset pages: citable, downloadable /data catalog pages
  • 52 reproducible studies: each shipping the SQL behind its figures