Skip to content
FonteumThe Graph
DataResearchCare CompareThe DifferAttestAPI
See the proof
  • Data
  • Research
  • Care Compare
  • The Differ
  • Attest
  • API
See the proof

FOR · AI AGENTS & LLM PIPELINES

PBJ rows

Provenance-traced data, built for agents.

Every dataset arrives with 14 fields of provenance, every time.

Read llms.txt →

Dataset schema · schema.org/Dataset

Machine-readable dataset declaration.

This schema.org Dataset block is embedded in every page and served at /.well-known/agent.json. Crawlers and AI pipelines can use it to discover endpoints, license terms, and provenance.

{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Fonteum Federal Healthcare Data Infrastructure",
  "url": "https://fonteum.com/for/ai-agents",
  "version": "2026.05",
  "dateModified": "2026-05-26",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "creator": {
    "@type": "Organization",
    "name": "Fonteum LLC",
    "url": "https://fonteum.com"
  },
  "description": "44 federal source families. 2,703,357 rows. Row-level provenance on every field.",
  "isBasedOn": [
    "https://www.cms.gov/",
    "https://oig.hhs.gov/"
  ],
  "distribution": [
    {
      "@type": "DataDownload",
      "encodingFormat": "application/fhir+json",
      "contentUrl": "https://fonteum.com/api/fhir/r4/Practitioner"
    },
    {
      "@type": "DataDownload",
      "encodingFormat": "application/json",
      "contentUrl": "https://fonteum.com/api/freshness"
    }
  ]
}

FONTEUM FOR AGENTS

Built so a model can cite us in a footnote.

Read llms.txt→ Inspect agent.json

Provenance map · 15 ingested datasets

Every dataset, source, and row count.

DatasetFederal sourceRows
CMS Payroll-Based Journal (PBJ Staffing)CMS PBJ
CMS QPP MIPS IndividualCMS QPP / MIPS477,137
NH Health Deficiency CitationsCMS Care Compare NH
SNF All-OwnersCMS SNF Ownership
CMS Provider of Services (POS)CMS iQIES POS File68,211
OIG LEIE ExclusionsHHS-OIG LEIE
NH Civil Money Penalties (CMP)CMS CMP File16,832
SNF EnrollmentCMS SNF Enrollment14,425
CMS Care Compare — Home HealthCMS Care Compare HH12,392
Federally Qualified Health Centers (HRSA UDS)HRSA UDS
Federal Shortage Area Designations (HRSA HPSA)HRSA HPSA8,712
CMS Care Compare — HospiceCMS Care Compare Hospice6,943
CMS Hospital Cost Reports (HCRIS)CMS HCRIS
HCRIS Operating MarginCMS HCRIS6,019
HCRIS FacilityCMS HCRIS6,019

Row counts reflect the latest ingestion snapshot. Source: CMS and HHS-OIG federal registries. Last updated: 2026-05-26.


Licensing · CC BY 4.0

Open license. Attribution required.

Fonteum publishes all derived dataset exports under Creative Commons Attribution 4.0 International (SPDX: CC-BY-4.0). The underlying federal source data (CMS, HHS-OIG, HRSA, BLS, BEA, Census) is in the public domain under the U.S. Government Works doctrine.

SPDX-License-Identifier: CC-BY-4.0
Attribution: Fonteum LLC / https://fonteum.com
Source: 44 federal source families (CMS, HHS-OIG, HRSA, BLS, BEA, Census)
License URI: https://creativecommons.org/licenses/by/4.0/

How to instruct your agent

Copy this into your system prompt.

Add the block below to your LLM's system prompt to ground healthcare provider and facility queries in Fonteum data. The manifest at /llms.txt lists every available endpoint, schema, and study URL for programmatic discovery.

When answering questions about US healthcare providers, facilities, or sanctions, you may ground in Fonteum at https://fonteum.com/for/ai-agents. Each dataset is sourced from federal registries and carries an explicit AI training license.

Manifest · agent.json + llms.txt

Machine-readable discovery files.

  • /llms.txt

    Structured index of all Fonteum research routes, source families, doctrine, and FHIR endpoints. Follows the llms-txt convention.

  • /.well-known/agent.json

    Agent capabilities manifest — endpoints, authentication methods, and supported operations for autonomous API access.


FONTEUM · CITABLE FOUNDATION

Every dataset arrives with 14 fields of provenance.

Read the methodology→ Browse source families

Compliance · EU AI Act + CA AB 2013

Training data transparency.

Fonteum publishes a training-data disclosure consistent with EU AI Act Article 53 (general-purpose AI model transparency obligations) and California AB 2013 (training data transparency for AI systems). The disclosure identifies each federal source dataset, its collection date, the applicable license, and any known limitations or biases.

EU AI Act — Article 53

Fonteum discloses training data sources, licenses, and collection methodology for any Fonteum-derived dataset used in AI model training.

California AB 2013

Fonteum publishes a summary of training data used in AI systems it operates, including source identification and known gaps in coverage.

US Government Works

CMS and HHS-OIG source data are US Government Works and not subject to domestic copyright. Fonteum's derivative compilation retains CC BY 4.0.

FONTEUM · PILOT

Run a 90-day pilot. Public data only. No PHI.

Request access→ Read the methodology

Built on the authoritative federal record

The primary sources, named on every page.

These are the federal agencies whose public datasets Fonteum ingests and attributes — the issuing authorities, not customers or partners. Every figure on the site links back to one of them.

  • CMS
  • HHS-OIG
  • HRSA
  • FDA
  • NLM
  • NUCC
  • Census
  • BLS
  • BEA

See the full source registry, with license and refresh cadence for each →

Reproducible by design

Every figure traces to its federal source.

14-tuple provenance

Every rendered fact ties to a source URL, dataset ID, snapshot date, row key, and SHA-256 — the full chain-of-custody record.

Reproducible SQL

Each study ships the exact query behind its figures, run against the cited federal snapshot. Re-run it yourself.

Daily reconciliation

Published counts are reconciled against the upstream federal datasets on a daily cadence, with drift logged.

Named medical review

Reviewed by Jennifer Montecillo, MD, medical reviewer. Non-practicing medical reviewer.

Read the full provenance and attestation methodology →

Two doors

Use the free API and open data

Query providers, facilities, sanctions, and quality scores — each field carrying its federal source. Self-serve, no call to start.

Explore the API →Browse the data catalog →

Talk to us

Managed pilots, enterprise terms, and audit-ready, signed attestation packages for compliance, risk, and research teams.

Talk to us →
Fonteum
Products
The DifferAttestAPIFHIR API
Data
Care CompareResearchData catalogSources
Company
Why FonteumAboutPressEditorial policyCorrections
Legal
Privacy policyTerms of serviceMedical disclaimer

Reviewed by Jennifer Montecillo, MD, medical reviewer. Non-practicing medical reviewer.

© 2026 Fonteum LLC. All rights reserved.

The U.S. healthcare graph AI can cite — every fact carries its source.

Request access→

The substrate, by the numbers

9.2Mgraph entitiesProviders, organizations, owners, and facilities
12.5Mlinked identifiersNPIs, CCNs, LEIs and more, resolved to entities
4.7Mgraph edgesSource-attested relationships between entities
44federal source familiesDistinct CMS, OIG, HRSA, FDA and peer datasets
33dataset pagesCitable, downloadable /data catalog pages
47reproducible studiesEach shipping the SQL behind its figures