
FOR · RAG DEVELOPERS


FHIR R4, retrieval-optimized for RAG.

Pre-resolved references, flat JSON, citation-ready provenance on every resource.

Read the API docs →

The problem · FHIR + RAG

FHIR JSON is reference-heavy. Pre-resolved references and pre-chunked text fix that.

Nested references

FHIR splits one provider across linked resources: a PractitionerRole points to a Practitioner, an Organization, and a Location by reference. Building context for a single provider means fetching all four, which is 4 round trips.

Fonteum: All references are pre-resolved. The response is a flat, fully populated bundle ready for chunking.
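
To make "pre-resolved" concrete, here is a minimal sketch of the client-side work that a flattened bundle replaces: inlining every `reference` in a PractitionerRole from a local lookup table. The resource shapes follow FHIR R4, but the `inline_references` helper, the sample IDs, and the organization name are illustrative assumptions, not part of the Fonteum API.

```python
from typing import Any


def inline_references(resource: Any, store: dict, _seen: frozenset = frozenset()) -> Any:
    """Recursively replace {"reference": "Type/id"} with the referenced resource."""
    if isinstance(resource, dict):
        ref = resource.get("reference")
        # Only follow references we hold locally and have not already expanded
        # on this path (guards against reference cycles).
        if isinstance(ref, str) and ref in store and ref not in _seen:
            return inline_references(store[ref], store, _seen | {ref})
        return {k: inline_references(v, store, _seen) for k, v in resource.items()}
    if isinstance(resource, list):
        return [inline_references(item, store, _seen) for item in resource]
    return resource


# Hypothetical resources, keyed the way FHIR references are written.
store = {
    "Practitioner/prac-1": {"resourceType": "Practitioner", "id": "prac-1"},
    "Organization/org-1": {"resourceType": "Organization", "id": "org-1",
                           "name": "Example Cardiology"},
    "Location/loc-1": {"resourceType": "Location", "id": "loc-1",
                       "managingOrganization": {"reference": "Organization/org-1"}},
}
role = {
    "resourceType": "PractitionerRole",
    "practitioner": {"reference": "Practitioner/prac-1"},
    "organization": {"reference": "Organization/org-1"},
    "location": [{"reference": "Location/loc-1"}],
}

flat = inline_references(role, store)
```

With raw FHIR, filling `store` is where the extra round trips go; a pre-resolved response arrives already in the shape of `flat`.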

Token bloat

Raw FHIR JSON carries coding system URIs, meta fields, and extension blocks your model doesn't need. A single Practitioner can run 300+ tokens.

Fonteum: The /api/v1/rag/chunks export returns pre-chunked text with stable chunk IDs — same clinical data, fewer tokens per chunk than raw FHIR JSON.

Missing citations

When an LLM cites a provider fact, you need a traceable source and date. Base FHIR R4 does not require either on a resource.

Fonteum: Every Fonteum resource carries meta.source and a provenance tag block: source name, last-checked date, and display rule.
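
As a rough illustration, the tag block can be folded into a plain dict before rendering a citation. The two `system` values come from the sample Practitioner on this page; the `provenance_from_tags` helper and its output key names are assumptions made for this sketch.

```python
def provenance_from_tags(resource: dict) -> dict:
    """Collect fonteum:* tags from meta.tag into {short_name: code}."""
    out = {}
    for tag in resource.get("meta", {}).get("tag", []):
        system = tag.get("system", "")
        if system.startswith("fonteum:"):
            # "fonteum:last-checked" -> "last-checked"
            out[system.split(":", 1)[1]] = tag.get("code", "")
    return out


sample = {
    "resourceType": "Practitioner",
    "id": "prac-1003894328",
    "meta": {
        "tag": [
            {"system": "fonteum:provenance", "code": "cms-nppes"},
            {"system": "fonteum:last-checked", "code": "2026-05-24"},
        ]
    },
}

prov = provenance_from_tags(sample)
citation = f"Source: {prov['provenance']}, last checked {prov['last-checked']}"
```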


Token efficiency · FHIR JSON vs chunk text

The same provider data. Half the tokens.

The raw FHIR JSON response below carries the same provenance and clinical data as the compact text that the /api/v1/rag/chunks export returns, but it takes more room in your context window.

Raw FHIR JSON (58% more tokens than the chunk text):
{
  "resourceType": "Practitioner",
  "id": "prac-1003894328",
  "meta": {
    "tag": [
      { "system": "fonteum:provenance", "code": "cms-nppes" },
      { "system": "fonteum:last-checked", "code": "2026-05-24" }
    ]
  },
  "identifier": [
    { "system": "http://hl7.org/fhir/sid/us-npi", "value": "1003894328" }
  ],
  "name": [{ "family": "Nguyen", "given": ["Emily"], "prefix": ["MD"] }],
  "address": [
    {
      "use": "work",
      "line": ["400 Park Ave"],
      "city": "New York",
      "state": "NY",
      "postalCode": "10022"
    }
  ],
  "qualification": [
    {
      "code": {
        "coding": [
          {
            "system": "http://nucc.org/provider-taxonomy",
            "code": "207RC0000X",
            "display": "Cardiovascular Disease"
          }
        ]
      }
    }
  ]
}
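
For a sense of where the savings come from, the sketch below collapses the Practitioner above into a single line of text. The actual chunk layout returned by /api/v1/rag/chunks is not shown on this page, so the `practitioner_to_text` helper and its pipe-separated format are assumptions. Character counts stand in for token counts here.

```python
import json


def practitioner_to_text(p: dict) -> str:
    """Render a Practitioner as one compact line (layout is an assumption)."""
    name = (p.get("name") or [{}])[0]
    full_name = " ".join(name.get("given", []) + [name.get("family", "")]).strip()
    # Pick the NPI out of the identifier list by its system URI.
    npi = next(
        (i["value"] for i in p.get("identifier", [])
         if i.get("system", "").endswith("/us-npi")),
        "unknown",
    )
    # Keep only the human-readable specialty labels, not the coding system URIs.
    specialties = [
        c["display"]
        for q in p.get("qualification", [])
        for c in q.get("code", {}).get("coding", [])
        if "display" in c
    ]
    addr = (p.get("address") or [{}])[0]
    street = " ".join(addr.get("line", []))
    address = (f"{street}, {addr.get('city', '')}, "
               f"{addr.get('state', '')} {addr.get('postalCode', '')}")
    return " | ".join([full_name, f"NPI {npi}", "; ".join(specialties), address])


practitioner = {
    "resourceType": "Practitioner",
    "identifier": [{"system": "http://hl7.org/fhir/sid/us-npi", "value": "1003894328"}],
    "name": [{"family": "Nguyen", "given": ["Emily"]}],
    "address": [{"use": "work", "line": ["400 Park Ave"], "city": "New York",
                 "state": "NY", "postalCode": "10022"}],
    "qualification": [{"code": {"coding": [{
        "system": "http://nucc.org/provider-taxonomy",
        "code": "207RC0000X",
        "display": "Cardiovascular Disease",
    }]}}],
}

text = practitioner_to_text(practitioner)
raw = json.dumps(practitioner)
print(text)
print(len(text), "characters of text vs", len(raw), "characters of JSON")
```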

LangChain · integration walkthrough

Load Fonteum into LangChain.

The @fonteum/mcp package ships a createFonteumLangChainTools helper that wires all five Fonteum tools into LangChain. Pass the result as the tools array to any LangChain agent. Every tool response carries metadata.source from the Fonteum provenance block.

import { tool } from "@langchain/core/tools";
import { createFonteumLangChainTools } from "@fonteum/mcp/integrations/langchain";

// All five tools: search, resolve-by-NPI, exclusion check, dataset info, source list
const fonteumTools = createFonteumLangChainTools(tool, {
  apiKey: process.env.FONTEUM_API_KEY, // omit for free demo access
});

For MCP and Python paths, see docs/integrations.


LlamaIndex · VectorStoreIndex

Index provider data with LlamaIndex.

Load pre-chunked provider data from /api/v1/rag/chunks directly into a LlamaIndex VectorStoreIndex. Each node carries a stable chunk ID and provenance metadata for downstream citation generation.

import requests
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode

nodes, cursor = [], 0
while cursor is not None:  # loop ends when next_cursor comes back null
    resp = requests.get(
        "https://fonteum.com/api/v1/rag/chunks",
        params={"limit": 200, "cursor": cursor},
        timeout=30,
    )
    resp.raise_for_status()  # surface rate-limit and server errors early
    page = resp.json()
    for c in page["chunks"]:
        nodes.append(TextNode(
            text=c["text"],
            id_=c["chunk_id"],  # stable chunk ID from the export
            metadata={"cite": c["cite"], "source_url": c["source_url"], **c["provenance"]},
        ))
    cursor = page["next_cursor"]

index = VectorStoreIndex(nodes)
query_engine = index.as_query_engine()
response = query_engine.query("cardiologists accepting Medicare in Manhattan")
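
One way to turn that metadata into footnotes is sketched below. It relies only on the `cite` and `source_url` keys shown in the loader above; the `format_footnotes` helper and the sample values are hypothetical.

```python
def format_footnotes(metadatas: list) -> list:
    """Build numbered, de-duplicated footnotes from chunk metadata dicts."""
    seen, notes = set(), []
    for meta in metadatas:
        key = (meta.get("cite"), meta.get("source_url"))
        # Skip repeats and any entry missing either field.
        if key in seen or not all(key):
            continue
        seen.add(key)
        notes.append(f"[{len(notes) + 1}] {key[0]} <{key[1]}>")
    return notes


# Hypothetical metadata, shaped like the dicts attached to each TextNode.
retrieved = [
    {"cite": "CMS NPPES", "source_url": "https://example.org/nppes"},
    {"cite": "CMS NPPES", "source_url": "https://example.org/nppes"},
    {"cite": "HHS OIG LEIE", "source_url": "https://example.org/leie"},
]
footnotes = format_footnotes(retrieved)
```

With a LlamaIndex response, the input would typically be `[n.node.metadata for n in response.source_nodes]`.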

FONTEUM FOR RAG

Every chunk arrives with 14 fields of provenance.

Read the API docs →
View dataset coverage →

Export · /api/v1/rag/chunks

Pre-chunked, citation-ready text for your vector store.

One GET returns a deterministic, paginated feed of dataset chunks. No language model sits in the pipeline, and the output is byte-stable across pulls, so re-embedding never churns your index. Every chunk arrives with a stable chunk_id (use it as your vector primary key), a cite string for footnotes, the upstream source_url, and the full 14-field provenance contract. Walk the corpus by passing ?limit= and ?cursor=, feeding each returned next_cursor into the next request.

# Plain REST — page through the corpus
curl "https://fonteum.com/api/v1/rag/chunks?limit=50&cursor=0"

# → { "total": N, "next_cursor": 50, "chunks": [
#     { "chunk_id": "source:nppes#overview",
#       "text": "...", "cite": "...", "source_url": "https://...",
#       "provenance": { "_source": "...", "_dataset_id": "...", ...14 fields } } ] }
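
The same walk can be wrapped in a generator. This is a sketch under two assumptions: that `next_cursor` is null (or absent) on the last page, and that the fetch function is injected so the loop can be exercised offline. `iter_chunks`, `http_fetch`, and `fake_fetch` are names made up for this example.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator


def iter_chunks(fetch_page: Callable[[int, int], dict], limit: int = 50) -> Iterator[dict]:
    """Yield every chunk, following next_cursor until it is null or absent."""
    cursor = 0
    while cursor is not None:
        page = fetch_page(cursor, limit)
        yield from page["chunks"]
        cursor = page.get("next_cursor")


def http_fetch(cursor: int, limit: int) -> dict:
    """Fetch one page from the public endpoint (not called in this sketch)."""
    query = urllib.parse.urlencode({"limit": limit, "cursor": cursor})
    url = f"https://fonteum.com/api/v1/rag/chunks?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def fake_fetch(cursor: int, limit: int) -> dict:
    """Offline stand-in that pages through five made-up chunks."""
    corpus = [{"chunk_id": f"source:demo#{i}", "text": f"chunk {i}"} for i in range(5)]
    end = cursor + limit
    return {
        "total": len(corpus),
        "chunks": corpus[cursor:end],
        "next_cursor": end if end < len(corpus) else None,
    }


ids = [c["chunk_id"] for c in iter_chunks(fake_fetch, limit=2)]
```

Swapping `fake_fetch` for `http_fetch` points the same loop at the live endpoint.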

Load it straight into a LangChain vector store. Each chunk maps to a Document whose metadata carries the provenance block — so generated citations trace back to the federal source and snapshot date.

import requests
from langchain_core.documents import Document
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

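docs, cursor = [], 0
while cursor is not None:  # loop ends when next_cursor comes back null
    resp = requests.get(
        "https://fonteum.com/api/v1/rag/chunks",
        params={"limit": 200, "cursor": cursor},
        timeout=30,
    )
    resp.raise_for_status()  # surface rate-limit and server errors early
    page = resp.json()
    for c in page["chunks"]:
        docs.append(Document(
            page_content=c["text"],
            id=c["chunk_id"],
            metadata={
                "cite": c["cite"],
                "source_url": c["source_url"],
                # the sample response carries the dataset ID inside provenance
                "dataset_id": c["provenance"]["_dataset_id"],
                **c["provenance"],
            },
        ))
    cursor = page["next_cursor"]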

index = FAISS.from_documents(docs, OpenAIEmbeddings())

Public endpoint, rate limited per source IP. Deterministic IDs mean an incremental re-pull upserts cleanly, with no duplicate vectors.
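
Because IDs are deterministic and text is byte-stable, a re-pull can also skip chunks that have not changed. The sketch below assumes a local map of chunk_id to text digest standing in for your vector store's bookkeeping; `plan_upsert` is a hypothetical helper, not a Fonteum or LangChain API.

```python
import hashlib


def plan_upsert(existing: dict, chunks: list) -> list:
    """Return chunks that are new or changed; record their digests in `existing`."""
    to_embed = []
    for c in chunks:
        digest = hashlib.sha256(c["text"].encode("utf-8")).hexdigest()
        if existing.get(c["chunk_id"]) != digest:
            existing[c["chunk_id"]] = digest
            to_embed.append(c)
    return to_embed


index_state = {}  # chunk_id -> sha256 of the text that was last embedded
first_pull = [
    {"chunk_id": "source:demo#a", "text": "alpha"},
    {"chunk_id": "source:demo#b", "text": "beta"},
]
second_pull = [
    {"chunk_id": "source:demo#a", "text": "alpha"},          # unchanged
    {"chunk_id": "source:demo#b", "text": "beta, revised"},  # changed upstream
]

embedded_first = plan_upsert(index_state, first_pull)
embedded_second = plan_upsert(index_state, second_pull)
```

Only the chunks returned by `plan_upsert` need to be re-embedded and written back under their existing IDs.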

See a live sample response →

Cited answers · /api/v1/ask

Ask a structured question, get cited facts back.

A typed question (ownership, sanctions, payments) plus an identifier returns a fact list in which every fact carries the full 14-field provenance contract. The endpoint is deterministic and uses no language model: “0 records match” is an answer, never a silent empty response. The contract is strict: 400 on an unknown question, unknown scheme, or malformed value; 502 when a source read fails (never masked as an empty answer); and a hard internal gate that refuses any fact missing its provenance.

curl "https://fonteum.com/api/v1/ask?question=sanctions&scheme=npi&value=1003000118"

# → { "question": "sanctions",
#     "answer_text": "0 OIG LEIE exclusion records match NPI 1003000118.",
#     "facts": [ /* each with a 14-field provenance block */ ],
#     "citations": [ { "source": "HHS OIG LEIE", "source_url": "..." } ] }
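
On the client side, that contract maps naturally onto three outcomes. The sketch below takes an already-received status code and parsed body, so it makes no assumption about your HTTP library. The exception classes and the `interpret_ask` helper are hypothetical, and the empty `facts` list in the sample body is an assumption about a zero-match response.

```python
class AskRequestError(ValueError):
    """The request was rejected: unknown question, unknown scheme, or bad value."""


class AskSourceError(RuntimeError):
    """An upstream source read failed; the answer is unknown, not empty."""


def interpret_ask(status: int, body: dict) -> dict:
    """Map an /api/v1/ask response onto an answer or a typed error."""
    if status == 400:
        raise AskRequestError("fix the question, scheme, or value before retrying")
    if status == 502:
        raise AskSourceError("source read failed; retry later, do not treat as 'no records'")
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    return {
        "answer": body["answer_text"],
        "facts": body.get("facts", []),
        "sources": [c["source"] for c in body.get("citations", [])],
    }


# Body shaped like the sample response above.
sample_body = {
    "question": "sanctions",
    "answer_text": "0 OIG LEIE exclusion records match NPI 1003000118.",
    "facts": [],
    "citations": [{"source": "HHS OIG LEIE", "source_url": "..."}],
}
result = interpret_ask(200, sample_body)
```

Keeping 502 as its own exception preserves the distinction the API draws between "no records match" and "could not check".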

Token counts · by resource type

Average token counts per resource.

Resource            JSON tokens   Chunk tokens   Reduction
Practitioner        312           148            −53%
Organization        298           131            −56%
Location            187           89             −52%
PractitionerRole    224           104            −54%
HealthcareService   341           162            −52%

Token counts measured with tiktoken cl100k_base on a representative sample of 500 records per resource type. Actual counts vary by record.
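
The reduction column follows from the two token columns as 1 − chunk/JSON, rounded to the nearest percent. A short check using the averages from the table:

```python
# (JSON tokens, chunk tokens) per resource type, copied from the table above.
averages = {
    "Practitioner": (312, 148),
    "Organization": (298, 131),
    "Location": (187, 89),
    "PractitionerRole": (224, 104),
    "HealthcareService": (341, 162),
}


def reduction_pct(json_tokens: int, chunk_tokens: int) -> int:
    """Percentage of tokens saved by the chunk text, rounded to a whole percent."""
    return round((1 - chunk_tokens / json_tokens) * 100)


reductions = {name: reduction_pct(j, c) for name, (j, c) in averages.items()}
for name, pct in reductions.items():
    print(f"{name}: -{pct}%")
```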


Latency benchmarks · under load

Sub-300 ms at p99.

Percentile   FHIR endpoint
p50          38 ms
p95          142 ms
p99          290 ms
p99.9        480 ms

Measured at the Vercel edge with 50 concurrent connections. Latency is measured from gateway receipt to response complete. Source data is served from a warm CDN cache; a cold cache adds ~80 ms.

Get API access →

FONTEUM · PILOT

Run a 90-day pilot. Public data only. No PHI.

Request access →
Read the methodology →

Built on the authoritative federal record

The primary sources, named on every page.

These are the federal agencies whose public datasets Fonteum ingests and attributes — the issuing authorities, not customers or partners. Every figure on the site links back to one of them.

  • CMS
  • HHS-OIG
  • HRSA
  • FDA
  • NLM
  • NUCC
  • Census
  • BLS
  • BEA

See the full source registry, with license and refresh cadence for each →

Reproducible by design

Every figure traces to its federal source.

14-field provenance

Every rendered fact ties to a source URL, dataset ID, snapshot date, row key, and SHA-256 digest, among other fields: the full chain-of-custody record.
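
A digest in the provenance record makes an independent check possible. This page does not say which bytes the SHA-256 covers, so the sketch below assumes it is computed over the raw bytes of the source row. The `row_matches_digest` helper and the sample row are hypothetical.

```python
import hashlib
import hmac


def row_matches_digest(row_bytes: bytes, expected_hex: str) -> bool:
    """Check a row against its recorded SHA-256 (assumes the digest covers raw row bytes)."""
    actual = hashlib.sha256(row_bytes).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Made-up row; in practice this would be the bytes fetched from the cited snapshot.
row = b"1003894328,NGUYEN,EMILY,207RC0000X"
recorded = hashlib.sha256(row).hexdigest()

ok = row_matches_digest(row, recorded)
tampered = row_matches_digest(row + b" ", recorded)
```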

Reproducible SQL

Each study ships the exact query behind its figures, run against the cited federal snapshot. Re-run it yourself.

Daily reconciliation

Published counts are reconciled against the upstream federal datasets on a daily cadence, with drift logged.

Named medical review

Reviewed by Jennifer Montecillo, MD, non-practicing medical reviewer.

Read the full provenance and attestation methodology →

Two doors

Use the free API and open data

Query providers, facilities, sanctions, and quality scores — each field carrying its federal source. Self-serve, no call to start.

Explore the API →
Browse the data catalog →

Talk to us

Managed pilots, enterprise terms, and audit-ready, signed attestation packages for compliance, risk, and research teams.

Talk to us →

© 2026 Fonteum LLC. All rights reserved.

The U.S. healthcare graph AI can cite — every fact carries its source.


The substrate, by the numbers

Count   Metric                    Description
9.2M    graph entities            Providers, organizations, owners, and facilities
12.5M   linked identifiers        NPIs, CCNs, LEIs and more, resolved to entities
4.7M    graph edges               Source-attested relationships between entities
44      federal source families   Distinct CMS, OIG, HRSA, FDA and peer datasets
33      dataset pages             Citable, downloadable /data catalog pages
48      reproducible studies      Each shipping the SQL behind its figures