AINPI · v1.0.0

The State of the National Provider Directory

Six pre-registered findings against the CMS NPD 2026-04-09 release.

NPD release: 2026-04-09
Methodology: v0.6.0-draft
Resources audited: 27.2M
Findings published: 6 / 6

Author: Eugene Vestel, FHIR IQ · gene@fhiriq.com · ainpi.vercel.app
Generated at: 2026-04-21T14:00:00Z

Executive summary

The CMS National Provider Directory (NPD) ships 27.2M FHIR R4 resources across six resource types. AINPI is an independent, open-source audit of that bulk export. The pre-registered hypotheses (bundled into six finding slugs) were declared publicly before the results were computed; this report publishes those results.

The findings cluster in four narratives. First, the NPD's own meta.lastUpdated timestamp is a bulk-export stamp, not a per-resource freshness signal — the 30-day and 90-day regulatory update cadences cannot be measured from the bulk files. Second, referential integrity is essentially perfect where references are declared, but coverage of the Endpoint↔Organization link is sparse in both directions. Third, the NPD uses two different specialty code systems on two different resources with no cross-walk. Fourth, declared FHIR-REST endpoints clear the 85% Medicare Advantage network-adequacy implied ceiling on basic reachability but not on SMART discovery.

Every number in this report is reproducible from the scripts in the analysis/ directory of the repository. Methodology at /methodology.

H1 · H2 · H3 · H4 · H5

Endpoint liveness

Full crawl of 2,974 distinct FHIR-REST hosts in the NDH: 93.3% answered HTTP, 85.4% served a parseable CapabilityStatement, 81.6% published valid SMART well-known, 90.3% answered an unauthenticated Practitioner?_count=1 with 200/401. Across the full NDH endpoint population: 5,043,524 endpoints total (74.2% FHIR-REST, 25.8% Direct Project); 98.7% of Organizations carry zero Endpoint references.

2.5K / 3.0K = 85.37%

L3 any HTTP: 93.3%
L3 strict (200/30x/401): 62.4%
L4 CS parseable: 85.4%
L5 CS conformant: 85.4%
L6 SMART valid: 81.6%
L7 unauth search: 90.3%

unit: percent

Null hypothesis

FHIR-REST endpoints are reachable and conformant at or above the implied 85% network-adequacy ceiling.

Denominator

All `Endpoint` resources with `connectionType` in the FHIR-REST family.

Data source

CMS NPD bulk export + live HTTP probes run by the `ainpi-probe` crawler against declared `Endpoint.address` URLs.
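As a minimal sketch of how probe targets could be derived from a declared `Endpoint.address`: the `/metadata` path (CapabilityStatement) and `/.well-known/smart-configuration` path (SMART discovery) come from the FHIR and SMART App Launch specifications, and the `Practitioner?_count=1` search is the one named in this finding. The function names, dictionary keys, and the "first address seen wins" rule for picking one endpoint per host are illustrative assumptions, not the `ainpi-probe` implementation.

```python
from urllib.parse import urlsplit


def probe_urls(address: str) -> dict[str, str]:
    """Build the discovery URLs for one FHIR base address."""
    base = address.rstrip("/")
    return {
        "capability_statement": f"{base}/metadata",
        "smart_well_known": f"{base}/.well-known/smart-configuration",
        "unauth_search": f"{base}/Practitioner?_count=1",
    }


def host_key(address: str) -> str:
    """Lower-cased hostname, used to group endpoints that share a host."""
    return (urlsplit(address).hostname or "").lower()


def one_endpoint_per_host(addresses: list[str]) -> dict[str, str]:
    """Keep a single address per host (assumption: the first one seen)."""
    chosen: dict[str, str] = {}
    for address in addresses:
        chosen.setdefault(host_key(address), address)
    return chosen


if __name__ == "__main__":
    sample = [
        "https://fhir.example.org/r4/",
        "https://FHIR.example.org/r4/tenant-b",
        "https://api.example.net/fhir",
    ]
    for host, address in one_endpoint_per_host(sample).items():
        print(host, probe_urls(address)["capability_statement"])
```

Grouping by host before probing keeps the request volume per server low, which is consistent with the per-host rate limit described in the notes for this finding.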

Notes

Probed 2,974 distinct FHIR-REST hosts (one endpoint per host, stratified by host).

Crawl settings: 16 global concurrency, 1 rps per host, 10s connect / 30s read timeouts, exponential backoff on 429/503, User-Agent AINPI-DirectoryQualityBot/1.0.

fhirVersion declared by L5-conformant hosts: 4.0.1: 1,599; 4.0.0: 938; 3.0.2: 1; 3.0.1: 1.

CapabilityStatement software, top 5: Firely Server: 505; Epic: 456; Fhir Server: 282; Altera FHIR: 71; 1up FHIR Server: 42.

Highest level reached (level: hosts): 7: 2,687; 6: 14; 5: 14; 3: 39; 2: 28; 1: 23; 0: 33; -1: 136.

Certs expiring within 90 days of release: 271.

H4 (connection type distribution) and H5 (orgs without endpoints) were computed over the full population in BigQuery. H4 connection types: hl7-fhir-rest: 3,741,547; direct-project: 1,301,977. H5: of 3,603,262 Organizations, 3,556,049 are not referenced by the managingOrganization of any Endpoint.
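The headline percentages for this finding can be re-derived from the counts published in these notes. The sketch below only does that arithmetic; reading the sum of the fhirVersion counts as the L5 numerator is an inference from the matching 85.37% figure, not something the report states outright, and the variable names are illustrative.

```python
# Counts copied from the notes for this finding.
highest_level = {7: 2687, 0: 33, -1: 136, 1: 23, 2: 28, 3: 39, 6: 14, 5: 14}
fhir_versions = {"4.0.1": 1599, "4.0.0": 938, "3.0.2": 1, "3.0.1": 1}
connection_types = {"hl7-fhir-rest": 3741547, "direct-project": 1301977}
orgs_total, orgs_without_endpoint = 3603262, 3556049


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to two decimals."""
    return round(100 * numerator / denominator, 2)


hosts_probed = sum(highest_level.values())
l7_share = pct(highest_level[7], hosts_probed)
# Assumption: every L5-conformant host declared exactly one fhirVersion.
l5_share = pct(sum(fhir_versions.values()), hosts_probed)
rest_share = pct(connection_types["hl7-fhir-rest"], sum(connection_types.values()))
orphan_org_share = pct(orgs_without_endpoint, orgs_total)

if __name__ == "__main__":
    print(hosts_probed, l7_share, l5_share, rest_share, orphan_org_share)
```

Note that the "highest level reached" counts are not cumulative pass rates: the L6 SMART rate (81.6%) is lower than the L7 rate, so a host can reach L7 without passing L6.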

H9 · H10 · H11 · H12 · H13

NPI and taxonomy correctness

95.72% of 10.9M NDH NPIs clear NPPES (0.79% ghost, 3.49% deactivated). Practitioner name agreement: 94.9% exact → 95.3% normalized → 97.9% Jaro-Winkler ≥0.85. Organization name: 56.3% exact → 88.0% normalized → 98.8% Jaro-Winkler ≥0.85 (closes the 44-point exact-match gap to 1pp). NDH carries NUCC on Practitioner.qualification (99.83% valid) AND Medicare Specialty codes on PractitionerRole.specialty (99.98% valid against the CMS-published crosswalk). Internal cross-system consistency: 85.8% of 3.3M Practitioner↔Role pairs agree via the crosswalk. External NUCC agreement NDH↔NPPES: 93.7% match NPPES's switch='Y' TRUE primary, 99.7% match any of the 15 slots, 6.0% match only a secondary. Slot_1 is NOT always the true primary (14.93% of rows).

2.9M / 3.3M = 85.79%

H10 NPPES match OK: 95.7%
H10 not in NPPES: 0.789%
H10 deactivated in NPPES: 3.49%
H11 Prac exact: 94.9%
H11 Prac normalized: 95.3%
H11 Prac JW ≥0.85: 97.9%
H11 Org exact: 56.3%
H11 Org normalized: 88.0%
H11 Org JW ≥0.85: 98.8%
H12 NUCC valid: 99.8%
H12 CMS code valid: 100.0%
H13 internal crosswalk: 85.8%
H13 NDH↔NPPES slot 1: 92.0%
H13 NDH↔NPPES true primary: 93.7%
H13 NDH↔NPPES any of 15: 99.7%

unit: percent

Null hypothesis

NPI structural validity is ≥99.9% and NDH-to-NPPES agreement on name and primary specialty is within documented drift thresholds.
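The report does not spell out how NPI structural validity was tested. One common reading is the standard NPI check-digit rule: a 10-digit number whose Luhn checksum is valid once the constant prefix 80840 is prepended. The sketch below implements that rule; the function name is illustrative and this is not necessarily the check AINPI runs.

```python
def npi_is_structurally_valid(npi: str) -> bool:
    """Return True if npi is 10 ASCII digits with a valid Luhn check digit.

    The Luhn sum is computed over the 15-digit string "80840" + npi,
    which is the standard prefix used for the NPI check digit.
    """
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    total = 0
    # Walk right to left; double every second digit, starting with the
    # digit immediately left of the check digit.
    for position, char in enumerate(reversed("80840" + npi)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    for candidate in ("1234567893", "1234567890", "12345"):
        print(candidate, npi_is_structurally_valid(candidate))
```

A structural check of this kind says nothing about whether the NPI exists in NPPES; that is what the H10 match, ghost, and deactivated rates measure.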

Denominator

All `Practitioner` and `Organization` resources with an NPI identifier.

Data source

CMS NPD bulk export joined against the NPPES monthly full dissemination file (V.2) and the current NUCC quarterly code set.
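NPPES stores up to 15 (taxonomy code, primary switch) pairs per NPI, and the H13 external comparison distinguishes a match on the switch='Y' slot from a match on any other slot. A minimal sketch of that classification follows; the function names, bucket labels, and the list-of-tuples input shape are assumptions for illustration, not the BigQuery logic behind the published numbers.

```python
from collections import Counter


def classify_taxonomy(ndh_code: str, nppes_slots: list[tuple[str, str]]) -> str:
    """Place one NDH taxonomy code into a mutually exclusive bucket.

    nppes_slots is an ordered list of (taxonomy_code, primary_switch)
    pairs; empty slots may appear as ("", "").
    """
    primaries = {code for code, switch in nppes_slots if code and switch == "Y"}
    all_codes = {code for code, _ in nppes_slots if code}
    if ndh_code in primaries:
        return "true_primary"
    if ndh_code in all_codes:
        return "secondary_only"
    return "no_match"


def matches_slot_1(ndh_code: str, nppes_slots: list[tuple[str, str]]) -> bool:
    """The older proxy: does the NDH code equal whatever sits in slot 1?"""
    return bool(nppes_slots) and nppes_slots[0][0] == ndh_code


if __name__ == "__main__":
    # Toy rows: the true primary sits in slot 2, not slot 1.
    slots = [("207Q00000X", "N"), ("207R00000X", "Y"), ("", "")]
    rows = ["207R00000X", "207Q00000X", "363L00000X"]
    print(Counter(classify_taxonomy(code, slots) for code in rows))
    print(matches_slot_1("207R00000X", slots))
```

Keeping the slot-1 test separate from the bucket function makes it easy to report both the old proxy and the switch-aware result side by side.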

Notes

Source: bigquery-public-data.nppes.npi_raw (updated 2026-02-09, 9.37M NPIs) + .healthcare_provider_taxonomy_code_set_170 + CMS Medicare Provider and Supplier Taxonomy Crosswalk (2025-10, 565 rows, 1-to-many).

H11 v2 methodology uses three tiers:
(1) exact match on UPPER(TRIM);
(2) normalized match that strips suffixes (LLC/INC/CORP/PC/PA/PLLC/LLP/LTD/CO/COMPANY/THE for Orgs; JR/SR/II–V/MD/DO/PHD/RN/NP/PA-C/FNP-BC/DMD/DDS/DVM/PHARMD for persons), drops non-alphanumerics, and collapses whitespace;
(3) Jaro-Winkler ≥0.85 via a BQ JS UDF.

Practitioner name: 6,893,725/7,139,700 family exact, 6,805,038 normalized full match, 6,990,597 at JW≥0.85, 6,833,522 at JW≥0.95.

Organization name: 1,840,638/3,270,089 exact, 2,878,882 normalized, 3,229,845 at JW≥0.85, 3,119,323 at JW≥0.95.

H12: NUCC codes on Practitioner.qualification (7,112,042/7,124,017 valid in NUCC v17.0); Medicare codes on PractitionerRole.specialty (3,344,800/3,345,518 valid in the crosswalk). NDH PractitionerRole._specialty_code carries a leading 'NN-' prefix (e.g. '14-50'); stripping it recovers the canonical Medicare code.

H13 internal: 3,337,053 Practitioner↔Role pairs, of which 2,862,934 agree via the crosswalk.
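The H13 internal check can be sketched as two steps: strip the leading 'NN-' prefix from the PractitionerRole specialty code, then test whether the Practitioner's NUCC code is among the crosswalk targets for that Medicare code. The one-entry crosswalk below is an illustrative stand-in, not the 565-row CMS file, and the function names are hypothetical.

```python
import re

# Illustrative fragment only: Medicare specialty code -> NUCC codes (1-to-many).
CROSSWALK: dict[str, set[str]] = {
    "50": {"363L00000X"},  # Nurse Practitioner
}


def canonical_medicare_code(raw: str) -> str:
    """Drop a leading two-digit 'NN-' prefix, e.g. '14-50' -> '50'."""
    return re.sub(r"^\d{2}-", "", raw.strip())


def pair_agrees(role_specialty_raw: str, qualification_nucc: str) -> bool:
    """True when the NUCC code is a crosswalk target of the Medicare code."""
    targets = CROSSWALK.get(canonical_medicare_code(role_specialty_raw), set())
    return qualification_nucc in targets


if __name__ == "__main__":
    print(canonical_medicare_code("14-50"))
    print(pair_agrees("14-50", "363L00000X"))
    print(pair_agrees("14-50", "207Q00000X"))
```

Because the crosswalk is one-to-many, agreement is a set-membership test rather than an equality test; a pair counts as inconsistent only when the NUCC code is not among any of the targets.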
H13 confusion matrix, top 10 inconsistent (Medicare → qualification-NUCC) pairs:
• C6 (PRACTITIONER - HOSPITALIST) ↔ 207R00000X (Internal Medicine /): 39,150
• 80 (PRACTITIONER - CLINICAL SOCIAL WORKER) ↔ 104100000X (Social Worker /): 21,026
• 30 (PRACTITIONER - DIAGNOSTIC RADIOLOGY) ↔ 2085R0204X (Radiology / Vascular & Interventional Radiology): 17,599
• 29 (PRACTITIONER - PULMONARY DISEASE) ↔ 207RC0200X (Internal Medicine / Critical Care Medicine): 12,819
• 08 (PRACTITIONER - FAMILY PRACTICE) ↔ 207P00000X (Emergency Medicine /): 11,904
• 68 (PRACTITIONER - CLINICAL PSYCHOLOGIST) ↔ 103T00000X (Psychologist /): 8,637
• 06 (PRACTITIONER - CARDIOVASCULAR DISEASE (C) ↔ 207RI0011X (Internal Medicine / Interventional Cardiology): 8,133
• 26 (PRACTITIONER - PSYCHIATRY) ↔ 2084P0804X (Psychiatry & Neurology / Child & Adolescent Psychi): 7,084
• 05 (PRACTITIONER - ANESTHESIOLOGY) ↔ 390200000X (Student in an Organized Health Care Education/Trai): 6,835
• 50 (PRACTITIONER - NURSE PRACTITIONER) ↔ 207Q00000X (Family Medicine /): 6,628

H13 external (v3, switch-aware): NPPES stores 15 (taxonomy_code, primary_switch) pairs per NPI; exactly one should have switch='Y' (the TRUE primary). Five tallies (the true-primary, secondary-only, and disagree-entirely rows are mutually exclusive and together cover every row):
• Match NPPES true primary (switch='Y' slot): 6,672,407 (93.66%)
• Match any slot: 7,099,905 (99.66%)
• Match slot_1 specifically: 6,555,738 (92.02%)
• Match only a secondary (switch='N'): 427,498 (6.00%)
• Disagree entirely (not in any slot): 24,112 (0.34%)

Slot-ordering observation: 1,063,861 rows (14.93%) have the NPPES TRUE primary in a slot other than slot_1, so the prior 'slot_1' proxy for 'primary' understated agreement (92.02% vs 93.66%). 0 rows (0.00%) have no switch='Y' at all (NPPES data-quality edge).

Known caveats:
• NPPES vintage 2026-02-09 vs NDH 2026-04-09: the 8-week gap means taxonomy changes in that window show up as disagreement.
• Jaro-Winkler ≥0.85 is a permissive threshold that recovers common variations (whitespace, DBA suffixes, casing) but also accepts some false positives (e.g. 'Smith Medical' vs 'Smith Medicare'); the 0.95 column is the strict signal.

v2 upgrade candidates: pinned quarterly NUCC; NPPES secondary-taxonomy match; phonetic fallback (Soundex / Metaphone) for names where JW misses transpositions.
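The three H11 tiers (exact, normalized, Jaro-Winkler) can be sketched in plain Python. This is a stand-in for the BigQuery JS UDF, which is not reproduced in this report: it covers only the Organization suffix list, applies Jaro-Winkler to the normalized strings, and uses the textbook prefix weight of 0.1, all of which are assumptions.

```python
import re

ORG_SUFFIXES = {"LLC", "INC", "CORP", "PC", "PA", "PLLC", "LLP", "LTD",
                "CO", "COMPANY", "THE"}


def normalize_org(name: str) -> str:
    """Upper-case, drop non-alphanumerics, drop suffix tokens, collapse spaces."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", name.upper())
    return " ".join(tok for tok in cleaned.split() if tok not in ORG_SUFFIXES)


def jaro(s: str, t: str) -> float:
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit, t_hit = [False] * len(s), [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        # Look for an unused equal character within the match window.
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_seq = [c for c, hit in zip(s, s_hit) if hit]
    t_seq = [c for c, hit in zip(t, t_hit) if hit]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s: str, t: str, prefix_weight: float = 0.1) -> float:
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):  # common prefix, capped at 4 characters
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_weight * (1 - base)


def match_tier(a: str, b: str, threshold: float = 0.85) -> str:
    if a.strip().upper() == b.strip().upper():
        return "exact"
    na, nb = normalize_org(a), normalize_org(b)
    if na == nb:
        return "normalized"
    return "fuzzy" if jaro_winkler(na, nb) >= threshold else "no_match"


if __name__ == "__main__":
    print(match_tier("Acme Health, LLC", "ACME HEALTH"))
    print(match_tier("Smith Medical", "Smith Medicare"))
```

Returning the first tier that matches mirrors how the report presents the tiers as a cumulative ladder: each looser tier only adds pairs the stricter one missed.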

H18

Temporal staleness

100.0% of NPD resources carry a meta.lastUpdated value dated on the release day (2026-04-09). Five of the six resource types have a single distinct meta.lastUpdated value and Organization has five, so meta.lastUpdated on the NPD bulk public-use files is a bulk-export stamp, not a per-resource freshness signal.

27.2M / 27.2M = 100.00%

Distinct meta.lastUpdated values per resource type:
practitioner: 1
organization: 5
location: 1
endpoint: 1
practitioner_role: 1
organization_affiliation: 1

unit: count

Null hypothesis

A majority of resources carry a `meta.lastUpdated` within the 90-day statutory threshold.

Denominator

All resources across all six NDH resource types that carry a populated `meta.lastUpdated`.

Data source

CMS NPD bulk export (pinned release).

Notes

Distinct meta.lastUpdated values per resource type on release day 2026-04-09:
• practitioner: 1 distinct, 100.00% at modal
• organization: 5 distinct, 55.48% at modal
• location: 1 distinct, 100.00% at modal
• endpoint: 1 distinct, 100.00% at modal
• practitioner_role: 1 distinct, 100.00% at modal
• organization_affiliation: 1 distinct, 100.00% at modal

Regulatory compliance with the 30-day CMS-9115-F or 90-day REAL Health Providers Act / No Surprises Act update cadence CANNOT be measured from meta.lastUpdated on the bulk files; a per-record freshness signal from upstream NPPES (enumeration_date / last_updated) or PECOS would be required.
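The "distinct values" and "share at modal" figures can be computed in one pass over an NDJSON file for a single resource type. The sketch below assumes one FHIR resource per line and compares full timestamp strings; whether the published counts compare full timestamps or dates only is not stated, so treat that as an assumption, along with the function name.

```python
import json
from collections import Counter


def freshness_profile(ndjson_lines) -> dict:
    """Count distinct meta.lastUpdated values and the modal value's share."""
    stamps: Counter = Counter()
    for line in ndjson_lines:
        if not line.strip():
            continue  # tolerate blank lines
        resource = json.loads(line)
        stamp = resource.get("meta", {}).get("lastUpdated")
        if stamp:
            stamps[stamp] += 1
    total = sum(stamps.values())
    if total == 0:
        return {"distinct": 0, "modal_share_pct": 0.0}
    modal_count = stamps.most_common(1)[0][1]
    return {"distinct": len(stamps),
            "modal_share_pct": round(100 * modal_count / total, 2)}


if __name__ == "__main__":
    sample = [
        '{"resourceType": "Organization", "meta": {"lastUpdated": "2026-04-09T01:00:00Z"}}',
        '{"resourceType": "Organization", "meta": {"lastUpdated": "2026-04-09T01:00:00Z"}}',
        '{"resourceType": "Organization", "meta": {"lastUpdated": "2026-04-09T01:00:00Z"}}',
        '{"resourceType": "Organization", "meta": {"lastUpdated": "2026-04-09T02:00:00Z"}}',
    ]
    print(freshness_profile(sample))
```

A result of one distinct value with a 100% modal share is the signature of an export-time stamp: it shows when the file was written, not when each record last changed.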

H6 · H7 · H8

Referential integrity

Referential integrity is clean but coverage is sparse. 0.000% of 17.0M declared cross-resource references dangle (target missing). But only 3.0% of Endpoints carry a managingOrganization (149,080 of 5,043,524) and only 76.0% of Locations do (2,654,922 of 3,494,239). H8 could not be tested: the NPD bulk export does not ship HealthcareService (the NDH IG defines 10 resources; NPD ships 6).

0 / 17.0M = 0.00%

PR → Practitioner (coverage): 100.0%
PR → Organization (coverage): 98.0%
Location → Org (coverage): 76.0%
Endpoint → Org (coverage): 2.96%

unit: percent

Null hypothesis

Cross-resource references resolve at ≥99% inside the bulk export.

Denominator

All reference fields across `PractitionerRole.practitioner`, `PractitionerRole.organization`, `Location.managingOrganization`, and `HealthcareService.providedBy`.

Data source

Edge tuples extracted from the NPD bulk export in a single streaming pass, queried in DuckDB.

Notes

Integrity (dangling rate among declared references): H6a PR→Practitioner 0.0000%, H6b PR→Organization 0.0000%, H7 Location→Org 0.0000%, Endpoint→Org 0.0000%. All are zero at four decimal places: when a reference is declared, it resolves.

Coverage (share of rows with the reference populated): PR→Practitioner 100.00% (required), PR→Organization 97.98%, Location→managingOrganization 75.98%, Endpoint→managingOrganization 2.96%. The Endpoint→Organization gap pairs with H5 (98.69% of Orgs have no Endpoint referencing them): the Endpoint↔Organization link is sparse in both directions.

H8 requires HealthcareService, which is one of four NDH IG resources (HealthcareService, InsurancePlan, Network, Verification) absent from the 2026-04-09 NPD bulk export. No HealthcareService-based check can be performed from NPD alone.
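Integrity and coverage are two different ratios over the same reference field: coverage divides declared references by all source rows, while the dangling rate divides unresolved references by declared ones. The report computes these in DuckDB from extracted edge tuples; the plain-Python sketch below swaps that for in-memory dictionaries and assumes relative references of the form `Organization/<id>`. The function and argument names are illustrative.

```python
def coverage_and_dangling(sources: list[dict], target_ids: set[str],
                          ref_field: str) -> tuple[float, float]:
    """Return (coverage %, dangling % among declared) for one reference field."""
    total = declared = dangling = 0
    for resource in sources:
        total += 1
        reference = (resource.get(ref_field) or {}).get("reference")
        if not reference:
            continue  # field absent: counts against coverage only
        declared += 1
        # Assumption: relative reference, so the id is the last path segment.
        target_id = reference.rsplit("/", 1)[-1]
        if target_id not in target_ids:
            dangling += 1
    coverage = 100 * declared / total if total else 0.0
    dangling_rate = 100 * dangling / declared if declared else 0.0
    return coverage, dangling_rate


if __name__ == "__main__":
    endpoints = [
        {"id": "e1", "managingOrganization": {"reference": "Organization/a"}},
        {"id": "e2", "managingOrganization": {"reference": "Organization/zzz"}},
        {"id": "e3"},
        {"id": "e4"},
    ]
    print(coverage_and_dangling(endpoints, {"a", "b"}, "managingOrganization"))
```

Keeping the two denominators separate is what lets a field be both "clean" (no dangling references) and "sparse" (rarely populated) at the same time.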

H14 · H15

Duplicate detection

Practitioner dedup is clean: 0 excess rows across 7,441,212 NPIs (H14). But Organizations multiply: 70.5% of the 1,999,118 unique Org NPIs map to more than one Organization resource (1,415,777 excess rows; max 5 resources for a single NPI). By normalized (name, state, city), 70.3% of keys repeat. Downstream consumers assuming one Organization resource = one real-world entity will be wrong for roughly seven in ten Org NPIs.

1.2M / 9.2M = 13.35%

H14 Practitioner by NPI: 0.000%
H15 Org by name+state+city: 70.3%
H15b Org by NPI: 70.5%

unit: percent

Null hypothesis

Duplicate rate is below 1% for both Practitioner (by NPI) and Organization (by normalized name + address).

Denominator

All `Practitioner` and `Organization` resources.

Data source

CMS NPD bulk export.

Notes

The BigQuery dataset has primary-key dedup applied at ingest (-4.6M Practitioner, -383K Organization at _id), so the figures here are residual entity-level duplicates.

H14 key = _npi on practitioner. Max copies observed for a single Practitioner NPI: 1.

H15 key = (LOWER(name) stripped of LLC/INC/PC/PA/PLLC/CORP/LLP/LTD/CO/COMPANY/THE and non-alphanumerics, UPPER(state), UPPER(TRIM(city))); orgs with a missing name, state, or city are excluded. Max copies for one key: 2,206.

H15b keys by _npi; max copies for one Org NPI: 5.

Caveat: some portion of the Organization multiplicity may reflect CMS modeling one FHIR Organization resource per service location rather than true duplication. Either interpretation breaks the common downstream assumption that COUNT(Organization) equals the number of unique organizations. Fuzzy matching (Jaro-Winkler, suite-unit tolerance) is a v2 enhancement.
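The H15 key and the duplicate statistics derived from it can be sketched as follows. The suffix list and the upper-casing of state and city follow the description in these notes; keeping spaces as token separators while dropping other non-alphanumerics, and the function and field names, are assumptions rather than the actual BigQuery SQL.

```python
import re
from collections import Counter

SUFFIXES = {"llc", "inc", "pc", "pa", "pllc", "corp", "llp", "ltd",
            "co", "company", "the"}


def h15_key(name, state, city):
    """Normalized (name, state, city) key; None if any part is missing."""
    if not (name and state and city):
        return None
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    tokens = [tok for tok in cleaned.split() if tok not in SUFFIXES]
    return (" ".join(tokens), state.upper(), city.strip().upper())


def duplicate_stats(keys) -> dict:
    """Share of keys seen more than once, excess rows, and the largest group."""
    counts = Counter(key for key in keys if key is not None)
    repeated = sum(1 for count in counts.values() if count > 1)
    return {
        "unique_keys": len(counts),
        "repeat_share_pct": 100 * repeated / len(counts) if counts else 0.0,
        "excess_rows": sum(count - 1 for count in counts.values()),
        "max_copies": max(counts.values(), default=0),
    }


if __name__ == "__main__":
    orgs = [
        ("Acme Health, LLC", "ny", "Albany"),
        ("ACME HEALTH", "NY", " albany "),
        ("Beta Clinic Inc.", "CA", "Fresno"),
        ("Gamma Group", None, "Reno"),  # excluded: missing state
    ]
    print(duplicate_stats(h15_key(*org) for org in orgs))
```

"Excess rows" here means rows beyond the first for each key, which is the quantity a consumer would have to collapse to reach one row per entity.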

H22

Network adequacy gauge

Empirical FHIR endpoint liveness vs the 85% Medicare Advantage network-adequacy implied ceiling: L7 unauthenticated-read 90.3% (ABOVE), L5 CapabilityStatement conformance 85.4% (AT), L6 SMART well-known 81.6% (BELOW). Gauge sampled across 2,974 distinct FHIR-REST hosts in the NDH.

2.7K / 3.0K = 90.35%

Regulatory ceiling (implied): 85.0%
L7 unauth Practitioner read: 90.3%
L5 CS conformance: 85.4%
L6 SMART well-known: 81.6%

unit: percent

Null hypothesis

Measured endpoint liveness matches or exceeds the 85% regulatory ceiling.

Denominator

All FHIR-REST endpoints declared in the NPD bulk export at the pinned release.

Data source

`ainpi-probe` crawler results joined to the `Endpoint` resource table.

Notes

The 85% network-adequacy ceiling is the implied minimum active provider share under Medicare Advantage adequacy rules (42 CFR §422.116). This comparison maps 'adequacy' onto technical reachability and conformance, NOT onto the regulatory definition itself, which concerns whether a sufficient share of the network is active, not whether its FHIR endpoints respond.

Interpret as: if consumers assume the FHIR directory surface offers a regulatory-equivalent conformance floor, that assumption holds on unauthenticated basic reachability (L7 90.3%), holds only marginally on CapabilityStatement conformance (L5 85.4%), and fails on SMART discovery (L6 81.6% vs 85%).

Probe methodology: 2,974 distinct FHIR-REST hosts, one endpoint per host, stratified by host-fingerprint, via ainpi-probe L0-L7 with a 1 rps per-host rate limit and 10s connect / 30s read timeouts.
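The ABOVE / AT / BELOW labels in this finding amount to comparing each measured rate with the 85% line. The report does not say how close a value must be to count as AT; the half-point tolerance below is an assumption chosen only because it reproduces the three published labels, and the names are illustrative.

```python
THRESHOLD_PCT = 85.0
AT_TOLERANCE_PCT = 0.5  # assumption: not stated in the report


def gauge_label(value_pct: float, threshold_pct: float = THRESHOLD_PCT,
                tolerance_pct: float = AT_TOLERANCE_PCT) -> str:
    """Label a measured rate relative to the threshold."""
    if abs(value_pct - threshold_pct) <= tolerance_pct:
        return "AT"
    return "ABOVE" if value_pct > threshold_pct else "BELOW"


# Rates as published for this finding.
readings = {
    "L7 unauth Practitioner read": 90.3,
    "L5 CS conformance": 85.4,
    "L6 SMART well-known": 81.6,
}
labels = {name: gauge_label(value) for name, value in readings.items()}

if __name__ == "__main__":
    for name, label in labels.items():
        print(f"{name}: {readings[name]}% ({label})")
```

A different tolerance would move the L5 reading from AT to ABOVE, so the AT label should be read as "within rounding distance of the line" rather than as a formal pass or fail.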

About this report

This report covers the AINPI v1.0.0 release. All methodology, pipeline code, BigQuery analyses, and the FHIR endpoint crawler are open source under Apache-2.0:

Main repo: github.com/FHIR-IQ/AINPI
Crawler: github.com/FHIR-IQ/ainpi-probe
Usage examples: github.com/FHIR-IQ/ainpi-examples
Live site: ainpi.vercel.app
Public URL contract: ainpi.vercel.app/api/v1/stats.json

To cite: see CITATION.cff. To contribute: see CONTRIBUTING.md. To subscribe: ainpi.vercel.app/subscribe.