Improved

June 1, 2026: Improved CDA Parsing - Four Changes

We've improved how we parse CDAs, making the labeling of clinical content more accurate and broadening the range of resources we extract. The changes touch four areas of our CDA transformer: embedded PDF DocumentReferences, conditions and allergies, coverage, and result observations.

Richer metadata on embedded PDF DocumentReferences

CDAs sometimes carry an embedded PDF, such as a scanned note, alongside the structured document. Once extracted, these PDFs previously showed up untitled and undated; they are now easier to identify and find. When the extracted DocumentReference has a relatesTo pointer back to its parent document, the transformer now copies the description, date, and type from the parent DocumentReference.

Figure: previous embedded PDF metadata (left) and updated metadata (right).
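
To illustrate the metadata copy described above, here is a minimal Python sketch that works on FHIR DocumentReference resources represented as plain dictionaries. It is not the transformer's actual code. The function name inherit_parent_metadata, the documents_by_id lookup, and the choice to take the parent's value whenever the parent has one are all assumptions made for the example.

```python
from copy import deepcopy

# Fields the changelog says are copied from the parent DocumentReference.
INHERITED_FIELDS = ("description", "date", "type")


def inherit_parent_metadata(child, documents_by_id):
    """Return a copy of `child` with metadata taken from its parent.

    `documents_by_id` is a hypothetical lookup of DocumentReference
    resources keyed by id. The input `child` is left unmodified.
    """
    enriched = deepcopy(child)
    for relation in child.get("relatesTo", []):
        reference = relation.get("target", {}).get("reference", "")
        parent = documents_by_id.get(reference.split("/")[-1])
        if parent is None:
            continue  # pointer does not resolve to a known document
        for field in INHERITED_FIELDS:
            if field in parent:
                # Assumption: the parent's value wins when present.
                enriched[field] = deepcopy(parent[field])
        break  # use the first parent that resolves
    return enriched


parent = {
    "resourceType": "DocumentReference",
    "id": "parent-1",
    "description": "Discharge Summary",
    "date": "2026-05-20T10:00:00Z",
    "type": {"text": "Discharge summary"},
}
embedded_pdf = {
    "resourceType": "DocumentReference",
    "id": "pdf-1",
    # The relationship code is illustrative; the changelog does not state it.
    "relatesTo": [
        {"code": "appends", "target": {"reference": "DocumentReference/parent-1"}}
    ],
}

enriched = inherit_parent_metadata(embedded_pdf, {"parent-1": parent})
```

A DocumentReference with no relatesTo pointer, or with one that does not resolve, comes back unchanged in this sketch.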


More accurate handling of negated entries

CDAs can mark an entry with a negation indicator (negationInd="true"), which flips the meaning of that entry to say a finding is absent rather than present. The transformer now reads this flag more consistently and emits the appropriate negative code. An entry that documents the absence of any conditions becomes a clear "no known problems" rather than a generic condition that just reads "Problem." Negated Condition entries are emitted with SNOMED CT 160245001 ("No known problems") and negated AllergyIntolerance entries with SNOMED CT 716186003 ("No known allergy").
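
The mapping from a negated entry to its negative code could be sketched as follows, in Python with the standard library XML parser. The helper name negated_code and the lookup table are hypothetical. The sketch also assumes the caller has already established that the entry documents the absence of any condition or allergy, rather than the absence of one specific finding.

```python
import xml.etree.ElementTree as ET

SNOMED_SYSTEM = "http://snomed.info/sct"

# Codes and displays as listed in this changelog entry.
NEGATED_CODES = {
    "Condition": ("160245001", "No known problems"),
    "AllergyIntolerance": ("716186003", "No known allergy"),
}


def negated_code(observation, resource_type):
    """Return a CodeableConcept for a negated entry, or None if not negated."""
    if observation.get("negationInd", "false").lower() != "true":
        return None
    code, display = NEGATED_CODES[resource_type]
    return {"coding": [{"system": SNOMED_SYSTEM, "code": code, "display": display}]}


# A pared-down CDA observation; real entries carry more structure.
negated = ET.fromstring(
    '<observation xmlns="urn:hl7-org:v3" classCode="OBS" moodCode="EVN" '
    'negationInd="true"/>'
)
asserted = ET.fromstring(
    '<observation xmlns="urn:hl7-org:v3" classCode="OBS" moodCode="EVN"/>'
)

condition_code = negated_code(negated, "Condition")
allergy_code = negated_code(negated, "AllergyIntolerance")
not_negated = negated_code(asserted, "Condition")
```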


Coverage resources now parsed for all policy relationships

A patient is often covered by a policy held by someone else, such as a parent or spouse. The transformer previously surfaced only the policies a patient held themselves, so these dependent coverages didn't come through. It now parses a Coverage resource for any active CoverageRoleType code, so a patient's full set of policies is captured. If the patient appears under another relationship, such as famdep, the Coverage is emitted with subscriber and policyHolder populated whenever the source CDA carries them.
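
As a rough illustration, the sketch below builds a partial Coverage resource from an already-parsed policy. The input shape (role_code, subscriber, policyHolder keys) and the function name coverage_from_policy are hypothetical and stand in for whatever the transformer reads from the CDA. Checking that the role code is an active CoverageRoleType code is left out, and only the fields discussed here are emitted.

```python
def coverage_from_policy(policy, patient_reference):
    """Build a partial Coverage dict for any policy that has a role code.

    `policy` is a hypothetical, already-parsed view of a CDA policy entry.
    Validation against the CoverageRoleType code system is omitted.
    """
    if not policy.get("role_code"):
        return None  # no role code, nothing to emit in this sketch

    coverage = {
        "resourceType": "Coverage",
        "beneficiary": {"reference": patient_reference},
    }
    # Populate subscriber and policyHolder only when the source has them.
    for field in ("subscriber", "policyHolder"):
        if policy.get(field):
            coverage[field] = {"reference": policy[field]}
    return coverage


dependent_policy = {
    "role_code": "FAMDEP",
    "subscriber": "RelatedPerson/parent-1",
    "policyHolder": "RelatedPerson/parent-1",
}
self_policy = {"role_code": "SELF"}

dependent_coverage = coverage_from_policy(dependent_policy, "Patient/123")
self_coverage = coverage_from_policy(self_policy, "Patient/123")
```

In this sketch the dependent policy yields a Coverage with both references filled in, while the self-held policy yields one without them because the source did not supply any.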


Fuller note text on result observations

Lab and diagnostic reports often include interpretation or comment text that sits in a separate part of the document from the main result. For result observations, the transformer now captures that text in addition to the primary observation node, so content such as the comments on a genetic testing report comes through. It walks deeper into comment child entries and appends their text to Observation.note, grouping entries in document order, each on a new line. Other observation types are unchanged.
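
The walk over comment entries might look roughly like the Python sketch below. It assumes comment entries are act elements carrying the C-CDA Comment Activity templateId, and that the collected lines are joined into a single note; how the transformer actually identifies comments and shapes Observation.note is not stated here. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}
# Assumption: comments are identified by the C-CDA Comment Activity template.
COMMENT_TEMPLATE = "2.16.840.1.113883.10.20.22.4.64"


def collect_comment_text(observation):
    """Return comment text from all nested comment acts, in document order."""
    lines = []
    for act in observation.iterfind(".//hl7:act", NS):
        roots = {t.get("root") for t in act.findall("hl7:templateId", NS)}
        if COMMENT_TEMPLATE not in roots:
            continue
        text = act.find("hl7:text", NS)
        if text is None:
            continue
        # Collapse internal whitespace so each comment is a single line.
        content = " ".join("".join(text.itertext()).split())
        if content:
            lines.append(content)
    return lines


def build_note(observation):
    """Join comment lines into one Annotation, each comment on a new line."""
    lines = collect_comment_text(observation)
    return [{"text": "\n".join(lines)}] if lines else []


result = ET.fromstring(
    """<observation xmlns="urn:hl7-org:v3">
  <entryRelationship typeCode="SUBJ">
    <act classCode="ACT" moodCode="EVN">
      <templateId root="2.16.840.1.113883.10.20.22.4.64"/>
      <text>Variant of uncertain significance.</text>
      <entryRelationship typeCode="SUBJ">
        <act classCode="ACT" moodCode="EVN">
          <templateId root="2.16.840.1.113883.10.20.22.4.64"/>
          <text>Recommend genetic counseling.</text>
        </act>
      </entryRelationship>
    </act>
  </entryRelationship>
</observation>"""
)

note = build_note(result)
```

The nested comment is picked up because the search descends through every level of the observation, not only its direct children.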