Data Marts
As a standard for electronically exchanging healthcare information, FHIR is designed to be interoperable through its flexibility and extensibility. It is optimized for real-time transactional interaction on a single resource or bundle of resources. However, FHIR is not designed for analytics use cases. FHIRβs heavily nested and embedded structure is challenging for the majority of existing analytics infrastructure that relies on βflatβ storage.
Therefore, to support population health analytics, Zus offers data marts that extract and transform FHIR data into a relational format for SQL-based exploration and analysis.
Getting started
Accessing your data mart
If your organization already has a Snowflake account, you can provide your account identifier to Zus to set up secure data sharing. Otherwise, Zus can provision and manage read-only reader accounts for your users that enable them to access and query the data mart. Your organization will be charged for the consumption associated with the reader account.
If you would like access to your data mart, please submit a support ticket
Logging into your Snowflake account
If you are logging into a Zus-provided Snowflake account, navigate to the URL provided to you https://zus-{company_name}.snowflakecomputing.com/console/login#/
and click on "Sign in using Zus Health" to login using your Zus Health account.
Note: Access to Zus data marts is separate from access to Zus UIs/APIs. To receive access to Zus UIs/API, customer should follow the steps listed for setting up users via API or via the Zus App. Users with only data mart access are provided a user type of "Permissionless". Granting a user a role of Builder Admin or Care Team User will allow that user to access Zus UIs/APIs.
Sharing to another Snowflake account
In addition to accessing data marts via a Snowflake reader account, Zus can also share data marts to customer-owned Snowflake accounts. To set up a Snowflake share to your organization's account, please reach out to your account manger and provide the following information:
- Confirm the account is using Business Critical Edition and that your organization has signed a BAA with Snowflake
- Snowflake Region (e.g. US East)
- Account Locator (format should be XXX######, 3 letters followed by 6 numbers)
- Organization name (i.e. SELECT CURRENT_ORGANIZATION_NAME())
- Account name (i.e. SELECT CURRENT_ACCOUNT_NAME())
The screenshot below demonstrates how to find this information for your account in Snowflake.
Getting familiar with Zus data and the schema
Now that you are logged into Snowsight, click on Data in the left sidebar. You should see a database called ZUS_{COMPANY_NAME}_EXPORT in the panel directly to the right of the left sidebar. If you click on the right arrow next to the database to expand its schemas, click on the right arrow next to the {COMPANY_NAME}_RELATIONAL schema to expand the schemaβs contents, and click on the right arrow next to Views, you should see about 200 views available to you to query.
For more information about the data and schema, please see our:
Supported resource types
The following resource types are exported as data mart tables: AllergyIntolerance, Appointment, Consent, CarePlan, CareTeam, Condition, Coverage, Device, DeviceUseStatement, DiagnosticReport, DocumentReference, Encounter, FamilyMemberHistory, Goal, Immunization, Location, Medication, MedicationAdministration, MedicationDispense, MedicationRequest, MedicationStatement, Observation, Organization, Patient, Practitioner, Procedure, and RelatedPerson.
Table types
The Zus Aggregated Profile (or ZAP) is an up-to-date, comprehensive record of a given patient's health history, assembled and curated by Zus from various external data networks. The ZAP is structured to include first-party data written and managed by your Builder organization, Zus-summarized and de-duplicated (Lens) data, and third-party data from EHR networks, pharmacy-sourced medication data, admission notifications (ADT), and lab notifications via the Zus subscription.
There are three types of tables in the data mart:
Table Type | Description | Examples |
---|---|---|
Base FHIR resource | - Includes first-party and third-party data - Excludes Lens data | PATIENT PRACTITIONER CONDITION ENCOUNTER MEDICATION_STATMENT ... |
Lens | - Only include Lens data - Name prefixed with LENS_ | LENS_RXNORM_MEDICATION_STATEMENT LENS_SNOMED_CONDITION LENS_TRANSITION_OF_CARE |
Data Type | - Includes general-purpose and special-purpose data types, which are a reusable cluster of elements in a FHIR resource - Referenced by FHIR resource and Lens table types | CONTACT_POINT DOSAGE HUMAN_NAME IDENTIFIER |
Data sources
The DATA_SOURCE column is available as a column for every base FHIR resource table. When the DATA_SOURCE column is NULL, that reflects that the data is first-party written and managed by your organization. This also includes data that comes from Zusβ EHR integrations.
When the DATA_SOURCE column is not NULL, this reflects that the data is third-party and came from the Zus network. DATA_SOURCE values include:
- EHR networks: commonwell, carequality
- Pharmacy-sourced medications: surescripts
- ADT notifications: bamboohealth, collective-medical, manifest-medex, eden, florida-hie
- Lab notifications: quest
Universal Patient Identifier
Zus offers its own identity mapping service, the Universal Patient Index , to link disparate and duplicative patient records. The universal patient identifier is available as a column called UPID for every base FHIR resource and Lens table types, except for Location, Medication, Practitioner, and Organization.
Every row in the PATIENT table is a FHIR Patient resource that has a unique ID. Each Patient resource corresponds to a different copy of one human being's data that Zus has aggregated. For example, for one human, there could be one Patient resource from Surescripts, one for each responding organization that responded via the Carequality network, and one from Quest. If there are 10 Patient resources that all correspond to the same human, each Patient resource will have a unique ID but the same UPID.
Your data mart will include data for patients for whom your Builder organization has an active treatment relationship.
Exploring the data
To start exploring the data, click on + Create in the left sidebar of Snowsight and select SQL Worksheet.
Let's run a simple query to count the number of distinct patients included in the data mart and click the blue play button on the top right to execute.
select
count(distinct upid)
from patient
π₯³ Congratulations! You completed your first query on the Zus data mart!
Refreshing your data mart
You do not need to actively refresh your data mart. The latest state of FHIR resources should be reflected in your data mart:
- Within 3 hours of the change in the Zus FHIR store: Condition, Device, DocumentReference, Encounter, Location, Organization, Patient, and Practitioner
- Within 24 hours of the change in the Zus FHIR store: AllergyIntolerance, Appointment, CarePlan, CareTeam, Consent, Coverage, DeviceUseStatement, DiagnosticReport, FamilyMemberHistory, Immunization, MedicationAdministration, MedicationDispense, MedicationRequest, MedicationStatement, Observation, Procedure, and Related Person
Note: While Patient resources refresh within 3 hours, their contact information refreshes every 24 hours
Data marts versioning
Zus will update the relational schema to keep up with customer needs and data quality best practices, but will ensure that changes to an existing version of the schema are non-breaking.
If breaking changes are needed, they will be introduced in a new schema version. Each new generally available version will be supported for a minimum of 9 months, and customers will be given at least 60 days to transition to a new version once it is made generally available. Zus reserves the right to introduce a breaking change without notice if it impacts an unused feature of the data mart.
Stay tuned for updates and detailed release notes with each new version. We are committed to providing the best experience and ensuring seamless transitions.
What is a breaking change?
A breaking change in a data mart schema disrupts existing functionality or requires users to alter how they interact with the data mart. These changes can impact data querying, processing, or interpretation, necessitating adjustments to reports, dashboards, or applications relying on the schema.
Examples of breaking changes include:
- Schema modifications:
- Changing the name of an existing column or table
- Removing an existing table or column
- Data type changes: Altering the data type of an existing column (e.g., from integer to string)
- Constraint changes:
- Modifying or removing primary keys
- Modifying or removing foreign key relationships
- Adding or removing unique constraints
Note: Adding a column or table or reordering existing columns within a table does not constitute a breaking change.
Connecting to Snowflake with a keypair
Using a keypair with Snowflake allows users to pair headless tools such as dbt, Terraform, and SnowSQL with their Zus data mart in Snowflake without a password.
Keypair creation
Note: These following steps are for MacOS, but should be portable to other operating systems.
- Choose a passphrase for encrypting your private key. You will be prompted to enter and confirm the passphrase in the following step.
- Create a private key using
openssl
. The following command will create the private key at~/.ssh/snowflake_dev_encrypted.p8
. Feel free to change this filename if you have many SSH keypairs and use a different naming convention.
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM
-out ~/.ssh/snowflake_dev_encrypted.p8
- Create a public key from your private key. You will be prompted again for the passphrase (again, feel free to change the filename to suit your own conventions):
openssl rsa -in ~/.ssh/snowflake_dev_encrypted.p8 -pubout
-out ~/.ssh/snowflake_dev_encrypted.pub
- View the contents of
~/.ssh/snowflake_dev_encrypted.pub
. It should look similar to this:
\-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAp/zsVHB7GebVDueCetuD
GQS7kJODUZDEOYurq/8kjZLPwAF8SUfZobcHQr7TrmK1mTg75nuqDaRV2TdjcThf
SL4nr4duiPzDhjWfAzVBR2tehn0zFs8JXZsj6v5dv11REUCbZAgeJE93UrXLufiJ
dwsqB1YhnlRnFlkVErjv8SXCN1A+g7G1xM1+VtGUMzmhlQZkmhGjH/1y+5yeKtAg
vI7kZAZQGnEQV5y+3H8v4qyJDX7yxd8qnZJ+UpuozRZL0AXE8EYLl+xeveAYy2Vg
ciyRzDVGW4XeD4LKkCYNhVr7Jovs1vyXs+1PPXGCs6YkR2ZrVSDbnvrxauD/ZRbj
FwIDAQAB
-----END PUBLIC KEY-----
- Open a worksheet in Snowflake and run the following query to register your public key with Snowflake. The value of
RSA_PUBLIC_KEY
is the contents of~/.ssh/snowflake_dev_encrypted.pub
without the-----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY-----
markings. (Keeping or removing whitespace between lines appears to not make a difference.)
ALTER USER "<USERNAME>"
SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';
Keypair rotation
Keypairs should be rotated when they reach a certain age, or if you lose your private key.
Each Snowflake user can have up to two keypairs. To add an additional keypair, follow steps 1-5 above, but in step 5 replace RSA_PUBLIC_KEY
with RSA_PUBLIC_KEY_2
. You can then unset RSA_PUBLIC_KEY so that it can be reserved for your next key rotation.
Updated about 1 month ago