Sustained performance, availability and security for enterprise clinical data management

Parse, persist and query CDA documents

We presented our latest addition to the MGRID platform at the November 2011 HL7 RIMBAA international meeting. Our talk on persisting HL7 CDA documents was preceded by interesting talks about a RIM-based application in Finland and a drug information system in Canada.

These two talks clearly demonstrated a number of common themes for RIM-based applications:

  • a database based on the RIM is future-proof: it can deal with new types of information without changes to the database structure
  • mapping the rich HL7 datatypes to the local implementation is troublesome
  • a database based on the HL7 RIM can get very large and will have a number of hot tables

In our presentation we gave a short recap of our previous RIMBAA presentations. The MGRID take on the HL7 datatypes is to make them available as native types in the database. This approach has the advantage that you don't have to do the cumbersome mapping to a set of basic datatypes (int, string, etc.). An additional bonus is that you can create indexes on values of these types and speed up your application.
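To illustrate why a native datatype helps, here is a minimal Python sketch of a physical quantity type that compares by meaning rather than by raw value and unit. It is not MGRID code: the class name `PQ`, the conversion table `_TO_GRAMS` and the helpers `build_index` and `at_least` are hypothetical names chosen for this example, and a sorted list stands in for a database index.

```python
import bisect
from decimal import Decimal
from functools import total_ordering

# Hypothetical conversion table (illustration only): mass units to grams.
_TO_GRAMS = {"mg": Decimal("0.001"), "g": Decimal("1"), "kg": Decimal("1000")}


@total_ordering
class PQ:
    """Toy physical quantity: a value plus a unit, ordered by canonical value."""

    def __init__(self, value, unit):
        if unit not in _TO_GRAMS:
            raise ValueError(f"unsupported unit: {unit}")
        self.value = Decimal(value)
        self.unit = unit

    def canonical(self):
        # Normalise to grams so that 1 kg and 1000 g compare as equal.
        return self.value * _TO_GRAMS[self.unit]

    def __eq__(self, other):
        if not isinstance(other, PQ):
            return NotImplemented
        return self.canonical() == other.canonical()

    def __lt__(self, other):
        if not isinstance(other, PQ):
            return NotImplemented
        return self.canonical() < other.canonical()

    def __hash__(self):
        return hash(self.canonical())

    def __repr__(self):
        return f"PQ({self.value} {self.unit})"


def build_index(values):
    # A sorted list plays the role of an ordered index on a PQ column.
    return sorted(values)


def at_least(index, threshold):
    # Range lookup via binary search instead of scanning every value.
    return index[bisect.bisect_left(index, threshold):]


weights = [PQ("0.5", "kg"), PQ("750", "mg"), PQ("80", "g"), PQ("2", "kg")]
index = build_index(weights)
heavy = at_least(index, PQ("100", "g"))
```

Because the ordering lives in the type itself, a query such as "all weights of at least 100 g" needs no per-query unit mapping, which is the kind of benefit an index on a native type can offer.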

Next on the list of RIM persistence issues is performance. In a growing dataset, linking tables such as ActRelationship tend to get very large and hot (that is, frequently hit by queries). In the MGRID platform we deal with these problems in two ways:

  • First, we perform automatic context conduction. This means that the context of a document, for example a participation between a document and a patient, is pushed down to all underlying sections and observations in that document. The advantage is that the observations in the document can be linked to the patient without traversing a chain of ActRelationship rows. In short: the number of joins decreases and there is less need to hit the ActRelationship table.
  • Second, we shard large tables in a smart manner over a number of small databases. This results in joins between small(er) tables instead of huge ones.
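The first point, context conduction, can be sketched in a few lines of Python. This is a simplified illustration and not the MGRID implementation: the `Act` class, its `patient_id` field (standing in for a patient participation) and the functions `conduct_context` and `flatten` are assumptions made for the example, and the sketch ignores the RIM attributes that control conduction (contextControlCode, contextConductionInd).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Act:
    """Toy stand-in for a RIM Act; children model ActRelationship links."""
    class_code: str
    patient_id: Optional[str] = None  # stands in for a patient participation
    children: List["Act"] = field(default_factory=list)


def conduct_context(act: Act, inherited_patient: Optional[str] = None) -> None:
    """Push the patient context down to every nested act that lacks its own."""
    if act.patient_id is None:
        act.patient_id = inherited_patient
    for child in act.children:
        conduct_context(child, act.patient_id)


def flatten(act: Act) -> Iterator[Act]:
    """Yield the act and all of its descendants, depth first."""
    yield act
    for child in act.children:
        yield from flatten(child)


document = Act(
    "DOCCLIN",
    patient_id="patient-1",
    children=[Act("DOCSECT", children=[Act("OBS"), Act("OBS")])],
)
conduct_context(document)

# After conduction, observations can be selected by patient directly,
# without walking document -> section -> observation links.
observations = [
    a for a in flatten(document)
    if a.class_code == "OBS" and a.patient_id == "patient-1"
]
```

The walk over the tree happens once, when the document is stored. Afterwards a query for a patient's observations is a simple filter on the observation rows.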

Combining both approaches means that it is possible to reach good performance figures and to keep performance satisfactory while the dataset grows.
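The sharding idea can be illustrated in the same spirit. The text above does not say which key MGRID shards on, so the sketch below assumes a hash of the patient identifier. The function `shard_for` and the class `ShardedStore` are hypothetical, and in-memory lists stand in for the small databases.

```python
import hashlib


def shard_for(patient_id: str, shard_count: int) -> int:
    """Map a patient id to a shard number with a stable hash (assumed key)."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy store: each shard holds its own small copy of every table."""

    def __init__(self, shard_count: int):
        self.shards = [
            {"observation": [], "act_relationship": []}
            for _ in range(shard_count)
        ]

    def insert(self, table: str, patient_id: str, row: str) -> None:
        shard = self.shards[shard_for(patient_id, len(self.shards))]
        shard[table].append((patient_id, row))

    def rows_for(self, table: str, patient_id: str) -> list:
        # Only one shard is consulted, so the scan (or join) stays small.
        shard = self.shards[shard_for(patient_id, len(self.shards))]
        return [row for pid, row in shard[table] if pid == patient_id]


store = ShardedStore(shard_count=4)
for i in range(20):
    store.insert("observation", f"patient-{i % 5}", f"obs-{i}")
```

Because all rows for one patient land in the same shard under this assumption, a join between a patient's observations and their relationships never has to cross shards.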

The latest addition to the MGRID platform is a message parser. HL7 is a messaging standard, so we don't want to persist just acts, roles and entities. We want to be able to persist messages and perform queries on the contents that they hold. The first parser we wrote is a CDA message parser. It can parse messages conforming to the HL7 CDA R2 standard and is for the most part generated from the HL7 standard files (MIFs). We tested the CDA parser with documents from our partner TIANI Spirit and showed that it is possible to run queries and reports on the information in these documents.
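As a rough idea of what running queries on the contents of a CDA document involves, the following Python sketch pulls observations out of a tiny hand-written CDA R2 fragment using only the standard library. It is hand-coded rather than generated from the MIFs, so it is not the MGRID parser: the sample document, the patient id and the function name `extract_observations` are made up for this example.

```python
import xml.etree.ElementTree as ET

# CDA R2 documents use the HL7 v3 XML namespace.
NS = {"hl7": "urn:hl7-org:v3"}

# Hand-written sample for illustration; not a complete, valid CDA document.
SAMPLE_CDA = """<ClinicalDocument xmlns="urn:hl7-org:v3"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <recordTarget>
    <patientRole><id root="2.999.1" extension="12345"/></patientRole>
  </recordTarget>
  <component><structuredBody><component><section>
    <entry>
      <observation classCode="OBS" moodCode="EVN">
        <code code="8867-4" codeSystem="2.16.840.1.113883.6.1"
              displayName="Heart rate"/>
        <value xsi:type="PQ" value="72" unit="/min"/>
      </observation>
    </entry>
    <entry>
      <observation classCode="OBS" moodCode="EVN">
        <code code="8310-5" codeSystem="2.16.840.1.113883.6.1"
              displayName="Body temperature"/>
        <value xsi:type="PQ" value="37.2" unit="Cel"/>
      </observation>
    </entry>
  </section></component></structuredBody></component>
</ClinicalDocument>"""


def extract_observations(cda_xml: str) -> list:
    """Return one flat row per observation, tagged with the document's patient."""
    root = ET.fromstring(cda_xml)
    patient = root.find("hl7:recordTarget/hl7:patientRole/hl7:id", NS)
    patient_id = patient.get("extension") if patient is not None else None

    rows = []
    for obs in root.iterfind(".//hl7:observation", NS):
        code = obs.find("hl7:code", NS)
        value = obs.find("hl7:value", NS)
        rows.append({
            "patient_id": patient_id,
            "code": code.get("code") if code is not None else None,
            "value": value.get("value") if value is not None else None,
            "unit": value.get("unit") if value is not None else None,
        })
    return rows


rows = extract_observations(SAMPLE_CDA)
heart_rates = [r for r in rows if r["code"] == "8867-4"]
```

Once the document is reduced to rows like these, reporting becomes an ordinary filter or aggregate over the rows.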

Since the CDA parser is mostly generated from the HL7 MIFs, building a parser for another message type is not that hard. In fact, we have already built one for care statements as well.