2. Administrators Guide

The administration section of this documentation covers topics related to installing the software module in an existing database. We assume general knowledge about PostgreSQL. No particular Unix or programming experience is required.

MGRID HDL is a set of PostgreSQL extension modules that implement the Healthcare Datatypes as a set of user defined types (UDT) and user defined functions (UDF).

MGRID HDM is a collection of Healthcare Data Models that implement the normative HL7v3 Reference Information Models and their associated vocabulary.

2.1. Installing the software

Before the module can be used in a database, the package files must be installed on the host operating system on which the database server software is running.

2.1.1. Requirements

MGRID HDL and HDM 3.0 are only supported on

  • Enterprise Linux (RedHat or CentOS) 6 and 7
  • using PostgreSQL 9.4 or PostgreSQL 9.5 from http://yum.postgresql.org
  • machines with at least 4GB RAM, if Snomed-CT is needed

2.1.2. Adding the PostgreSQL repository

To check which distribution and architecture your machine on, use the following commands:

$ lsb_release -ir
Distributor ID: CentOS
Release:        6.5
$ uname -m
x86_64

In this case the version is CentOS 6 on x86_64. Decide whether to use PostgreSQL 9.4 or 9.5. We will use PostgreSQL 9.4 for this example. Navigate to http://yum.postgresql.org and select the 9.4 release. The next page shows links to the repo packages for 9.4. Copy the link location of the repo package “CentOS 6 - x86_64” and install it with yum.

$ sudo yum install -y http://yum.postgresql.org/9.4/redhat/rhel-6-x86_64/pgdg-centos94-9.4-2.noarch.rpm
...
Dependencies Resolved

================================================================================================================================
 Package                       Arch                   Version                 Repository                                   Size
================================================================================================================================
Installing:
 pgdg-centos94                 noarch                 9.4-2                   /pgdg-centos94-9.4-2.noarch                 2.2 k

Transaction Summary
================================================================================================================================
Install       1 Package(s)

Total size: 2.2 k
Installed size: 2.2 k
...
Complete!

2.1.3. Adding the MGRID repository

  1. Install pygpgme, a package which allows yum to handle gpg signatures, and a package called yum-utils which contains the tools you need for installing source RPMs.

    $ sudo yum install -y pygpgme yum-utils
    ...
    Dependencies Resolved
    
    ================================================================================
     Package            Arch            Version                 Repository     Size
    ================================================================================
    Installing:
     yum-utils          noarch          1.1.30-30.el6           base          110 k
    
    Transaction Summary
    ================================================================================
    Install       1 Package(s)
    
    Total size: 110 k
    Installed size: 307 k
    ...
    Complete!
    
  2. Create a file named /etc/yum.repos.d/mgrid_mgrid3.repo that contains the repository configuration. This is a private repository, so you’ll need a read token to access it. Use the command below to create a read token and output a repository configuration. Replace UNIQUE_ID with any unique identifier for your system, such as the hostname. Replace the master token with the token you’ve received from support@mgrid.net. Replace os and dist with a valid Linux distribution and version, in our example case ‘el’, version 6.

    UNIQUE_ID=`hostname -f` && \
    MASTER_TOKEN=aabbccddeeff00112233445566778899aabbccddeeff0011 && \
    OS=el && \
    DIST=6 && \
    curl "https://${MASTER_TOKEN}:@packagecloud.io/install/repositories/mgrid/mgrid3/config_file.repo?os=${OS}&dist=${DIST}&name=${UNIQUE_ID}"
    

    Place the output from the above command in a file named /etc/yum.repos.d/mgrid_mgrid3.repo.

  3. Update your local yum cache by running

    $ sudo yum -q makecache -y --disablerepo='*' --enablerepo='mgrid_mgrid3'
    Importing GPG key 0xD59097AB:
     Userid: "packagecloud ops (production key) <ops@packagecloud.io>"
     From  : https://packagecloud.io/gpg.key
    
  4. Verify that packages are available. Note that on first access of the repository we must accept the GPG key.

    $ sudo yum search mgrid
    Loaded plugins: fastestmirror
    Loading mirror speeds from cached hostfile
     * base: mirror.amsiohosting.net
     * extras: mirror.1000mbps.com
     * updates: mirror.amsiohosting.net
    mgrid_mgrid3-source/signature                                                                                        |  836 B     00:00
    Retrieving key from https://packagecloud.io/gpg.key
    Importing GPG key 0xD59097AB:
     Userid: "packagecloud ops (production key) <ops@packagecloud.io>"
     From  : https://packagecloud.io/gpg.key
    Is this ok [y/N]: y
    ============================================================ N/S Matched: mgrid ============================================================
    mgridhdl3_94.x86_64 : MGRID Healthcare Datatype Library
    mgridhdl3_95.x86_64 : MGRID Healthcare Datatype Library
    mgridhdm3-hl7v3models_edition2006_94.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2006
    mgridhdm3-hl7v3models_edition2006_95.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2006
    mgridhdm3-hl7v3models_edition2008_94.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2008
    mgridhdm3-hl7v3models_edition2008_95.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2008
    ...
    

2.1.4. Installing the packages

Install MGRID Healthcare Datatype Library and PostgreSQL 9.4 with the following command. We include PostgreSQL header files, additional contrib modules and the plpython language.

$ sudo yum install -y mgridhdl3_94 postgresql94-devel postgresql94-contrib postgresql94-plpython
...
Dependencies Resolved

================================================================================
 Package                 Arch     Version                  Repository      Size
================================================================================
Installing:
 mgridhdl3_94            x86_64   3.0-449                  mgrid_mgrid3   431 k
 postgresql94-contrib    x86_64   9.4.5-1PGDG.rhel6        pgdg94         529 k
 postgresql94-devel      x86_64   9.4.5-1PGDG.rhel6        pgdg94         1.5 M
 postgresql94-plpython   x86_64   9.4.5-1PGDG.rhel6        pgdg94          67 k
Installing for dependencies:
 hwdata                  noarch   0.233-14.1.el6           base           1.3 M
 initscripts             x86_64   9.03.49-1.el6.centos.3   updates        945 k
 iproute                 x86_64   2.6.32-45.el6            base           367 k
 iptables                x86_64   1.4.7-16.el6             base           254 k
 iputils                 x86_64   20071127-20.el6          base           120 k
 libsemanage             x86_64   2.0.43-5.1.el6           base           104 k
 libxslt                 x86_64   1.1.26-2.el6_3.1         base           452 k
 mingetty                x86_64   1.08-5.el6               base            21 k
 module-init-tools       x86_64   3.9-25.el6               base           466 k
 policycoreutils         x86_64   2.0.83-24.el6            base           651 k
 postgresql94            x86_64   9.4.5-1PGDG.rhel6        pgdg94         1.0 M
 postgresql94-libs       x86_64   9.4.5-1PGDG.rhel6        pgdg94         194 k
 postgresql94-server     x86_64   9.4.5-1PGDG.rhel6        pgdg94         4.3 M
 psmisc                  x86_64   22.6-19.el6_5            base            81 k
 sysvinit-tools          x86_64   2.87-6.dsf.el6           base            60 k
 udev                    x86_64   147-2.63.el6_7.1         updates        355 k
 upstart                 x86_64   0.6.5-13.el6_5.3         base           177 k
 util-linux-ng           x86_64   2.17.2-12.18.el6         base           1.6 M

Transaction Summary
================================================================================
Install      22 Package(s)

Total download size: 15 M
Installed size: 62 M
...
Complete!

Next we need to choose RIM models and vocabulary files. The next example shows how to install Loinc 2.50, Snomed-CT 2014 and HL7v3 RIM and Vocabulary files for Normative Edition 2011 and 2013 in the PostgreSQL share directory for version 9.4. Note that this step only places extension files on disk for possible usage in databases. Section Creating a database describes how to use the vocabulary extensions.

$ sudo yum install -y mgridhdm3-loinc_250_94 \
    mgridhdm3-snomedctvocab_20140131_94 \
    mgridhdm3-hl7v3models_edition2011_94 \
    mgridhdm3-hl7v3models_edition2013_94
...
Dependencies Resolved

================================================================================================================================================================================
 Package                                                        Arch                             Version                           Repository                              Size
================================================================================================================================================================================
Installing:
 mgridhdm3-hl7v3models_edition2011_94                           noarch                           3.0-210                           mgrid_mgrid3                           418 k
 mgridhdm3-hl7v3models_edition2013_94                           noarch                           3.0-210                           mgrid_mgrid3                           425 k
 mgridhdm3-loinc_250_94                                         noarch                           3.0-210                           mgrid_mgrid3                           1.5 M
 mgridhdm3-snomedctvocab_20140131_94                            noarch                           3.0-210                           mgrid_mgrid3                            26 M

Transaction Summary
================================================================================================================================================================================
Install       4 Package(s)

Total download size: 28 M
Installed size: 95 M
...
Complete!

Once this is complete, we can start creating PostgreSQL databases with MGRID HDL and HDM extensions.

2.1.5. Dockerfile install repositories and packages

The following Dockerfile provides a summary of the previous sections.

FROM centos:6

RUN yum install -y pygpgme yum-utils

RUN yum install -y http://yum.postgresql.org/9.4/redhat/rhel-6-x86_64/pgdg-centos94-9.4-2.noarch.rpm

RUN UNIQUE_ID=`hostname -f` && \
    MASTER_TOKEN=aabbccddeeff00112233445566778899aabbccddeeff0011 && OS=el && \
    DIST=6 && curl \
    "https://${MASTER_TOKEN}:@packagecloud.io/install/repositories/mgrid/mgrid3/config_file.repo?os=${OS}&dist=${DIST}&name=${UNIQUE_ID}" \
    > /etc/yum.repos.d/mgrid_mgrid3.repo > /etc/yum.repos.d/mgrid_mgrid3.repo

RUN yum -q makecache -y --disablerepo='*' --enablerepo='mgrid_mgrid3'

RUN yum install -y mgridhdl3_94 postgresql94-devel postgresql94-contrib postgresql94-plpython

RUN yum install -y mgridhdm3-loinc_250_94 \
      mgridhdm3-snomedctvocab_20140131_94 \
      mgridhdm3-hl7v3models_edition2011_94 \
      mgridhdm3-hl7v3models_edition2013_94

RUN ln -s /usr/pgsql-9.4/lib /usr/lib/pgsql && \
    ln -s /usr/pgsql-9.4/include /usr/include/pgsql && \
    service postgresql-9.4 initdb && \
    sed -i '/^local/ s/peer/trust/; /^host/ s/ident/trust/' /var/lib/pgsql/9.4/data/pg_hba.conf && \
    sed -i 's/^#\(listen_addresses\)/\1/' /var/lib/pgsql/9.4/data/postgresql.conf && \
    service postgresql-9.4 start

Note that the last RUN command initializes and configures the PostgreSQL 9.4 cluster. The configuration performed in this step is for demonstration purposes only, and it is advised to consult a PostgreSQL administration expert to configure your PostgreSQL server.

2.2. Creating a database

Creating a database with the MGRID Healthcare Datatype Library and MGRID Healthcare Data Model on PostgreSQL servers 9.4 or higher can be done by creating extensions. The extensions check their requirements, which prevents common mistakes such as loading a RIM without the necessary vocabulary. With extensions it is also easy to keep track which modules are loaded, as is shown in How do I view which healthcare modules and vocabularies are loaded?.

2.2.1. Create a database on PostgreSQL (≥ 9.4)

We create a database user hcuser and database lake owned by hcuser. Initially, hcuser must be superuser, to create the extensions that contain functions implemented in the C language. The superuser right will be revoked after the extensions are loaded.

$ createuser -U postgres --superuser --createdb --login hcuser
$ createdb -U hcuser lake
$ psql -U hcuser lake
psql (9.4.5)
Type "help" for help.

lake=#

2.2.2. Loading MGRID extensions in a database

The next commands load the extensions required for the RIM2011 database.

lake=#    ALTER DATABASE lake SET search_path=public,hdl,hl7,r1,"$user";
ALTER DATABASE
lake=#     CREATE EXTENSION hl7basetable;
CREATE EXTENSION
lake=#     CREATE EXTENSION ucum;
CREATE EXTENSION
lake=#     CREATE EXTENSION hl7;
CREATE EXTENSION
lake=#     CREATE EXTENSION hl7v3vocab_edition2011;
CREATE EXTENSION
lake=#     CREATE EXTENSION hl7v3datatypes;
CREATE EXTENSION
lake=#     CREATE EXTENSION hl7v3rim_edition2011;
WARNING:  ========================================================
WARNING:  RIM tables have been detached from hl7v3rim_edition2011.
WARNING:  ========================================================
INFO:  Dropping extension hl7v3rim_edition2011 will not drop the RIM tables.
INFO:  RIM tables can now be exported by pg_dump.
CREATE EXTENSION
lake=#     CREATE EXTENSION hl7v3crud_edition2011;
CREATE EXTENSION
lake=#     ALTER ROLE hcuser NOSUPERUSER;
ALTER ROLE
  • hl7basetable - this extension creates catalog support tables used by the hl7 and ucum extensions, and must be loaded before all the other extensions.
  • ucum - this extension creates the tables and conversion support functions for UCUM, the Unified Codes for Units of Measure. This extension is a prerequisite for the hl7 extensions PQ datatype.
  • hl7 - this extension contains the base types, functions and operators.
  • hl7v3vocab_edition2011 - vocabulary extensions populate the catalog support relations for the CV datatype with the HL7v3 codesystems. MGRID includes vocabulary extensions for all HL7v3 Normative Editions. It must be loaded before the datatype extensions as well as before any database that creates table columns with a conceptdomain, e.g. a RIM database with matching Normative Edition.
  • hl7v3datatypes - this extension loads the HDL datatypes.

At this point no tables are created yet; the database has datatypes and vocabulary support only. This can be used as starting point for a custom database that uses the datatypes, such as a clinical research datawarehouse schema. Another possibility is to proceed creating a full RIM database, by loading the following extensions:

  • hl7v3rim_edition2011 - this extension creates tables that match the RIM of Normative Edition 2011. The rim script version should match the vocabulary version, loading e.g a RIM from 2011 on a vocabulary script from 2005 will give errors due to unknown conceptdomains. The search_path should include r1 or r2 to select the complex types version used in the selected RIM version.
  • hl7v3crud_edition2011 - this extension creates basic insert functions for the RIM classes, and is used by the parsers generated by the MGRID Messaging Toolkit and other additional software available from MGRID.

Note

Use the supplied RIM or modify a standard RIM with sed

The hl7v3rim_edition<year> extension will create a number of tables, such as Act and Observation, that implement the a minimal RIM. If you need additional columns in these tables, do not create the tables by loading the extension, but rather create a sed script that reads the extension sql file, adds additional columns at the appropriate places, and feed that to psql. This method is used in the AXLE Healthcare Benchmark tutorial, where the RIM is augmented with timestamps and other attributes for entity resolution.

Use can use \dx to inspect which modules are loaded.

lake=# \dx
                                        List of installed extensions
          Name          | Version |   Schema   |                        Description
------------------------+---------+------------+------------------------------------------------------------
 adminpack              | 1.0     | pg_catalog | administrative functions for PostgreSQL
 hl7                    | 3.0     | hl7        | healthcare datatypes core
 hl7basetable           | 3.0     | hdl        | healthcare datatypes base tables
 hl7v3crud_edition2011  | 2.0     | public     | hl7v3 edition2011 CRUD functions
 hl7v3datatypes         | 3.0     | hl7        | healthcare datatypes
 hl7v3rim_edition2011   | 2.0     | public     | hl7v3 edition2011 RIM schema
 hl7v3vocab_edition2011 | 2.0     | hdl        | hl7v3 edition2011 vocabulary
 jsquery                | 1.0     | public     | data type for jsonb inspection
 loinc_2_50             | 1.0     | public     | loinc 2.50 vocabulary
 plpgsql                | 1.0     | pg_catalog | PL/pgSQL procedural language
 snomedctvocab_20140131 | 1.0     | public     | snomed-ct 20140131 vocabulary
 tablefunc              | 1.0     | public     | functions that manipulate whole tables, including crosstab
 ucum                   | 3.0     | hdl        | Unified code for units of measure v1.8.2
(13 rows)

2.2.3. Loading additional useful extensions

The following extensions are useful for handling JSON types, making pivots and accessing the database using pgadmin3, respectively.

lake=# create extension jsquery;
CREATE EXTENSION
lake=# create extension tablefunc;
CREATE EXTENSION
lake=# create extension adminpack;
CREATE EXTENSION

2.2.4. Dockerfile create RIM database

The following Docker RUN command contains all the steps performed above. It can be appended to the Dockerfile described in section Dockerfile install repositories and packages. In addition it also loads the Snomed-CT and Loinc vocabularies that were installed in section Installing the packages, but not yet used.

RUN service postgresql-9.4 start && \
    createuser -U postgres --superuser --login --createdb hcuser && \
    createdb -U hcuser lake && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION hl7basetable' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION ucum' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION hl7' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3vocab_edition2011' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3datatypes' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3rim_edition2011' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3crud_edition2011' && \
    psql -U hcuser -d lake -c 'ALTER DATABASE LAKE SET search_path=public,hdl,hl7,r1,"$user"' && \
# more vocabulary
# note: loading the Snomed-CT extension required >= 4GB ram!
    psql -U hcuser -d lake -c 'CREATE EXTENSION snomedctvocab_20140131' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION loinc_2_50' && \
# additional useful modules
    psql -U hcuser -d lake -c 'CREATE EXTENSION jsquery' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION tablefunc' && \
    psql -U hcuser -d lake -c 'CREATE EXTENSION adminpack'

2.3. Backup and Restore

The official PostgreSQL backup and restore documentation describes three ways to backup databases.

  1. SQL Dump. MGRID HDM databases require special handling for performing SQL dumps and restore. This is described below.
  2. File System Level Backup. MGRID HDM requires no special handling for file system level backups. Please refer to PostgreSQL file system level backup for more information.
  3. Continuous Archiving and Point-in-Time Recovery. MGRID HDM requires no special handling for this kind of backup. Please refer to PostgreSQL Continuous Archiving for more information.

2.3.1. SQL dump and restore

It is not possible to perform a regular pg_dump and pg_restore of a MGRID HDM database, due to the way the extensions are organized. Consider the following scenario:

$ pg_dump -U hcuser -d lake > lake.dmp
$ createdb -U hcuser test
$ psql -U hcuser aap < lake.dmp
SET
SET
SET
SET
SET
SET
SET
CREATE SCHEMA
ALTER SCHEMA
CREATE SCHEMA
ALTER SCHEMA
CREATE EXTENSION
COMMENT
CREATE EXTENSION
COMMENT
CREATE EXTENSION
COMMENT
ERROR:  type NullFlavor does not exist
ERROR:  extension "hl7v3datatypes" does not exist

...

The reason for this error is that the dump contains the CREATE EXTENSION commands in this wrong order:

CREATE EXTENSION IF NOT EXISTS hl7basetable WITH SCHEMA hdl;
CREATE EXTENSION IF NOT EXISTS ucum WITH SCHEMA hdl;
CREATE EXTENSION IF NOT EXISTS hl7 WITH SCHEMA hl7;
CREATE EXTENSION IF NOT EXISTS hl7v3datatypes WITH SCHEMA hl7;
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
CREATE EXTENSION IF NOT EXISTS adminpack WITH SCHEMA pg_catalog;
CREATE EXTENSION IF NOT EXISTS hl7v3vocab_edition2011 WITH SCHEMA hdl;
CREATE EXTENSION IF NOT EXISTS hl7v3rim_edition2011 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS hl7v3crud_edition2011 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS jsquery WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS loinc_2_50 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS snomedctvocab_20140131 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS tablefunc WITH SCHEMA public;

The order is wrong since hl7v3datatypes requires vocabulary that is loaded by any hl7v3vocab_edition<year> extension. The best solution to state such a dependency is to add hl7v3vocab_edition2011 to the hl7v3datatypes control file, but this cannot be done since there are a number of alternatives for the vocabulary, and we do not want to fix the hl7v3datatypes extension to a specific normative edition.

Therefore, MGRID HDM databases can only be dumped and restored by performing data-only restores into databases that are created according to the original database definitions.

2.3.1.1. Data-only SQL dump

The example below shows how to perform a data only dump of the database lake to the file lake.dmp.

pg_dump --data-only --no-privileges --no-owner \
    -U hcuser lake \
     | sed -e '/SET search_path/ s/;/, hl7, hdl, r1;/' > lake.dmp

Note

Update the search path to the correct R1 or R2 schema.

The example above fixes the search_path during restore for databases with R1 datatypes. Change this into r2 to restore databases that use the R2 datatypes.

2.3.1.2. Restore of data-only SQL dump

Restore of a HDM database consists of two steps:

  1. Create a database that matches the extensions used for the original database. See section Creating a database for more information.

  2. Use the following command to restore the data-only dump into the new database:

    cat lake.dmp | psql -U hcuser --single-transaction -v ON_ERROR_STOP=true lake
    

2.4. Upgrade

MGRID provides upgrades through RPM package updates. We make the distinction between minor updates and major updates. A minor update is an update where the first two numbers in the version number stay the same. All other updates are major updates.

2.4.1. Minor HDL updates

A minor update of the MGRID HDL package will place a new version of the hl7.so shared library in the PostgreSQL shared library directory. The PostgreSQL server does not need to be shutdown before this update is performed, nor is it strictly required to restart the PostgreSQL server after a minor update. Running PostgreSQL backends (sessions) will still use the old version, and new sessions with the database will use the new, updated version.

Minor updates will contain bug fixes, but do not alter API’s or usage of existing functions, and can thus be applied to systems in production.

$ sudo yum update mgridhdl3_95
...
Dependencies Resolved

========================================================================================================
 Package                    Arch                 Version               Repository                  Size
========================================================================================================
Updating:
 mgridhdl3_95               x86_64               3.0.4-1               mgrid_mgrid3               432 k

Transaction Summary
========================================================================================================
Upgrade  1 Package

Total download size: 432 k
Is this ok [y/d/N]: y
...
Updated:
  mgridhdl3_95.x86_64 0:3.0.4-1

Complete!

2.4.2. Major HDL updates

A major update of the MGRID HDL package will require updating the mgridhdl3 package first, as described in the previous section, so the new binaries are installed on the server. Then in each database that has uses the hl7 extension, perform the update with the following command:

ALTER EXTENSION hl7 UPDATE TO 3.1;
ALTER EXTENSION ucum UPDATE TO 3.1;
-- etc