2. Administrators Guide¶
The administration section of this documentation covers topics related to installing the software module in an existing database. We assume general knowledge about PostgreSQL. No particular Unix or programming experience is required.
MGRID HDL is a set of PostgreSQL extension modules that implement the Healthcare Datatypes as a set of user defined types (UDT) and user defined functions (UDF).
MGRID HDM is a collection of Healthcare Data Models that implement the normative HL7v3 Reference Information Models and their associated vocabulary.
2.1. Installing the software¶
Before the module can be used in a database, the package files must be installed on the host operating system on which the database server software is running.
2.1.1. Requirements¶
MGRID HDL and HDM 3.0 are only supported on
- Enterprise Linux (RedHat or CentOS) 6 and 7
- using PostgreSQL 9.4 or PostgreSQL 9.5 from http://yum.postgresql.org
- machines with at least 4GB RAM, if Snomed-CT is needed
2.1.2. Adding the PostgreSQL repository¶
To check which distribution and architecture your machine on, use the following commands:
$ lsb_release -ir
Distributor ID: CentOS
Release: 6.5
$ uname -m
x86_64
In this case the version is CentOS 6 on x86_64. Decide whether to use PostgreSQL 9.4 or 9.5. We will use PostgreSQL 9.4 for this example. Navigate to http://yum.postgresql.org and select the 9.4 release. The next page shows links to the repo packages for 9.4. Copy the link location of the repo package “CentOS 6 - x86_64” and install it with yum.
$ sudo yum install -y http://yum.postgresql.org/9.4/redhat/rhel-6-x86_64/pgdg-centos94-9.4-2.noarch.rpm
...
Dependencies Resolved
================================================================================================================================
Package Arch Version Repository Size
================================================================================================================================
Installing:
pgdg-centos94 noarch 9.4-2 /pgdg-centos94-9.4-2.noarch 2.2 k
Transaction Summary
================================================================================================================================
Install 1 Package(s)
Total size: 2.2 k
Installed size: 2.2 k
...
Complete!
2.1.3. Adding the MGRID repository¶
Install
pygpgme
, a package which allows yum to handle gpg signatures, and a package calledyum-utils
which contains the tools you need for installing source RPMs.$ sudo yum install -y pygpgme yum-utils ... Dependencies Resolved ================================================================================ Package Arch Version Repository Size ================================================================================ Installing: yum-utils noarch 1.1.30-30.el6 base 110 k Transaction Summary ================================================================================ Install 1 Package(s) Total size: 110 k Installed size: 307 k ... Complete!
Create a file named
/etc/yum.repos.d/mgrid_mgrid3.repo
that contains the repository configuration. This is a private repository, so you’ll need a read token to access it. Use the command below to create a read token and output a repository configuration. ReplaceUNIQUE_ID
with any unique identifier for your system, such as the hostname. Replace the master token with the token you’ve received from support@mgrid.net. Replaceos
anddist
with a valid Linux distribution and version, in our example case ‘el’, version 6.UNIQUE_ID=`hostname -f` && \ MASTER_TOKEN=aabbccddeeff00112233445566778899aabbccddeeff0011 && \ OS=el && \ DIST=6 && \ curl "https://${MASTER_TOKEN}:@packagecloud.io/install/repositories/mgrid/mgrid3/config_file.repo?os=${OS}&dist=${DIST}&name=${UNIQUE_ID}"
Place the output from the above command in a file named
/etc/yum.repos.d/mgrid_mgrid3.repo
.Update your local yum cache by running
$ sudo yum -q makecache -y --disablerepo='*' --enablerepo='mgrid_mgrid3' Importing GPG key 0xD59097AB: Userid: "packagecloud ops (production key) <ops@packagecloud.io>" From : https://packagecloud.io/gpg.key
Verify that packages are available. Note that on first access of the repository we must accept the GPG key.
$ sudo yum search mgrid Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: mirror.amsiohosting.net * extras: mirror.1000mbps.com * updates: mirror.amsiohosting.net mgrid_mgrid3-source/signature | 836 B 00:00 Retrieving key from https://packagecloud.io/gpg.key Importing GPG key 0xD59097AB: Userid: "packagecloud ops (production key) <ops@packagecloud.io>" From : https://packagecloud.io/gpg.key Is this ok [y/N]: y ============================================================ N/S Matched: mgrid ============================================================ mgridhdl3_94.x86_64 : MGRID Healthcare Datatype Library mgridhdl3_95.x86_64 : MGRID Healthcare Datatype Library mgridhdm3-hl7v3models_edition2006_94.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2006 mgridhdm3-hl7v3models_edition2006_95.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2006 mgridhdm3-hl7v3models_edition2008_94.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2008 mgridhdm3-hl7v3models_edition2008_95.noarch : MGRID HDM HL7v3 RIM, Vocabulary and CRUD functions. Normative Edition 2008 ...
2.1.4. Installing the packages¶
Install MGRID Healthcare Datatype Library and PostgreSQL 9.4 with the following command. We include PostgreSQL header files, additional contrib modules and the plpython language.
$ sudo yum install -y mgridhdl3_94 postgresql94-devel postgresql94-contrib postgresql94-plpython
...
Dependencies Resolved
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
mgridhdl3_94 x86_64 3.0-449 mgrid_mgrid3 431 k
postgresql94-contrib x86_64 9.4.5-1PGDG.rhel6 pgdg94 529 k
postgresql94-devel x86_64 9.4.5-1PGDG.rhel6 pgdg94 1.5 M
postgresql94-plpython x86_64 9.4.5-1PGDG.rhel6 pgdg94 67 k
Installing for dependencies:
hwdata noarch 0.233-14.1.el6 base 1.3 M
initscripts x86_64 9.03.49-1.el6.centos.3 updates 945 k
iproute x86_64 2.6.32-45.el6 base 367 k
iptables x86_64 1.4.7-16.el6 base 254 k
iputils x86_64 20071127-20.el6 base 120 k
libsemanage x86_64 2.0.43-5.1.el6 base 104 k
libxslt x86_64 1.1.26-2.el6_3.1 base 452 k
mingetty x86_64 1.08-5.el6 base 21 k
module-init-tools x86_64 3.9-25.el6 base 466 k
policycoreutils x86_64 2.0.83-24.el6 base 651 k
postgresql94 x86_64 9.4.5-1PGDG.rhel6 pgdg94 1.0 M
postgresql94-libs x86_64 9.4.5-1PGDG.rhel6 pgdg94 194 k
postgresql94-server x86_64 9.4.5-1PGDG.rhel6 pgdg94 4.3 M
psmisc x86_64 22.6-19.el6_5 base 81 k
sysvinit-tools x86_64 2.87-6.dsf.el6 base 60 k
udev x86_64 147-2.63.el6_7.1 updates 355 k
upstart x86_64 0.6.5-13.el6_5.3 base 177 k
util-linux-ng x86_64 2.17.2-12.18.el6 base 1.6 M
Transaction Summary
================================================================================
Install 22 Package(s)
Total download size: 15 M
Installed size: 62 M
...
Complete!
Next we need to choose RIM models and vocabulary files. The next example shows how to install Loinc 2.50, Snomed-CT 2014 and HL7v3 RIM and Vocabulary files for Normative Edition 2011 and 2013 in the PostgreSQL share directory for version 9.4. Note that this step only places extension files on disk for possible usage in databases. Section Creating a database describes how to use the vocabulary extensions.
$ sudo yum install -y mgridhdm3-loinc_250_94 \
mgridhdm3-snomedctvocab_20140131_94 \
mgridhdm3-hl7v3models_edition2011_94 \
mgridhdm3-hl7v3models_edition2013_94
...
Dependencies Resolved
================================================================================================================================================================================
Package Arch Version Repository Size
================================================================================================================================================================================
Installing:
mgridhdm3-hl7v3models_edition2011_94 noarch 3.0-210 mgrid_mgrid3 418 k
mgridhdm3-hl7v3models_edition2013_94 noarch 3.0-210 mgrid_mgrid3 425 k
mgridhdm3-loinc_250_94 noarch 3.0-210 mgrid_mgrid3 1.5 M
mgridhdm3-snomedctvocab_20140131_94 noarch 3.0-210 mgrid_mgrid3 26 M
Transaction Summary
================================================================================================================================================================================
Install 4 Package(s)
Total download size: 28 M
Installed size: 95 M
...
Complete!
Once this is complete, we can start creating PostgreSQL databases with MGRID HDL and HDM extensions.
2.1.5. Dockerfile install repositories and packages¶
The following Dockerfile provides a summary of the previous sections.
FROM centos:6
RUN yum install -y pygpgme yum-utils
RUN yum install -y http://yum.postgresql.org/9.4/redhat/rhel-6-x86_64/pgdg-centos94-9.4-2.noarch.rpm
RUN UNIQUE_ID=`hostname -f` && \
MASTER_TOKEN=aabbccddeeff00112233445566778899aabbccddeeff0011 && OS=el && \
DIST=6 && curl \
"https://${MASTER_TOKEN}:@packagecloud.io/install/repositories/mgrid/mgrid3/config_file.repo?os=${OS}&dist=${DIST}&name=${UNIQUE_ID}" \
> /etc/yum.repos.d/mgrid_mgrid3.repo > /etc/yum.repos.d/mgrid_mgrid3.repo
RUN yum -q makecache -y --disablerepo='*' --enablerepo='mgrid_mgrid3'
RUN yum install -y mgridhdl3_94 postgresql94-devel postgresql94-contrib postgresql94-plpython
RUN yum install -y mgridhdm3-loinc_250_94 \
mgridhdm3-snomedctvocab_20140131_94 \
mgridhdm3-hl7v3models_edition2011_94 \
mgridhdm3-hl7v3models_edition2013_94
RUN ln -s /usr/pgsql-9.4/lib /usr/lib/pgsql && \
ln -s /usr/pgsql-9.4/include /usr/include/pgsql && \
service postgresql-9.4 initdb && \
sed -i '/^local/ s/peer/trust/; /^host/ s/ident/trust/' /var/lib/pgsql/9.4/data/pg_hba.conf && \
sed -i 's/^#\(listen_addresses\)/\1/' /var/lib/pgsql/9.4/data/postgresql.conf && \
service postgresql-9.4 start
Note that the last RUN command initializes and configures the PostgreSQL 9.4 cluster. The configuration performed in this step is for demonstration purposes only, and it is advised to consult a PostgreSQL administration expert to configure your PostgreSQL server.
2.2. Creating a database¶
Creating a database with the MGRID Healthcare Datatype Library and MGRID Healthcare Data Model on PostgreSQL servers 9.4 or higher can be done by creating extensions. The extensions check their requirements, which prevents common mistakes such as loading a RIM without the necessary vocabulary. With extensions it is also easy to keep track which modules are loaded, as is shown in How do I view which healthcare modules and vocabularies are loaded?.
2.2.1. Create a database on PostgreSQL (≥ 9.4)¶
We create a database user hcuser
and database lake
owned by hcuser
. Initially, hcuser
must be superuser,
to create the extensions that contain functions implemented in the C language. The superuser right will be revoked after
the extensions are loaded.
$ createuser -U postgres --superuser --createdb --login hcuser
$ createdb -U hcuser lake
$ psql -U hcuser lake
psql (9.4.5)
Type "help" for help.
lake=#
2.2.2. Loading MGRID extensions in a database¶
The next commands load the extensions required for the RIM2011 database.
lake=# ALTER DATABASE lake SET search_path=public,hdl,hl7,r1,"$user";
ALTER DATABASE
lake=# CREATE EXTENSION hl7basetable;
CREATE EXTENSION
lake=# CREATE EXTENSION ucum;
CREATE EXTENSION
lake=# CREATE EXTENSION hl7;
CREATE EXTENSION
lake=# CREATE EXTENSION hl7v3vocab_edition2011;
CREATE EXTENSION
lake=# CREATE EXTENSION hl7v3datatypes;
CREATE EXTENSION
lake=# CREATE EXTENSION hl7v3rim_edition2011;
WARNING: ========================================================
WARNING: RIM tables have been detached from hl7v3rim_edition2011.
WARNING: ========================================================
INFO: Dropping extension hl7v3rim_edition2011 will not drop the RIM tables.
INFO: RIM tables can now be exported by pg_dump.
CREATE EXTENSION
lake=# CREATE EXTENSION hl7v3crud_edition2011;
CREATE EXTENSION
lake=# ALTER ROLE hcuser NOSUPERUSER;
ALTER ROLE
hl7basetable
- this extension creates catalog support tables used by the hl7 and ucum extensions, and must be loaded before all the other extensions.ucum
- this extension creates the tables and conversion support functions for UCUM, the Unified Codes for Units of Measure. This extension is a prerequisite for thehl7
extensionsPQ
datatype.hl7
- this extension contains the base types, functions and operators.hl7v3vocab_edition2011
- vocabulary extensions populate the catalog support relations for theCV
datatype with the HL7v3 codesystems. MGRID includes vocabulary extensions for all HL7v3 Normative Editions. It must be loaded before the datatype extensions as well as before any database that creates table columns with a conceptdomain, e.g. a RIM database with matching Normative Edition.hl7v3datatypes
- this extension loads the HDL datatypes.
At this point no tables are created yet; the database has datatypes and vocabulary support only. This can be used as starting point for a custom database that uses the datatypes, such as a clinical research datawarehouse schema. Another possibility is to proceed creating a full RIM database, by loading the following extensions:
hl7v3rim_edition2011
- this extension creates tables that match the RIM of Normative Edition 2011. The rim script version should match the vocabulary version, loading e.g a RIM from 2011 on a vocabulary script from 2005 will give errors due to unknown conceptdomains. Thesearch_path
should includer1
orr2
to select the complex types version used in the selected RIM version.hl7v3crud_edition2011
- this extension creates basic insert functions for the RIM classes, and is used by the parsers generated by the MGRID Messaging Toolkit and other additional software available from MGRID.
Note
Use the supplied RIM or modify a standard RIM with sed
The hl7v3rim_edition<year>
extension will create a number of tables, such as Act
and Observation
, that
implement the a minimal RIM. If you need additional columns in these tables, do not create the tables by loading the
extension, but rather create a sed
script that reads the extension sql file, adds additional columns at the
appropriate places, and feed that to psql
. This method is used in the AXLE Healthcare Benchmark
tutorial, where the RIM is augmented with timestamps and other attributes for entity resolution.
Use can use \dx
to inspect which modules are loaded.
lake=# \dx
List of installed extensions
Name | Version | Schema | Description
------------------------+---------+------------+------------------------------------------------------------
adminpack | 1.0 | pg_catalog | administrative functions for PostgreSQL
hl7 | 3.0 | hl7 | healthcare datatypes core
hl7basetable | 3.0 | hdl | healthcare datatypes base tables
hl7v3crud_edition2011 | 2.0 | public | hl7v3 edition2011 CRUD functions
hl7v3datatypes | 3.0 | hl7 | healthcare datatypes
hl7v3rim_edition2011 | 2.0 | public | hl7v3 edition2011 RIM schema
hl7v3vocab_edition2011 | 2.0 | hdl | hl7v3 edition2011 vocabulary
jsquery | 1.0 | public | data type for jsonb inspection
loinc_2_50 | 1.0 | public | loinc 2.50 vocabulary
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
snomedctvocab_20140131 | 1.0 | public | snomed-ct 20140131 vocabulary
tablefunc | 1.0 | public | functions that manipulate whole tables, including crosstab
ucum | 3.0 | hdl | Unified code for units of measure v1.8.2
(13 rows)
2.2.3. Loading additional useful extensions¶
The following extensions are useful for handling JSON types, making pivots and accessing the database using pgadmin3, respectively.
lake=# create extension jsquery;
CREATE EXTENSION
lake=# create extension tablefunc;
CREATE EXTENSION
lake=# create extension adminpack;
CREATE EXTENSION
2.2.4. Dockerfile create RIM database¶
The following Docker RUN command contains all the steps performed above. It can be appended to the Dockerfile described in section Dockerfile install repositories and packages. In addition it also loads the Snomed-CT and Loinc vocabularies that were installed in section Installing the packages, but not yet used.
RUN service postgresql-9.4 start && \
createuser -U postgres --superuser --login --createdb hcuser && \
createdb -U hcuser lake && \
psql -U hcuser -d lake -c 'CREATE EXTENSION hl7basetable' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION ucum' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION hl7' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3vocab_edition2011' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3datatypes' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3rim_edition2011' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION hl7v3crud_edition2011' && \
psql -U hcuser -d lake -c 'ALTER DATABASE LAKE SET search_path=public,hdl,hl7,r1,"$user"' && \
# more vocabulary
# note: loading the Snomed-CT extension required >= 4GB ram!
psql -U hcuser -d lake -c 'CREATE EXTENSION snomedctvocab_20140131' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION loinc_2_50' && \
# additional useful modules
psql -U hcuser -d lake -c 'CREATE EXTENSION jsquery' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION tablefunc' && \
psql -U hcuser -d lake -c 'CREATE EXTENSION adminpack'
2.3. Backup and Restore¶
The official PostgreSQL backup and restore documentation describes three ways to backup databases.
- SQL Dump. MGRID HDM databases require special handling for performing SQL dumps and restore. This is described below.
- File System Level Backup. MGRID HDM requires no special handling for file system level backups. Please refer to PostgreSQL file system level backup for more information.
- Continuous Archiving and Point-in-Time Recovery. MGRID HDM requires no special handling for this kind of backup. Please refer to PostgreSQL Continuous Archiving for more information.
2.3.1. SQL dump and restore¶
It is not possible to perform a regular pg_dump and pg_restore of a MGRID HDM database, due to the way the extensions are organized. Consider the following scenario:
$ pg_dump -U hcuser -d lake > lake.dmp
$ createdb -U hcuser test
$ psql -U hcuser aap < lake.dmp
SET
SET
SET
SET
SET
SET
SET
CREATE SCHEMA
ALTER SCHEMA
CREATE SCHEMA
ALTER SCHEMA
CREATE EXTENSION
COMMENT
CREATE EXTENSION
COMMENT
CREATE EXTENSION
COMMENT
ERROR: type NullFlavor does not exist
ERROR: extension "hl7v3datatypes" does not exist
...
The reason for this error is that the dump contains the CREATE EXTENSION commands in this wrong order:
CREATE EXTENSION IF NOT EXISTS hl7basetable WITH SCHEMA hdl;
CREATE EXTENSION IF NOT EXISTS ucum WITH SCHEMA hdl;
CREATE EXTENSION IF NOT EXISTS hl7 WITH SCHEMA hl7;
CREATE EXTENSION IF NOT EXISTS hl7v3datatypes WITH SCHEMA hl7;
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;
CREATE EXTENSION IF NOT EXISTS adminpack WITH SCHEMA pg_catalog;
CREATE EXTENSION IF NOT EXISTS hl7v3vocab_edition2011 WITH SCHEMA hdl;
CREATE EXTENSION IF NOT EXISTS hl7v3rim_edition2011 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS hl7v3crud_edition2011 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS jsquery WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS loinc_2_50 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS snomedctvocab_20140131 WITH SCHEMA public;
CREATE EXTENSION IF NOT EXISTS tablefunc WITH SCHEMA public;
The order is wrong since hl7v3datatypes
requires vocabulary that is loaded by any hl7v3vocab_edition<year>
extension. The best solution to state such a dependency is to add hl7v3vocab_edition2011
to the hl7v3datatypes
control file, but this cannot be done since there are a number of alternatives for the vocabulary, and we do not want to
fix the hl7v3datatypes
extension to a specific normative edition.
Therefore, MGRID HDM databases can only be dumped and restored by performing data-only restores into databases that are created according to the original database definitions.
2.3.1.1. Data-only SQL dump¶
The example below shows how to perform a data only dump of the database lake
to the file lake.dmp
.
pg_dump --data-only --no-privileges --no-owner \
-U hcuser lake \
| sed -e '/SET search_path/ s/;/, hl7, hdl, r1;/' > lake.dmp
Note
Update the search path to the correct R1 or R2 schema.
The example above fixes the search_path during restore for databases with R1 datatypes. Change this into r2
to restore databases that use the R2 datatypes.
2.3.1.2. Restore of data-only SQL dump¶
Restore of a HDM database consists of two steps:
Create a database that matches the extensions used for the original database. See section Creating a database for more information.
Use the following command to restore the data-only dump into the new database:
cat lake.dmp | psql -U hcuser --single-transaction -v ON_ERROR_STOP=true lake
2.4. Upgrade¶
MGRID provides upgrades through RPM package updates. We make the distinction between minor updates and major updates. A minor update is an update where the first two numbers in the version number stay the same. All other updates are major updates.
2.4.1. Minor HDL updates¶
A minor update of the MGRID HDL package will place a new version of the hl7.so
shared library in the PostgreSQL
shared library directory. The PostgreSQL server does not need to be shutdown before this update is performed, nor is it
strictly required to restart the PostgreSQL server after a minor update. Running PostgreSQL backends (sessions) will
still use the old version, and new sessions with the database will use the new, updated version.
Minor updates will contain bug fixes, but do not alter API’s or usage of existing functions, and can thus be applied to systems in production.
$ sudo yum update mgridhdl3_95
...
Dependencies Resolved
========================================================================================================
Package Arch Version Repository Size
========================================================================================================
Updating:
mgridhdl3_95 x86_64 3.0.4-1 mgrid_mgrid3 432 k
Transaction Summary
========================================================================================================
Upgrade 1 Package
Total download size: 432 k
Is this ok [y/d/N]: y
...
Updated:
mgridhdl3_95.x86_64 0:3.0.4-1
Complete!
2.4.2. Major HDL updates¶
A major update of the MGRID HDL package will require updating the mgridhdl3
package first, as described in the
previous section, so the new binaries are
installed on the server. Then in each database that has uses the hl7
extension, perform the update with the following command:
ALTER EXTENSION hl7 UPDATE TO 3.1;
ALTER EXTENSION ucum UPDATE TO 3.1;
-- etc