HiRID, a high time-resolution ICU dataset. Anonymization procedure

Published version: 1.0


HiRID is a freely accessible critical care dataset containing data relating to almost 34 thousand patient admissions to the Department of Intensive Care Medicine of the Bern University Hospital, Switzerland (ICU), an interdisciplinary 60-bed unit admitting >6,500 patients per year. The ICU offers the full range of modern interdisciplinary intensive care medicine for adult patients. The dataset was developed in cooperation between the Swiss Federal Institute of Technology (ETH) Zürich, Switzerland and the ICU.

The dataset contains de-identified demographic information and a total of 681 routinely collected physiological variables, diagnostic test results and treatment parameters from almost 34 thousand admissions during the collection period. Data is stored with a uniquely high time resolution of one entry every two minutes.


Critical illness is characterised by the presence or risk of developing life-threatening organ dysfunction. Critically ill patients are typically cared for in intensive care units (ICUs), which specialise in providing continuous monitoring and advanced therapeutic and diagnostic technologies. This dataset was collected during routine care at the Department of Intensive Care Medicine of the Bern University Hospital, Switzerland (ICU), an interdisciplinary 60-bed unit admitting >6,500 patients per year. It was initially extracted to support a study on the early prediction of circulatory failure in the intensive care unit using machine learning [1]. The latest documentation for the dataset is available [2].


The HiRID database includes a large selection of all routinely collected data relating to patient admissions to the Department of Intensive Care Medicine of the Bern University Hospital, Switzerland (ICU). The data was extracted from the ICU Patient Data Management System, which is used to prospectively register patient health information, measurements of organ function parameters, results of laboratory tests and treatment parameters from ICU admission to discharge.

Measurements from bedside monitoring

Measurements and settings of medical devices such as mechanical ventilation

Observations by health care providers, e.g. GCS, RASS, urine and other fluid output

Administered drugs, fluids and nutrition

HiRID has a higher time resolution than other published datasets, most importantly for bedside monitoring, with most parameters recorded every two minutes.

To ensure the anonymization of individuals in the data set, we followed the procedures successfully applied for the MIMIC-III and Amsterdam UMC db datasets, which followed the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor requirements and, in the case of Amsterdam UMC db, also the European Union's General Data Protection Regulation (GDPR) standards [3,4].

Removal of all eighteen identifying data elements listed in HIPAA

Dates were shifted by a random offset, so the admission date no longer corresponds to the true date. We made sure to preserve the seasonality, time of day and the day of the week.

Patient age, weight and height are binned into bins of size 5. For patient age, the maximum bin is 90 years and also contains all older patients.

Measurements and medications with changing units over time were standardised to the latest unit used. This standardisation was necessary to make inference about approximate admission dates, based on the units used in a specific patient, impossible.

Free text was removed from the database

k-anonymization was applied on patient age, weight, height and sex.
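The binning, date-shifting and k-anonymity steps above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the example records and the choice of shifting by whole weeks are assumptions made for illustration. Shifting by a whole number of weeks keeps the weekday and the time of day exactly; seasonality is only preserved if the offset is also close to a whole number of years.

```python
from collections import Counter
from datetime import datetime, timedelta


def bin_value(value, size=5, cap=None):
    """Return the lower edge of the bin of width `size` containing `value`.

    If `cap` is given, every value at or above it falls into the `cap` bin.
    """
    lower = int(value // size) * size
    if cap is not None:
        lower = min(lower, cap)
    return lower


def shift_by_whole_weeks(timestamp, weeks):
    """Shift a timestamp by a whole number of weeks (hypothetical offset)."""
    return timestamp + timedelta(weeks=weeks)


def smallest_group(records, keys):
    """Size of the smallest group sharing the same values for `keys`.

    A table is k-anonymous with respect to `keys` if this is at least k.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return min(counts.values())


# Made-up example records, not taken from HiRID.
raw = [
    {"age": 67, "weight": 72.4, "height": 168, "sex": "F"},
    {"age": 69, "weight": 74.0, "height": 166, "sex": "F"},
    {"age": 93, "weight": 58.0, "height": 171, "sex": "M"},
    {"age": 97, "weight": 59.9, "height": 174, "sex": "M"},
]

binned = [
    {
        "age": bin_value(r["age"], cap=90),  # 90 also holds all older patients
        "weight": bin_value(r["weight"]),
        "height": bin_value(r["height"]),
        "sex": r["sex"],
    }
    for r in raw
]

k = smallest_group(binned, ["age", "weight", "height", "sex"])

admission = datetime(2015, 3, 4, 14, 30)
shifted = shift_by_whole_weeks(admission, weeks=52 * 100)
```

In this toy table each combination of binned age, weight, height and sex occurs twice, so `k` is 2. Records in groups smaller than the target k would need to be generalised further or suppressed.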

Ethical approval and patient consent

The institutional review board (IRB) of the Canton of Bern approved the study. The need for obtaining informed patient consent was waived owing to the retrospective and observational nature of the study.

Data Description

The overall data is available in two states: as raw data and/or as pre-processed data. Additionally, there are three reference tables for variable lookup.

Reference tables

variable reference – reference table for variables (for raw stage)

ordinal variable reference – reference table for categorical/ordinal variables for string value lookup

pre-processed variable reference – reference table for variables (for merged and imputed stage)
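As a rough illustration of how the ordinal variable reference might be used for string value lookup, the following Python sketch maps a variable id and a numeric code to a label. The column names (`variableid`, `code`, `stringvalue`) and all values are hypothetical; the actual layout is defined in the schemata file.

```python
import csv
import io

# Hypothetical excerpt of an ordinal reference table.
# Column names and values are illustrative assumptions, not the real schema.
ORDINAL_REFERENCE_CSV = """variableid,code,stringvalue
9001,1,low
9001,2,medium
9001,3,high
"""


def load_ordinal_lookup(handle):
    """Build a {(variable id, code): label} dictionary from a CSV handle."""
    lookup = {}
    for row in csv.DictReader(handle):
        key = (int(row["variableid"]), int(row["code"]))
        lookup[key] = row["stringvalue"]
    return lookup


lookup = load_ordinal_lookup(io.StringIO(ORDINAL_REFERENCE_CSV))

# Resolve a stored numeric code to its label, with a fallback for unknown codes.
label = lookup.get((9001, 2), "unknown")
missing = lookup.get((9001, 9), "unknown")
```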

Raw data

The raw data was only processed if this was required for patient de-identification and otherwise left unchanged compared to the original source. The source data contains the complete set of all available variables (685 variables). It consists of the following tables:

Preprocessed data

The pre-processed data consists of intermediary pipeline stages from the accompanying publication by Hyland et al. [1]. Source variables representing the same clinical concepts were merged into one meta-variable per concept. The data contains the 18 most predictive meta-variables only, as defined in our publication. Two different stages of the pipeline are available:

Merged stage: source variables are merged into meta-variables by clinical concepts, e.g. non-opioid-analgesics. The time grid is left unchanged and is sparse.

Imputed stage: the data from the merged stage is down-sampled to a five-minute time grid. The time grid is filled with imputed values. The imputation strategy is complex and is discussed in the original publication.

The code used to generate these stages can be found in this GitHub repository under the folder preprocessing [5].
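To make the difference between the sparse merged stage and the dense imputed stage concrete, here is a minimal Python sketch that places sparse observations on a five-minute grid. It uses plain forward filling as a stand-in: this is not the imputation strategy of the publication, which is described as considerably more complex. The function name and the sample values are assumptions.

```python
def to_five_minute_grid(samples, start, end, step=300):
    """Place sparse (time_in_seconds, value) samples on a regular grid.

    `samples` must be sorted by time. Each grid point takes the most recent
    observation at or before it (forward fill). Grid points before the first
    observation get None. This is a simplification, not the published method.
    """
    grid = []
    last = None
    i = 0
    for t in range(start, end + 1, step):
        # Consume every observation up to and including this grid point.
        while i < len(samples) and samples[i][0] <= t:
            last = samples[i][1]
            i += 1
        grid.append((t, last))
    return grid


# Made-up sparse heart-rate observations: (seconds since admission, value).
heart_rate = [(120, 80), (240, 82), (600, 90), (1320, 85)]

grid = to_five_minute_grid(heart_rate, start=0, end=1500)
```

Several observations falling into the same five-minute window collapse to the latest one, and gaps are filled with the last known value, which is why the resulting grid is dense while the input is sparse.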

Which data to use?

The pre-processed data is intended mainly as a quick way to jump-start a project or for use in a proof of concept. We recommend using the source data whenever possible for regular projects. It is the most versatile form and contains the entire set of variables in the original time resolution.

Data formats

Data is available in two formats: CSV for wide compatibility and Apache Parquet for convenience and performance.

Because the data sets are fairly large, they are split into partitions, such that they can be processed in parallel in a straightforward way. The lookup table mapping patient id to partition id is provided in a file distributed together with the data. The partitions are aligned between the different data sets and tables, so that the data of a patient can always be found in the partition with the same id. Note, however, that a patient may not occur in all data sets, e.g. a patient may be missing in the pre-processed data because the patient did not meet the demographic criteria to be included in the study.
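The partition lookup described above could be used roughly as follows. This Python sketch reads a small in-memory stand-in for the index file and groups a list of patient ids by partition, so that each partition needs to be opened only once. The column names (`patientid`, `part`) and the example ids are assumptions; the real file name and layout come with the data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for the patient-to-partition index file.
# Column names and contents are illustrative assumptions.
INDEX_CSV = """patientid,part
1,0
2,0
3,1
4,1
5,2
"""


def load_partition_index(handle):
    """Return a {patient id: partition id} dictionary from a CSV handle."""
    return {int(r["patientid"]): int(r["part"]) for r in csv.DictReader(handle)}


def group_by_partition(patient_ids, index):
    """Group the requested patient ids by the partition that holds them.

    Ids missing from the index are skipped, since a patient may not occur
    in every data set.
    """
    groups = defaultdict(list)
    for pid in patient_ids:
        if pid in index:
            groups[index[pid]].append(pid)
    return dict(groups)


index = load_partition_index(io.StringIO(INDEX_CSV))
wanted = group_by_partition([1, 3, 4, 99], index)
```

Because partitions are aligned across tables, the same grouping can be reused for every table, and each partition can be handed to a separate worker.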

Patient ID / ICU admission

The dataset treats each ICU admission uniquely and it is not possible to identify multiple ICU admissions as coming from the same patient. A unique "Patient ID" is generated for each ICU (re-)admission.

Data schemata

The schemata of each table can be found in the *schemata.pdf* file.

Usage Notes

As the database contains detailed information regarding the clinical care of patients, it must be treated with appropriate care and respect.

Researchers are required to formally request access via PhysioNet. To be granted access, a user has to be a credentialed PhysioNet user, digitally sign the Data Use Agreement and provide a specific research question.

Conflicts of Interest

The authors declare no conflicts of interest.


Access Policy: Only credentialed PhysioNet users who sign the specified DUA can access the files.
