Getting Started with NegBio

These instructions will get you a copy of the project up and run on your local machine for development and testing purposes. The package should successfully install on Linux (and possibly macOS).

Installing

Prerequisites

  • python >2.4
  • Linux
  • Java

Note: since v1.0, MetaMap is not required. You can use the CheXpert vocabularies (negbio/chexpert/phrases) instead. If you want to use MetaMap, it can be downloaded from https://metamap.nlm.nih.gov/MainDownload.shtml. Installation instructions can be found at https://metamap.nlm.nih.gov/Installation.shtml. Please make sure that both skrmedpostctl and wsdserverctl are started.

Installing from pip

$ pip install negbio

Using NegBio

Prepare the dataset

The inputs can be in either plain text or BioC format. If the reports are in plain text, each report needs to be in a single file. Some examples can be found in the examples folder.

Run the script

There are two ways to run the pipeline.

Using CheXpert algorithm

If you want to use the CheXpert method, run one of the following lines

$ main_chexpert text --output=examples/test.neg.xml examples/00000086.txt examples/00019248.txt
$ main_chexpert bioc --output=examples/test.neg.xml examples/1.xml

The script will

  1. [Optional] Combine examples/00000086.txt and examples/00019248.txt into one BioC XML file
  2. Detect concepts using CheXpert pre-defined vocabularies (by default using the list negbio/chexpert/phrases)
  3. Detect positive, negative and uncertain concepts using rules in negbio/chexpert/patterns
  4. Save the results in examples/test.neg.xml

More options (e.g., setting the CUI list or rules) can be obtained by running

$ main_chexpert --help

Using MetaMap

If you want to use MetaMap, run the following command by replacing <METAMAP_BIN> with the actual ABSOLUTE path, such as META_MAP_HOME/bin/metamap16

$ export METAMAP_BIN=META_MAP_HOME/bin/metamap16
$ main_mm text --metamap=$METAMAP_BIN --output=examples/test.neg.xml \
     examples/00000086.txt examples/00019248.txt
$ export METAMAP_BIN=META_MAP_HOME/bin/metamap16
$ main_mm bioc --metamap=$METAMAP_BIN --output=examples/test.neg.xml examples/1.xml

The script will

  1. [Optional] Combine examples/00000086.txt and examples/00019248.txt into one BioC XML file
  2. Detect UMLS concepts (CUIs) using MetaMap (by default using the CUI list examples/cuis-cvpr2017.txt
  3. Detect negative and uncertain CUIs using rules in negbio/patterns
  4. Save the results in examples/test.neg.xml

More options (e.g., setting the CUI list or rules) can be obtained by running

$ main_mm --help

Next Steps

To start learning how to use NegBio, see the NegBio User Guide.