OHNLP Downloads

From Open Health Natural Language Processing (OHNLP) Consortium

The ohnlp project on sourceforge.net hosts some of the OHNLP downloads.

For others, see the Tool List.


The annotators and pipelines released in this consortium are built on top of the UIMA framework. UIMA can be freely downloaded from the Apache UIMA site, where the documentation for the framework can also be found. The framework facilitates building new pipelines, enhancing published components and building new annotators.

This consortium releases complete pipelines and individual annotators. The pipelines exemplify particular use cases and include external resources such as models, dictionaries and ontologies. The releases also explain the annotator parameter settings found in the associated configuration files.

MedKAT/P is a pipeline contributed by IBM that extracts cancer characteristics, such as primary and metastatic tumors and their attributes, from semi-structured and unstructured pathology reports. The extracted named entities include, for example, diagnosis, anatomical site, grade and size. Each named entity has attributes for the text span, the ontology mapping code and, where applicable, whether the named entity is negated. In addition, the relations between these named entities are made explicit, for instance to define a tumor or the lymph node status.
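
As a rough illustration of the output shape described above, the following Python sketch models a named entity with a text span, an ontology mapping code and an optional negation flag, plus a relation that groups entities. It is not the MedKAT/P type system or its API: the class names, field names, sample sentence and the placeholder codes (EXAMPLE:0001 and so on) are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NamedEntity:
    """Hypothetical stand-in for one extracted named entity."""
    kind: str                            # e.g. "diagnosis", "anatomical site"
    begin: int                           # start offset of the text span
    end: int                             # end offset (exclusive)
    ontology_code: Optional[str] = None  # placeholder mapping code
    negated: Optional[bool] = None       # None where negation does not apply

    def covered_text(self, document: str) -> str:
        """Return the text that the span covers in the document."""
        return document[self.begin:self.end]


@dataclass
class Relation:
    """Hypothetical relation that makes a grouping of entities explicit."""
    kind: str                            # e.g. "tumor"
    members: List[NamedEntity] = field(default_factory=list)


def make_entity(document: str, phrase: str, kind: str, code: str,
                negated: Optional[bool] = None) -> NamedEntity:
    """Build an entity by locating the phrase in the document text."""
    begin = document.index(phrase)       # raises ValueError if absent
    return NamedEntity(kind, begin, begin + len(phrase), code, negated)


# Invented sample sentence, not taken from any real report.
report = "Invasive ductal carcinoma of the left breast, grade 2."

diagnosis = make_entity(report, "Invasive ductal carcinoma",
                        "diagnosis", "EXAMPLE:0001", negated=False)
site = make_entity(report, "left breast", "anatomical site", "EXAMPLE:0002")
grade = make_entity(report, "grade 2", "grade", "EXAMPLE:0003")

# The relation ties the individual entities together as one tumor.
tumor = Relation("tumor", [diagnosis, site, grade])

for entity in tumor.members:
    print(entity.kind, "->", entity.covered_text(report))
```

Keeping offsets rather than copied strings mirrors the span-based description above: the covered text can always be recovered from the original report, and the relation only references entities instead of duplicating them.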

The MedKAT/P user guide is available as MedKATp_UserGuide.

cTAKES 1.0 through cTAKES 2.5 are available from sourceforge.net. Later versions are available from ctakes.apache.org.

MedCoref can be downloaded from sourceforge.net.

PEP can be checked out from svn.


cTAKES 2.5

MedKAT/P user guide

MedCoref's documentation is included within the download.

PEP's documentation can be downloaded from here.


The installation instructions provide the download locations. Alternatively, you can download binaries, source code and associated resources such as models, dictionaries and ontologies from SourceForge.net.

OHNLP Slides and Presentations

Open Health NLP Consortium Workshop

Slides from the AMIA 2009 affiliate event

cTAKES overview:

Narrated cTAKES overview
cTAKES overview slides
cTAKES 1.0.5 pipeline flow and type system slides
MedKAT pipeline overview slides

OHNLP Whitepapers

Rapidly Deployable, Highly Scalable Natural Language Processing Using Cloud Computing and an Open Source NLP Pipeline.
David Baldwin and David Carrell
Group Health Research Institute, 2010.