Data

Fictitious nephrology progress notes (fiktive Nephrologie-Verlaufsnotizen)
This small dataset is a collection of fictitious clinical notes from the nephrology domain, written by students of medicine and linguistics. The documents imitate the style and content of real nephrology clinical notes, which makes them suitable for testing NLP applications, since real patient data is difficult to share. Note that these fictitious notes are not necessarily medically correct. Download

German NegEx trigger set
This set of trigger words has been created for negation detection in German clinical notes and discharge summaries. More information can be found here. Download
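
To illustrate how a NegEx-style trigger set is typically applied, here is a minimal sketch. It assumes a simplified scheme in which a concept counts as negated when a negation trigger appears within a few tokens before it, unless a termination word ends the negation scope first. The trigger words, the window size, and the function names are hypothetical placeholders, not the contents of the downloadable set.

```python
import re

# Hypothetical sample triggers for illustration only;
# the actual trigger set is the one offered for download.
PRE_NEGATION_TRIGGERS = {"kein", "keine", "keinen", "ohne", "nicht"}
# Hypothetical termination words that end the scope of a negation.
TERMINATION_TERMS = {"aber", "jedoch"}
WINDOW = 5  # assumed scope: number of tokens inspected before the concept


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def is_negated(sentence, concept, window=WINDOW):
    """Return True if a negation trigger precedes the concept within the window."""
    tokens = tokenize(sentence)
    concept_tokens = tokenize(concept)
    n = len(concept_tokens)
    if n == 0:
        return False
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] != concept_tokens:
            continue
        preceding = tokens[max(0, start - window):start]
        # Walk backwards from the concept; stop at a termination word.
        for tok in reversed(preceding):
            if tok in TERMINATION_TERMS:
                break
            if tok in PRE_NEGATION_TRIGGERS:
                return True
    return False


print(is_negated("Patient hat kein Fieber.", "Fieber"))
print(is_negated("Keine Ödeme, jedoch Fieber seit gestern.", "Fieber"))
```

A full NegEx implementation also distinguishes post-negation and pseudo-negation triggers; this sketch covers only triggers that precede the concept.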

Tools & Models

Dependency Tree Parser for German medical text
Using the Stanford parser, we created a domain-adapted dependency tree parser specialized for German medical text. The model was pre-trained on a large general-domain German dataset and then re-trained on a small set of clinical documents from the nephrology domain. The model and a more detailed description can be found here.

Biomedical-CharTranslator
Many NLP tasks involve concept normalization (alignment), which links a given mention to the matching concept in an ontology. This task can be more challenging for languages other than English, because non-English data is often underrepresented. At the same time, many terms in the biomedical domain are of Greek or Latin origin. Given this shared origin and the regular spelling differences between two languages, a large range of biomedical terms can be translated easily from one language into the other. Our Biomedical-CharTranslator builds on this idea and uses a simple neural translator at the character level. In this way, concept normalization can be improved by translating "unknown" words and extending the search to English data. The tool and models can be found here.
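
The underlying idea can be sketched with a few hand-written character rules. This is explicitly not the neural model used by the tool: it is a rule-based stand-in that only shows why character-level mappings between German and English biomedical terms can help concept normalization. The rules, the toy ontology, and its identifiers are all hypothetical.

```python
import re

# Hypothetical hand-written rules standing in for a learned
# character-level translator (German -> English spelling).
SUFFIX_RULES = [(r"ie$", "ia"), (r"om$", "oma")]
CHAR_RULES = [("ä", "e"), ("k", "c"), ("z", "c")]

# Toy English "ontology" with made-up identifiers, for illustration only.
TOY_ONTOLOGY = {
    "hematuria": "TOY:001",
    "carcinoma": "TOY:002",
    "anemia": "TOY:003",
}


def translate_chars(term):
    """Apply suffix rules, then character rules, to a German term."""
    word = term.lower()
    for pattern, replacement in SUFFIX_RULES:
        word = re.sub(pattern, replacement, word)
    for source, target in CHAR_RULES:
        word = word.replace(source, target)
    return word


def normalize(term, ontology=TOY_ONTOLOGY):
    """Look up a term directly, then fall back to its translated form."""
    key = term.lower()
    if key in ontology:
        return ontology[key]
    # Returns None when neither form is found in the ontology.
    return ontology.get(translate_chars(term))


for german in ["Hämaturie", "Karzinom", "Anämie", "Fieber"]:
    print(german, "->", translate_chars(german), "->", normalize(german))
```

Fixed rules like these break quickly on real vocabulary, for example on terms without a Greek or Latin root, which is one reason a learned character-level model is attractive for this task.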