EDS-NER-CARDIOCCC

This repository contains the final NER model trained on the CardioCCC dataset. CardioCCC is a collection of cardiology clinical case reports used for domain adaptation. Clinical case reports are a textual genre in medicine that describe a patient's medical history, symptoms, diagnosis, and treatment in detail.

The model implementation is based on EDS-NLP, a library developed by the data science team of the Greater Paris University Hospitals (AP-HP) for clinical natural language processing.

The model detects the following entity types.

Label       Description
MEDICATION  Names of drugs or chemical substances used in treatment, e.g., metformina.
PROCEDURE   Medical or surgical procedures performed on a patient, e.g., biopsia, radiografía.
DISEASE     Diagnosed diseases or medical conditions, e.g., diabetes mellitus, hipertensión.
SYMPTOM     Reported signs or symptoms experienced by a patient, e.g., fiebre, dolor de cabeza.

Quickstart

  1. Install the latest version of edsnlp

    pip install "edsnlp[ml]" -U
    
  2. Load the model

    import edsnlp
    
    nlp = edsnlp.load("Aremaki/eds-ner-cardioccc", auto_update=True)
    doc = nlp(
        "La paciente con diabetes mellitus "
        "presentó fiebre y se le realizó "
        "una radiografía antes de tomar metformina. "
    )
    
    for ent in doc.ents:
        print(ent, ent.label_)
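
Once entities are extracted, it is often convenient to group them by label. The snippet below is a minimal sketch of one way to do that. It assumes only that each entity exposes `.text` and `.label_`, as spaCy-style spans in `doc.ents` do. The helper name `group_entities_by_label` and the `example_ents` stand-ins are hypothetical and are not part of edsnlp; the stand-ins are illustrative values, not actual model predictions.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_entities_by_label(ents):
    """Map each label to the list of entity texts carrying that label.

    `ents` is any iterable of objects with `.text` and `.label_`
    attributes (hypothetical helper, not an edsnlp function).
    """
    grouped = defaultdict(list)
    for ent in ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


# Hypothetical stand-ins for doc.ents so the sketch runs without the model.
# These are illustrative values, NOT actual model predictions.
example_ents = [
    SimpleNamespace(text="diabetes mellitus", label_="DISEASE"),
    SimpleNamespace(text="fiebre", label_="SYMPTOM"),
    SimpleNamespace(text="radiografía", label_="PROCEDURE"),
    SimpleNamespace(text="metformina", label_="MEDICATION"),
]

for label, texts in group_entities_by_label(example_ents).items():
    print(label, texts)
```

With a loaded pipeline, the same helper could be called as `group_entities_by_label(doc.ents)`.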
    

To apply the model on many documents using one or more GPUs, refer to the documentation of edsnlp.

Metrics

Token Scores  Precision  Recall  F1
MEDICATION    93.0       94.0    93.0
PROCEDURE     85.0       85.0    85.0
DISEASE       82.0       82.0    82.0
SYMPTOM       80.0       81.0    80.0
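
As a sanity check on the table, F1 can be recomputed from precision and recall. The sketch below assumes the F1 column is the usual harmonic mean of the two, and that the reported figures are rounded to whole percentages, so the recomputed values may differ from the table by a fraction of a point. The function name `f1_score` is introduced here for illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the table above (percentages).
reported = {
    "MEDICATION": (93.0, 94.0),
    "PROCEDURE": (85.0, 85.0),
    "DISEASE": (82.0, 82.0),
    "SYMPTOM": (80.0, 81.0),
}

for label, (precision, recall) in reported.items():
    # Recomputed from rounded inputs, so this is only approximate.
    print(f"{label}: F1 ~ {f1_score(precision, recall):.1f}")
```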

Installation to reproduce

If you'd like to reproduce eds-ner-cardioccc's training or contribute to its development, you should first clone it:

    git clone https://github.com/Aremaki/eds_ner_cardioccc.git
    cd eds_ner_cardioccc

Acknowledgement

We would like to thank the Life science team at the Barcelona Supercomputing Center (BSC), who designed the CardioCCC dataset and trained the base model bsc-bio-ehr-es. We would also like to thank the data science team of the Greater Paris University Hospitals (AP-HP), who developed the EDS-NLP library.
