EDS-NER-CARDIOCCC
This repository contains the final NER model trained on the CardioCCC dataset. CardioCCC is a collection of cardiology clinical case reports used for domain adaptation. Clinical case reports are a textual genre in medicine that describes a patient's medical history, symptoms, diagnosis, and treatment in detail.
The model implementation is based on EDS-NLP, a library developed by the data science team of the Greater Paris University Hospitals (AP-HP) for clinical natural language processing.
The entities that are detected are listed below.
| Label | Description |
|---|---|
| MEDICATION | Names of drugs or chemical substances used in treatment, e.g., Metformina. |
| PROCEDURE | Medical or surgical procedures performed on a patient, e.g., biopsia, radiografía. |
| DISEASE | Diagnosed diseases or medical conditions, e.g., diabetes mellitus, hipertensión. |
| SYMPTOM | Reported signs or symptoms experienced by a patient, e.g., fiebre, dolor de cabeza. |
Quickstart
Install the latest version of edsnlp:

    pip install "edsnlp[ml]" -U

Load the model:

    import edsnlp

    nlp = edsnlp.load("Aremaki/eds-ner-cardioccc", auto_update=True)
    doc = nlp(
        "La paciente con diabetes mellitus "
        "presentó fiebre y se le realizó "
        "una radiografía antes de tomar metformina. "
    )

    for ent in doc.ents:
        print(ent, ent.label_)
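Downstream code often needs the detected entities as plain records rather than as span objects. The following is a minimal sketch, assuming the entities expose the spaCy-style attributes `text`, `label_`, `start_char` and `end_char`. The helper `entities_to_records` and the stand-in document are hypothetical and only serve to make the example runnable without downloading the model.

```python
from types import SimpleNamespace


def entities_to_records(doc):
    """Convert the entities of a spaCy-style document into plain dictionaries."""
    return [
        {
            "text": ent.text,
            "label": ent.label_,
            "start": ent.start_char,
            "end": ent.end_char,
        }
        for ent in doc.ents
    ]


# Stand-in document (hypothetical): mimics the attributes read above,
# so the sketch runs without loading the actual model.
text = "La paciente con diabetes mellitus presentó fiebre."


def fake_ent(surface, label):
    """Build a fake entity whose offsets are computed from the text."""
    start = text.index(surface)
    return SimpleNamespace(
        text=surface,
        label_=label,
        start_char=start,
        end_char=start + len(surface),
    )


fake_doc = SimpleNamespace(
    ents=[
        fake_ent("diabetes mellitus", "DISEASE"),
        fake_ent("fiebre", "SYMPTOM"),
    ]
)

records = entities_to_records(fake_doc)
for record in records:
    print(record)
```

With the real model, the same helper would be called on the `doc` returned by `nlp(...)` instead of on `fake_doc`.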
To apply the model on many documents using one or more GPUs, refer to the documentation of edsnlp.
Metrics
| Token scores (%) | Precision | Recall | F1 |
|---|---|---|---|
| MEDICATION | 93.0 | 94.0 | 93.0 |
| PROCEDURE | 85.0 | 85.0 | 85.0 |
| DISEASE | 82.0 | 82.0 | 82.0 |
| SYMPTOM | 80.0 | 81.0 | 80.0 |
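The model card does not describe how these scores were computed. As an illustration only, the sketch below assumes the usual token-level definition: each token carries one label (or `"O"` for no entity), and precision, recall and F1 are counted per label over aligned gold and predicted sequences. The function `token_scores` and the toy sequences are hypothetical and are not taken from the CardioCCC evaluation.

```python
from collections import Counter


def token_scores(gold, pred, outside="O"):
    """Per-label token-level precision, recall and F1 for two aligned label sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if p != outside:
            if g == p:
                tp[p] += 1  # predicted label matches the gold label
            else:
                fp[p] += 1  # predicted label is wrong for this token
        if g != outside and g != p:
            fn[g] += 1  # gold label was missed on this token

    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        predicted = tp[label] + fp[label]
        expected = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / expected if expected else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy sequences (hypothetical), one label per token.
gold = ["O", "DISEASE", "DISEASE", "O", "SYMPTOM", "O", "MEDICATION"]
pred = ["O", "DISEASE", "O", "O", "SYMPTOM", "SYMPTOM", "MEDICATION"]

scores = token_scores(gold, pred)
for label, values in scores.items():
    print(label, {name: round(value, 3) for name, value in values.items()})
```

F1 is the harmonic mean of precision and recall, so it always lies between the two values.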
Installation to reproduce
If you'd like to reproduce eds-ner-cardioccc's training or contribute to its development, you should first clone it:
git clone https://github.com/Aremaki/eds_ner_cardioccc.git
cd eds_ner_cardioccc
Acknowledgement
We would like to thank the Life science team at the Barcelona Supercomputing Center (BSC), who designed the CardioCCC dataset and trained the base model PlanTL-GOB-ES/bsc-bio-ehr-es. We would also like to thank the data science team of the Greater Paris University Hospitals (AP-HP), who developed the EDS-NLP library.