Unsupervised extraction, classification and visualization of clinical note segments using the MIMIC-III dataset

Warning

This publication does not belong to the Faculty of Economics and Administration but to the Faculty of Informatics. The official publication page is on the muni.cz website.
Authors

ZELINA Petr, HALÁMKOVÁ Jana, NOVÁČEK Vít

Year of publication 2023
Type Article in proceedings
Conference Proceedings of IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
MU Faculty or unit

Faculty of Informatics

Citation
DOI http://dx.doi.org/10.1109/BIBM58861.2023.10385342
Keywords NLP; EHR; Clinical Notes; Information Extraction; Text Classification
Description This paper presents a text-mining approach to extracting and organizing segments from unstructured clinical notes in an unsupervised way. Our work is motivated by the real challenge of poor semantic integration between clinical notes produced by different doctors, departments, or hospitals. This can lead to clinicians overlooking important information, especially for patients with long and varied medical histories. This work extends a previous approach developed for Czech breast cancer patients and validates it on the publicly accessible MIMIC-III English dataset, demonstrating its universal and language-independent applicability. Our work is a stepping stone to a broad array of downstream tasks, such as summarizing or integrating patient records, extracting structured information, or computing patient embeddings. Additionally, the paper presents a clustering analysis of the latent space of note segment types, using hierarchical clustering and an interactive treemap visualization. The presented results demonstrate that this approach generalizes well for MIMIC and English.
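
The hierarchical clustering of segment types mentioned in the description can be illustrated with a small, generic sketch. This is not the authors' implementation: the record does not state which embeddings, distance, or linkage the paper uses, so the segment-type names, the 2-D coordinates, and the choice of average linkage with Euclidean distance below are all assumptions made purely for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical 2-D embeddings of note segment types.
# Names and coordinates are made up for illustration, not taken from the paper.
segment_embeddings = {
    "Medications on Admission": (0.9, 0.1),
    "Discharge Medications": (1.0, 0.2),
    "Family History": (0.1, 0.9),
    "Social History": (0.2, 1.0),
    "Allergies": (0.6, 0.5),
}


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of labels."""
    return sum(dist(points[x], points[y]) for x in a for y in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [frozenset([label]) for label in points]
    merges = []  # (cluster_a, cluster_b, distance) in merge order
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], points),
        )
        merges.append((a, b, average_linkage(a, b, points)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters, merges


clusters, merges = agglomerate(segment_embeddings, n_clusters=2)
for cluster in clusters:
    print(sorted(cluster))
```

The recorded merge order forms a tree over segment types; a nested layout such as a treemap could be built from such a tree, although how the paper constructs its visualization is not described in this record.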
