The use of different data formats hinders the standardization and exchange of medical data. Moreover, most medical data are stored as unstructured medical records, which makes them difficult to process. In this work, we address the task of classifying unstructured allergy histories into the categories defined by the FHIR exchange standard. We developed a two-stage classification model based on manually labeled medical records. In the first stage, the model filters the records that contain allergy information; in the second stage, it classifies each of those records. The model showed high accuracy. Further development of the proposed approach will enable secondary use and exchange of the data.
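The two-stage scheme described above can be illustrated with a minimal sketch. It is not the model from this work: the trained classifiers are replaced by hypothetical keyword rules, the example records are invented English snippets, and the target labels are assumed to be the codes of the FHIR R4 `AllergyIntolerance.category` element (`food`, `medication`, `environment`, `biologic`). All function names and cue lists are illustrative assumptions.

```python
import re
from typing import Callable, Iterable, List, Optional, Tuple

# Assumption: target labels are the FHIR R4 AllergyIntolerance.category codes.
FHIR_CATEGORIES = ("food", "medication", "environment", "biologic")

# Hypothetical cue lists; they stand in for trained classifiers.
ALLERGY_PREFIXES = ("allerg", "intoleran", "anaphyla")
CATEGORY_CUES = {
    "food": {"milk", "egg", "peanut", "citrus"},
    "medication": {"penicillin", "aspirin", "antibiotic"},
    "environment": {"pollen", "dust", "mold"},
    "biologic": {"vaccine", "serum"},
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mentions_allergy(text: str) -> bool:
    """Stage 1 stand-in: keep records with at least one allergy cue."""
    return any(tok.startswith(ALLERGY_PREFIXES) for tok in tokenize(text))


def categorize(text: str) -> Optional[str]:
    """Stage 2 stand-in: pick the category with the most cue hits."""
    tokens = tokenize(text)
    scores = {
        cat: sum(tok in cues for tok in tokens)
        for cat, cues in CATEGORY_CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def two_stage(
    records: Iterable[str],
    stage1: Callable[[str], bool] = mentions_allergy,
    stage2: Callable[[str], Optional[str]] = categorize,
) -> List[Tuple[str, Optional[str]]]:
    """Filter records with stage1, then label the kept ones with stage2.

    Records rejected by stage1 get the label None.
    """
    results = []
    for text in records:
        label = stage2(text) if stage1(text) else None
        results.append((text, label))
    return results


if __name__ == "__main__":
    sample = [
        "Allergic reaction to penicillin, rash.",
        "Patient complains of headache.",
        "Allergy to birch pollen and dust.",
    ]
    for text, label in two_stage(sample):
        print(label, "|", text)
```

Because `stage1` and `stage2` are plain callables, trained models can be substituted for the keyword rules without changing the control flow of `two_stage`.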