Development of cancer pathology data model and natural language processing based data conversion methodology

DC Field | Value | Language
dc.contributor.author | 신, 다혜 | -
dc.description.tableofcontents | I. Introduction
A. Study Background and Necessity
1. The Current Cancer Incidence in Korea
2. Importance of Cancer Pathology Report
3. Status of converting medical data into structured data based on Natural Language Processing method
4. The necessity of modeling and unifying cancer pathology data
B. Study Purpose
II. Materials and Method
A. Study Subject data
B. Analysis of system and vocabulary of the cancer pathology report
C. Development of Cancer Pathology Data Model
D. Development of natural language processing model for data extraction
1. Text preprocessing
2. Named Entity Recognition Model Structure
3. Model training
4. Realization of rule-based algorithm
E. Model evaluation method
F. Design of common data model conversion
III. Result
A. Establishment of Korean cancer pathology reports system and vocabulary dictionary
B. Comparison of Model performance for data extraction
1. Comparison of performance by parameter setting
2. Comparison of performance between CNN model and Hybrid (CNN+Rule-based) model
C. Development of data conversion methodology for the structured cancer pathology report
1. Establishment of clinical data model for cancer pathology report
2. Result evaluation through manual review of data
3. Establishment of common data model
IV. Discussion
A. Consideration on study method and result
B. Study limitations
V. Conclusion
References
dc.title | Development of cancer pathology data model and natural language processing based data conversion methodology | -
dc.title.alternative | 암 병리 데이터 모델 및 자연어처리기반 데이터 변환 방법론 개발 | -
dc.subject.keyword | Cancer Pathology report | -
dc.subject.keyword | Natural Language Processing | -
dc.subject.keyword | Named Entity Recognition | -
dc.subject.keyword | Convolutional neural network | -
dc.subject.keyword | Data Modeling | -
dc.subject.keyword | Distributed Research Network | -
dc.subject.keyword | Common Data Model | -
dc.contributor.department | Graduate School, Department of Medicine | -
dc.contributor.affiliatedAuthor | 신, 다혜 | -
Appears in Collections:
Theses > School of Medicine / Graduate School of Medicine > Doctor