TY - GEN
T1 - Automatic Extension of Medical Subject Headings (MeSH) Thesaurus to Emerging Research
AU - Gasper, William
AU - Ma, Jiahao
AU - Ghersi, Dario
AU - Gnimpieba, Etienne Z.
AU - Gadhamshetty, Venkataramana
AU - Chundi, Parvathi
N1 - Funding Information:
Acknowledgements: The authors gratefully acknowledge support from NSF EPSCoR RII T-2 FEC Grant-1920954.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - The proliferation of information technology infrastructure in recent decades has allowed for unprecedented ease of access to centrally aggregated scholarly literature and scientific knowledge. This massive aggregation of knowledge requires a carefully engineered information retrieval infrastructure, including formalized ontologies. A number of domains benefit from the use of hierarchical controlled vocabularies, which may be used to provide a rich set of descriptive terms for characterizing entities in a consistent manner. There are clear benefits to the creation and maintenance of these ontologies: search and retrieval are made easier, and analyses of the contained entities are enabled that would not otherwise be possible. However, there may be an opportunity to decrease the manual burden of ontology creation and maintenance with automated methods that leverage natural language processing and other computational techniques. This work presents an automated ontology creation methodology, adapted and expanded from prior work [1], that can produce a topic hierarchy from natural language and may be used to assist in the creation of a novel ontology or the expansion of existing ontologies. The effectiveness of the proposed method is studied using two examples: immunology, an established biomedical domain and a prominent topic in MeSH, and graphene, a topic from the 2D materials domain that has wide-ranging biomedical applications but only a sparse presence in MeSH.
UR - http://www.scopus.com/inward/record.url?scp=85125169394&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125169394&partnerID=8YFLogxK
U2 - 10.1109/BIBM52615.2021.9669520
DO - 10.1109/BIBM52615.2021.9669520
M3 - Conference contribution
AN - SCOPUS:85125169394
T3 - Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
SP - 3570
EP - 3577
BT - Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
A2 - Huang, Yufei
A2 - Kurgan, Lukasz
A2 - Luo, Feng
A2 - Hu, Xiaohua Tony
A2 - Chen, Yidong
A2 - Dougherty, Edward
A2 - Kloczkowski, Andrzej
A2 - Li, Yaohang
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Y2 - 9 December 2021 through 12 December 2021
ER -