Named entity disambiguation for archival collections: Metadata, Wikidata, and Linked data (2021)

Representing archival metadata as linked data can increase findability and usability of items, and linked data sources such as Wikidata can be used to further enrich existing collection metadata. However, a central challenge to this process is the named entity disambiguation or entity linking that is required to ensure that the named entities in a collection are being properly matched to Wikidata entities so that any additional metadata is applied correctly. This paper details our experimentation with one entity linking system called OpenTapioca, which was chosen for its use of Wikidata and its accessibility to librarians and archivists with minimal technical intervention. We discuss the results of using OpenTapioca for named entity disambiguation on the Belfer Cylinders Collection from the Special Collections Research Center at Syracuse University, highlighting the successes and limitations of the system and of using Wikidata as a knowledge base.
  • Polley, K., Tompkins, V., Honick, B., & Qin, J. (2021). Named entity disambiguation for archival collections: Metadata, Wikidata, and Linked data. In: Proceedings of the 84th ASIST Annual Meeting, October 30-November 2, 2021, Salt Lake City, UT.

Knowledge organization and representation under the AI lens (2020)

This paper compares the paradigmatic differences between knowledge organization (KO) in library and information science and knowledge representation (KR) in AI to show the convergence in KO and KR methods and applications. Knowledge organization systems (KOS) are developed to represent knowledge in publications and in natural and societal environments, and are used for information discovery and retrieval. Depending on its purpose, a KOS may be general and broad, such as the Library of Congress Subject Headings (LCSH), which is used to index books and other publications in library collections, or very specific, such as the National Center for Biotechnology Information (NCBI) Taxonomy, which serves as a nomenclature and classification for organisms (NCBI, 2018).

This paper first briefly reviews the historical background of KR in both KO and AI over the last five decades, during which schematic representations of data and information became the main driving force for modernizing knowledge organization and representation. The review is by no means exhaustive; rather, it presents evidence that demonstrates the fundamental ideas of KR as background. While the historical background of KO and KR allows us to compare KR paradigms between traditional KO and AI, it is also important to understand where paradigmatic similarities exist and how the two parallel fields are converging. Following the review and analysis, the paper discusses paradigmatic similarities and the convergence of KR in KO and AI, and uses case studies to demonstrate the convergence trend.

  • Qin, J. (2020). Knowledge organization and representation under the AI lens. Journal of Data and Information Science. 5(1): 3–17. DOI: 10.2478/jdis-2020-0002  

A relation typology in knowledge organization systems: Case studies in the research data management domain (2018)

Relations between concepts (and/or entities, events, and other things) vary depending on the criteria by which relations are defined or viewed. In the domain of research data management, different types of research generate different types of data, and terminologies vary between practitioners and basic science researchers even within the same disciplinary domain. Interactions between datasets, between datasets and documentation, and between datasets and computing code can result in different types of relations.

This paper employs a framework of analysis to study concept relation types in the research data management domain. Using two cases, the GenBank annotation records and the data and artifact collection from a gravitational wave search, this paper demonstrates the types of relations existing in and between datasets, publications, computing codes, and workflows. The analysis and generalization of these relations reference the research in AI’s knowledge representation and knowledge organization systems (KOS), including both ad hoc subject categories and formal KOS, because in the next AI era, relations, as one of the key components of AI applications, will be required to function not only as part of KOS for indexing data and publications but, more importantly, as codifiable knowledge for machine consumption.

  • Qin, J. (2018). A relation typology in knowledge organization systems: Case studies in the research data management domain. In: Proceedings of the Fifteenth International ISKO Conference, Porto, Portugal, July 9-11, 2018.   
  • Qin, J., Yu, B., & Wang, L. (2018). Knowledge node and relation detection. Networked Knowledge Organization Systems (NKOS) Workshop at the Dublin Core International Conference DC-2018, Porto, Portugal, September 13, 2018. http://ceur-ws.org/Vol-2200/paper3.pdf
  • Qin, J., & Zou, N. (2017). Structures and Relations of Knowledge Nodes: Exploring a Knowledge Network of Disease from Precision Medicine Research Publications. In iConference 2017 Proceedings (pp. 56–65).   

PROJECT CONTRIBUTORS:

Jian Qin, PI
Liya Wang, Graduate RA
Ning Zou, Graduate RA
Bei Yu, Professor