Please use this identifier to cite or link to this item: https://ir.swu.ac.th/jspui/handle/123456789/17429
Title: Towards Smart Data Management of Scientific Literature: Addressing Polysemy and Aberrant Decoding in Author Names
Authors: Khalid S.
Hassan S.-U.
Sitthi A.
Issue Date: 2021
Abstract: In digital libraries, author names become ambiguous when multiple authors share the same name (polysemes) or when the same author appears under different name variations (synonyms). We attempt to disambiguate the authors of scientific publications based on the attributes that define the similarity among publications belonging to a unique author. We train classifiers with two supervised machine-learning approaches, namely Support Vector Machine and Naïve Bayes, using features commonly available in bibliography databases, such as author affiliation, subject area, journal title, city, references, and keywords. We deliberately exclude author contact details, such as phone numbers or email addresses, because they are usually not available in bibliography databases. Furthermore, we test our model on an extremely ambiguous dataset, which consists of Chinese authors with identical names, affiliated with the same institute, and even having the same research area. The dataset, downloaded from Scopus, contains 5180 publication records belonging to nine different authors who share the same name, "Zhang Wei". Our model shows a very encouraging accuracy of 95.68% using the Naïve Bayes classifier, and the Support Vector Machine with the polynomial kernel is about 3% better when deployed on our dataset. Overall, the implications of this research are not limited to improving data management for scholarly search systems; other databases that require name disambiguation may also benefit from the proposed technique. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
URI: https://ir.swu.ac.th/jspui/handle/123456789/17429
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85116411238&doi=10.1007%2f978-3-030-84311-3_40&partnerID=40&md5=7fe1025d2a742aa5c5805317a38f0282
ISSN: 2213-8684
Appears in Collections: Scopus 1983-2021

Items in SWU repository are protected by copyright, with all rights reserved, unless otherwise indicated.