Assessing the level of merging errors for coauthorship data: a Bayesian model
Abstract
Robust analysis of coauthorship networks depends on high-quality data. However, ground-truth data are usually unavailable. Empirical data suffer from several types of errors; a typical one is the merging error, which identifies different persons as one entity. Specific features of authors have been used to reduce these errors. We propose a Bayesian model to calculate the information carried by any given features of authors. Based on these features, the model can be used to calculate the rate of merging errors for entities. The model therefore helps to find informative features for detecting heavily compromised entities, and it could contribute to improving the quality of empirical data.