Can we track the geography of surnames based on bibliographic data?

Abstract

In this paper we explore the possibility of using bibliographic databases to track the geographic origin of surnames. Surnames are used as a proxy for the ethnic, genetic or geographic origin of individuals in fields such as genetics and demography; however, they could also be used for bibliometric purposes such as the analysis of scientific migration flows. Here we present two methodologies for determining the most probable country to which a surname could be assigned. The first methodology assigns each surname to a country based on the country most commonly associated with it and the Kullback-Leibler divergence measure. The second uses the Gini index to evaluate the assignment of surnames to countries. We test both methodologies with control groups and conclude that, although their validity needs further analysis, these methodologies already show promising results.
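The abstract does not say how the two measures are computed, so the following is only a minimal sketch using their textbook definitions, not the authors' procedure. It assumes that each surname comes with a count of authors per country and that a baseline distribution of all authors per country is available. The function names and the toy counts are hypothetical and chosen for illustration only.

```python
import math


def country_shares(counts):
    """Turn raw author counts per country into a probability distribution."""
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in nats.

    Assumes q[c] > 0 wherever p[c] > 0; terms with p[c] == 0 contribute 0.
    """
    return sum(pc * math.log(pc / q[c]) for c, pc in p.items() if pc > 0)


def gini_index(values):
    """Gini index of non-negative values: 0 is perfectly even, (n-1)/n is maximal."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Invented toy data: authors per country for one surname and for the whole database.
surname_counts = {"ES": 90, "US": 8, "FR": 2}
baseline_counts = {"ES": 1000, "US": 5000, "FR": 2000}

most_common = max(surname_counts, key=surname_counts.get)
divergence = kl_divergence(country_shares(surname_counts),
                           country_shares(baseline_counts))
concentration = gini_index(surname_counts.values())
```

Under these assumptions, a surname whose country distribution diverges strongly from the baseline, or whose counts are highly concentrated, would be a stronger candidate for assignment to its most common country. Where any threshold should sit is not something the abstract states.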
