Gap-weighted subsequences for automatic cognate identification and phylogenetic inference
Abstract
In this paper, we describe the problem of cognate identification and its relation to phylogenetic inference. We introduce subsequence based features for discriminating cognates from non-cognates. We show that subsequence based features perform better than the state-of-the-art string similarity measures for the purpose of cognate identification. We use the cognate judgments for the purpose of phylogenetic inference and observe that these classifiers infer a tree which is close to the gold standard tree. The contribution of this paper is the use of subsequence features for cognate identification and to employ the cognate judgments for phylogenetic inference.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.