Recognizing shallow linguistic patterns, such as basic syntactic relationships between words, is a common task in applied natural language and text processing. The common practice for approaching…
Computation and Language papers
Shlomo Argamon, Ido Dagan, Yuval Krymolowski
This paper presents the results of using Roget's International Thesaurus as the taxonomy in a semantic similarity measurement task. Four similarity metrics were taken from the literature and…
Michael Mc Hale
The purpose of this paper is to explore some semantic problems related to the use of linguistic ontologies in information systems, and to suggest some organizing principles aimed to solve such…
Nicola Guarino
The aim of this paper is to define a dependency grammar framework which is both linguistically motivated and computationally parsable. See the demo at http://www.conexor.fi/analysers.html#testing
Timo Jarvinen, Pasi Tapanainen
The Earley algorithm is a widely used parsing method in natural language processing applications. We introduce a variant of Earley parsing that is based on a "delayed" recognition of…
Mark-Jan Nederhof, Giorgio Satta
In this paper, we provide an account of how to generate sentences with coordination constructions from clause-sized semantic representations. An algorithm is developed to generate sentences with…
James Shaw
Finding simple, non-recursive, base noun phrases is an important subtask for many natural language processing applications. While previous empirical methods for base NP identification have been…
Claire Cardie, David Pierce
A radio speech corpus of 9 min has been prosodically marked by a phonetician expert and by non-expert listeners. This corpus is large enough to train and test an automatic boundary spotting system,…
V. Pagel, N. Carbonell, Y. Laprie, J. Vaissiere
Multiple default inheritance formalisms for lexicons have attracted much interest in recent years. I propose a new efficient method to access such lexicons. After showing two basic strategies for…
Sven Hartrumpf
This paper proposes decoupling the dependency tree from word order, such that surface ordering is not determined by traversing the dependency tree. We develop the notion of a word order domain…
Norbert Broeker
It has been argued that, when learning a first language, babies use a series of small clues to aid recognition and comprehension, and that one of these clues is word length. In this paper we present…
Simon Cozens
This paper presents a multidimensional Dependency Grammar (DG), which decouples the dependency tree from word order, such that surface ordering is not determined by traversing the dependency tree. We…
Norbert Broeker
This paper presents trainable methods for generating letter to sound rules from a given lexicon for use in pronouncing out-of-vocabulary words and as a method for lexicon compression. As the…
V. Pagel, K. Lenzo, A. Black
In relatively free word order languages, grammatical functions are intricately related to case marking. Assuming an ordered representation of the predicate-argument structure, this work proposes a…
Cem Bozsahin
Determining the attachments of prepositions and subordinate conjunctions is a key problem in parsing natural language. This paper presents a trainable approach to making these attachments through…
Alexander S. Yeh, Marc B. Vilain
Isometric Lineation in English Texts: An Empirical and Mathematical Examination of its Character and Consequences
cmp-lg
In this paper we build on earlier observations and theory regarding word length frequency and sequential distribution to develop a mathematical characterization of some of the language features…
Hideaki Aoyama, John Constable
We present work in progress on abstracting dialog managers from their domain in order to implement a dialog manager development tool which takes (among other data) a domain description as input and…
Bernd Ludwig, Guenther Goerz, Heinrich Niemann
Word Length Frequency and Distribution in English: Observations, Theory, and Implications for the Construction of Verse Lines
cmp-lg
Recent observations in the theory of verse and empirical metrics have suggested that constructing a verse line involves a pattern-matching search through a source text, and that the number of found…
Hideaki Aoyama, John Constable
Parallel corpora are a valuable resource for machine translation, but at present their availability and utility is limited by genre- and domain-specificity, licensing restrictions, and the basic…
Philip Resnik
The classical, vector space model for text retrieval is shown to give better results (up to 29% better in our experiments) if WordNet synsets are chosen as the indexing space, instead of word forms.…
Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan Cigarran
We present an empirical study of the applicability of Probabilistic Lexicalized Tree Insertion Grammars (PLTIG), a lexicalized counterpart to Probabilistic Context-Free Grammars (PCFG), to problems…
Rebecca Hwa
In this paper we present early work on an animated talking head commentary system called Byrne (David Byrne is the lead singer of the Talking Heads). The goal of this project is to…
Kim Binsted
In this paper we examine how the differences in modelling between different data driven systems performing the same NLP task can be exploited to yield a higher accuracy than the best individual…
Hans van Halteren, Jakub Zavrel, Walter Daelemans
This paper explores commonalities and differences between , a variant of Dependency Grammar, and Lexical-Functional Grammar. \ is based on traditional linguistic insights, but on modern…
Norbert Broeker
We present several unsupervised statistical models for the prepositional phrase attachment task that approach the accuracy of the best supervised methods for this task. Our unsupervised approach uses…
Adwait Ratnaparkhi