Efficient Calculation of Bigram Frequencies in a Corpus of Short Texts
Abstract
We show that a popular, efficient method for calculating bigram frequencies is unsuitable for corpora of short texts, and we offer a simple alternative. Our method has the same computational complexity as the existing method and yields an exact count rather than an approximation.
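The abstract does not spell out either method, so the following is only a minimal sketch of what an exact bigram count over a corpus of short texts could look like, not the paper's algorithm. It assumes whitespace tokenization and that a bigram never spans two texts; the name `bigram_counts` and the sample corpus are hypothetical.

```python
from collections import Counter


def bigram_counts(texts):
    """Count adjacent token pairs within each text.

    Assumptions (not from the paper): tokens are whitespace-separated,
    and a pair is never formed across the boundary between two texts.
    """
    counts = Counter()
    for text in texts:
        tokens = text.split()
        # zip pairs each token with its successor; a one-token text yields no pairs.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical corpus of three short texts.
corpus = ["the cat sat", "the cat ran", "sat"]
counts = bigram_counts(corpus)
```

Each token is visited a constant number of times, so the running time of this sketch is linear in the total number of tokens in the corpus.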