Persian topic detection based on Human Word association and graph embedding

Abstract

In this paper, we propose a framework to detect topics in social media based on Human Word Association. Identifying topics discussed in these media has become a critical and significant challenge. Most of the work done in this area is in English, but much has been done in the Persian language, especially microblogs written in Persian. Also, the existing works focused more on exploring frequent patterns or semantic relationships and ignored the structural methods of language. In this paper, a topic detection framework using HWA, a method for Human Word Association, is proposed. This method uses the concept of imitation of mental ability for word association. This method also calculates the Associative Gravity Force that shows how words are related. Using this parameter, a graph can be generated. The topics can be extracted by embedding this graph and using clustering methods. This approach has been applied to a Persian language dataset collected from Telegram. Several experimental studies have been performed to evaluate the proposed framework's performance. Experimental results show that this approach works better than other topic detection methods.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…