TMRugPull: A Temporally Sound Multimodal Dataset for Early RugPull Detection
Abstract
Rug pull is a critical attack in the world of blockchain technology. Despite this, the absence of sufficient time-bound and well-structured datasets is considered one of the significant issues faced while identifying early detection. Existing datasets do not provide the solution to this challenge because of temporal leakage or use of post-collapse indicators, insufficient modality coverage, and confusing or partial labels, especially with regards to DeFi tokens. To solve these problems, we present a highly curated and strictly time-bound dataset called TM-RugPull containing 1,000 projects, which include DeFi, meme, NFT, and celebrity token projects. We achieve temporal validation of the dataset by acquiring all three modalities, namely on-chain behavior, smart contract metadata, and OSINT signals. The project labels are provided based on manual investigation for the entire project's lifespan and its collapse. Also, we make our dataset publicly available together with its codebase for data acquisition and feature extraction.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.