WWW Spiders: an introduction
Abstract
In recent years, the study of complex networks has received a lot of attention. Real systems have gained importance in scientific publications, despite an important drawback: the difficulty of retrieving and managing such large quantities of information. This paper is an introduction to the construction of spiders and scrapers: specifically, how to program and safely deploy this kind of software application. The aim is to show how software can be prepared to automatically surf the net and retrieve information for the user with high efficiency and safety.