DAME: A Distributed Data Mining & Exploration Framework within the Virtual Observatory
Abstract
Many scientific areas now share the same broad requirement: the ability to handle massive and distributed datasets while, where possible, integrating with services and applications. To bridge the growing gap between the ever-increasing generation of data and our understanding of it, researchers must know how to access, retrieve, analyze, mine and integrate data from disparate sources. A fundamental requirement for any new-generation data mining tool or package that aims to become a service for the community is that it can be used within complex workflows, which each user can fine-tune to match the specific demands of their scientific goal. These workflows often need to access different resources (data, providers, computing facilities and packages) and require strict interoperability on (at least) the client side. The DAME (DAta Mining & Exploration) project arises from these requirements, providing a distributed web-based data mining infrastructure specialized in the exploration of Massive Data Sets with Soft Computing methods. Originally designed to deal with astrophysical use cases, where the first scientific applications have demonstrated its effectiveness, the DAME Suite is a multi-disciplinary, platform-independent tool fully compliant with modern KDD (Knowledge Discovery in Databases) requirements and Information & Communication Technology trends.