A Tool for Automatically Cataloguing and Selecting Pre-Trained Models and Datasets for Software Engineering

Abstract

The rapid growth of machine learning assets has made it increasingly difficult for software engineers to identify models and datasets that match their specific needs. Browsing large registries, such as Hugging Face, is time-consuming, error-prone, and rarely tailored to Software Engineering (SE) tasks. We present MLAssetSelection, a web application that automatically extracts SE assets and supports four key functionalities: (i) a configurable leaderboard for ranking models across multiple benchmarks and metrics; (ii) requirements-based selection of models and datasets; (iii) real-time automated updates through scheduled jobs that keep asset information current; and (iv) user-centric features including login, personalized asset lists, and configurable alert notifications. A demonstration video is available at https://youtu.be/t6CJ6P9asV4.
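The abstract does not say how the configurable leaderboard combines results, so the following is only a minimal sketch of one plausible approach: min-max normalise each selected (benchmark, metric) column, then rank by a user-weighted mean. The names `ModelEntry` and `rank_models`, the weighting scheme, the assumption that higher scores are better, the example benchmark names, and all scores are hypothetical illustrations, not taken from MLAssetSelection.

```python
from dataclasses import dataclass, field


@dataclass
class ModelEntry:
    """Hypothetical record for one catalogued model."""
    name: str
    # Maps (benchmark, metric) to a score; assumed higher-is-better.
    scores: dict = field(default_factory=dict)


def rank_models(models, weights):
    """Rank models by a weighted mean of min-max normalised scores.

    `weights` maps (benchmark, metric) to a non-negative weight chosen by
    the user. Models lacking any selected column are left out of the
    ranking rather than being scored on partial data (an assumption).
    """
    columns = [col for col, w in weights.items() if w > 0]
    if not columns:
        raise ValueError("select at least one column with a positive weight")

    eligible = [m for m in models if all(col in m.scores for col in columns)]
    if not eligible:
        return []

    # Per-column bounds, computed over the eligible models only.
    bounds = {}
    for col in columns:
        values = [m.scores[col] for m in eligible]
        bounds[col] = (min(values), max(values))

    total_weight = sum(weights[col] for col in columns)
    ranked = []
    for m in eligible:
        aggregate = 0.0
        for col in columns:
            lo, hi = bounds[col]
            # If every model ties on a column, treat them all as top-scoring.
            norm = 1.0 if hi == lo else (m.scores[col] - lo) / (hi - lo)
            aggregate += weights[col] * norm
        ranked.append((m.name, aggregate / total_weight))

    # Best aggregate first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


if __name__ == "__main__":
    # Made-up scores for illustration only.
    catalogue = [
        ModelEntry("model-a", {("HumanEval", "pass@1"): 0.30, ("MBPP", "pass@1"): 0.50}),
        ModelEntry("model-b", {("HumanEval", "pass@1"): 0.50, ("MBPP", "pass@1"): 0.40}),
    ]
    user_weights = {("HumanEval", "pass@1"): 2.0, ("MBPP", "pass@1"): 1.0}
    for name, score in rank_models(catalogue, user_weights):
        print(f"{name}: {score:.3f}")
```

Normalising per column keeps a metric with a wide numeric range from dominating the aggregate, and changing the weights reorders the leaderboard without touching the stored scores.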
