Piecewise Linear Approximation in Learned Index Structures: Theoretical and Empirical Analysis

Abstract

A growing trend in the database and system communities is to augment conventional index structures, such as B+-trees, with machine learning (ML) models. Among these, error-bounded Piecewise Linear Approximation (ε-PLA) has emerged as a popular choice due to its simplicity and effectiveness. Despite its central role in many learned indexes, the design and analysis of ε-PLA fitting algorithms remain underexplored. In this paper, we revisit ε-PLA from both theoretical and empirical perspectives, with a focus on its application in learned index structures. We first establish a fundamentally improved lower bound of ( · ε2) on the expected segment coverage for existing ε-PLA fitting algorithms, where is a data-dependent constant. We then present a comprehensive benchmark of state-of-the-art ε-PLA algorithms when used in different learned data structures. Our results highlight key trade-offs among model accuracy, model size, and query performance, providing actionable guidelines for the principled design of future learned data structures.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…