Synthesizing Scoring Functions for Rankings Using Symbolic Gradient Descent
Abstract
Given a relation and a ranking of its tuples, but no information about the ranking function, we are interested in synthesizing simple scoring functions that reproduce the ranking. Our system RankHow identifies linear scoring functions that minimize position-based error, while supporting flexible constraints on their weights. It is based on a new formulation as a mixed-integer linear program (MILP). While MILP is NP-hard in general, we show that RankHow is orders of magnitude faster than a tree-based algorithm that guarantees polynomial time complexity (PTIME) in the number of input tuples by reducing the MILP problem to many linear programs (LPs). We hypothesize that this is caused by 2 properties: First, the PTIME algorithm is equivalent to a naive evaluation strategy for the MILP program. Second, MILP solvers rely on advanced heuristics to reason holistically about the entire program, while the PTIME algorithm solves many sub-problems in isolation. To further improve RankHow's scalability, we propose a novel approximation technique called symbolic gradient descent (Sym-GD). It exploits problem structure to more quickly find local minima of the error function. Experiments demonstrate that RankHow can solve realistic problems, finding more accurate linear scoring functions than the state of the art.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.