On the statistical evaluation of algorithmic's computational experimentation with infeasible solutions
Abstract
The experimental evaluation of algorithms produces large datasets that generally do not follow a normal distribution or are not homoscedastic. In addition, some entries may be missing because an algorithm was unable to find a feasible solution before the time limit was reached. These characteristics restrict the statistical evaluation of computational experiments. This work proposes a bi-objective lexicographical ranking scheme to evaluate datasets with such characteristics. The output ranking can be used as input to any desired statistical test. We used the proposed ranking scheme to assess the results obtained by the Iterative Rounding heuristic (IR). A Friedman test and a subsequent post-hoc test carried out on the ranked data demonstrated that IR performed significantly better than the Feasibility Pump heuristic when solving 152 benchmark problems of Nonconvex Mixed-Integer Nonlinear Problems. However, it also showed that the RECIPE heuristic was significantly better than IR when solving the same benchmark problems.
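The ranking idea can be illustrated with a minimal sketch. The abstract does not spell out the two criteria, so the code below assumes that feasibility is compared first and the objective value (minimization) second, that `None` marks a run with no feasible solution within the time limit, and that ties share an average rank. The function name `lexicographic_ranks` and the sample data are hypothetical, not taken from the paper.

```python
from typing import List, Optional, Sequence


def lexicographic_ranks(values: Sequence[Optional[float]]) -> List[float]:
    """Rank the results of several algorithms on one instance (1 = best).

    Assumptions (not from the paper): None means no feasible solution was
    found, smaller objective values are better, ties get the average rank.
    """
    # Lexicographic key: feasibility first (0 sorts before 1), value second.
    # All infeasible runs get the same key, so they tie with each other.
    keys = [(0, v) if v is not None else (1, 0.0) for v in values]
    order = sorted(range(len(keys)), key=lambda i: keys[i])

    ranks = [0.0] * len(keys)
    pos = 0
    while pos < len(order):
        # Extend `end` over the block of entries tied with order[pos].
        end = pos
        while end + 1 < len(order) and keys[order[end + 1]] == keys[order[pos]]:
            end += 1
        # Average of the 1-based positions pos+1 .. end+1.
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


# Hypothetical results: rows are instances, columns are three algorithms.
results = [
    [10.0, 12.0, None],  # third algorithm found no feasible solution
    [None, 5.0, None],   # only the second algorithm was feasible
    [3.0, 3.0, 4.0],     # first two algorithms tie on the objective
]
rank_table = [lexicographic_ranks(row) for row in results]
```

Each column of `rank_table` then holds one algorithm's ranks across instances, which is the kind of input a rank-based test such as the Friedman test expects (for example, `scipy.stats.friedmanchisquare` takes one sequence per algorithm).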