A statistical technique for cleaning option price data

Abstract

Recorded option pricing datasets are not always freely available. Additionally, these datasets often contain numerous prices that are higher or lower than can reasonably be expected. Several explanations for these unexpected observations are possible, including human error in recording the details of the option in question. For analyses performed on these datasets to be reliable, such options must be identified and removed. In this paper, we list three distinct problems often found in recorded option price datasets, together with means of addressing each. The methods are justified using sound statistical reasoning and remove option prices that violate the standard no-arbitrage assumption. An attractive aspect of the proposed technique is that it relies on no assumptions tied to a particular option pricing model. Although the discussion is restricted to European options, the procedure is easily modified for use with exotic options as well. As a final contribution, the paper contains a link to six option pricing datasets that have already been cleaned using the proposed methods and can be freely used by researchers.
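
To make the idea of removing prices that violate no arbitrage concrete, the following is a minimal sketch of a model-free filter for European call quotes. It is not the procedure proposed in the paper, whose details are not given in the abstract. It only applies the textbook static bounds max(S - K e^{-rT}, 0) <= C <= S, which assume a non-dividend-paying underlying and a constant, continuously compounded interest rate. The names `CallQuote`, `call_bounds`, `violates_no_arbitrage`, and `clean` are hypothetical and chosen for illustration.

```python
import math
from typing import List, NamedTuple, Tuple


class CallQuote(NamedTuple):
    """One recorded European call quote (hypothetical record layout)."""
    price: float     # recorded option price C
    spot: float      # underlying price S
    strike: float    # strike price K
    rate: float      # continuously compounded risk-free rate r (assumed constant)
    maturity: float  # time to maturity T, in years


def call_bounds(spot: float, strike: float, rate: float, maturity: float) -> Tuple[float, float]:
    """Textbook no-arbitrage bounds for a European call, assuming no dividends."""
    lower = max(spot - strike * math.exp(-rate * maturity), 0.0)
    upper = spot
    return lower, upper


def violates_no_arbitrage(quote: CallQuote, tol: float = 1e-8) -> bool:
    """Return True if the recorded price lies outside the bounds (up to tol)."""
    lower, upper = call_bounds(quote.spot, quote.strike, quote.rate, quote.maturity)
    return quote.price < lower - tol or quote.price > upper + tol


def clean(quotes: List[CallQuote]) -> List[CallQuote]:
    """Keep only the quotes whose prices respect the bounds."""
    return [q for q in quotes if not violates_no_arbitrage(q)]


if __name__ == "__main__":
    # Illustrative, made-up quotes; not taken from any recorded dataset.
    sample = [
        CallQuote(price=10.0, spot=100.0, strike=95.0, rate=0.05, maturity=0.5),
        CallQuote(price=2.0, spot=100.0, strike=95.0, rate=0.05, maturity=0.5),
        CallQuote(price=120.0, spot=100.0, strike=95.0, rate=0.05, maturity=0.5),
    ]
    for q in clean(sample):
        print(q)
```

A check of this kind uses no option pricing model: the bounds follow from the absence of arbitrage alone, which is the property the abstract highlights. The tolerance `tol` is an illustrative guard against floating-point rounding and is likewise an assumption.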
