Noisy data clusters are hollow
Abstract
A new vision in multidimensional statistics is proposed, with impact on several areas of application. In these applications, a set of noisy measurements characterizing the repeatable response of a process is known as a realization and can be seen as a single point in R^N. The projections of this point on the N axes correspond to the N measurements. The contemporary vision of a diffuse cloud of realizations distributed in R^N is replaced by a cloud in the shape of a shell surrounding a topological manifold. This manifold corresponds to the process's stabilized-response domain observed without the measurement noise. The measurement noise, which accumulates over several dimensions, distances each realization from the manifold. The probability density function (PDF) of the realization-to-manifold distance creates the shell. By the central limit theorem, as the number of dimensions increases, this PDF tends toward the normal distribution N(μ, σ²), where μ sets the location of the shell center and σ sets the shell thickness. In this vision, the likelihood of a realization is a function of the realization-to-shell distance rather than the realization-to-manifold distance. The demonstration begins with the work of Claude Shannon, continues with the introduction of the shell manifold, and ends with practical applications to equipment monitoring.
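The hollow-shell behavior can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions that are not taken from the paper: the manifold is reduced to a single point (the origin), the noise is independent Gaussian with the same standard deviation on every axis, and the function name `shell_statistics` and its parameters are hypothetical. Under these assumptions the realization-to-manifold distance follows a chi distribution, whose mean approaches noise_std·√N and whose standard deviation approaches noise_std/√2 for large N, so the shell radius grows with N while its thickness stays roughly constant.

```python
import numpy as np


def shell_statistics(n_dims, n_realizations=2000, noise_std=1.0, seed=0):
    """Return (mean, std) of realization-to-manifold distances.

    Assumptions (illustrative only): the manifold is a single point at the
    origin and the noise is i.i.d. Gaussian on each of the n_dims axes.
    """
    rng = np.random.default_rng(seed)
    manifold_point = np.zeros(n_dims)
    # Each row is one realization: the noise-free response plus noise.
    noise = rng.normal(0.0, noise_std, size=(n_realizations, n_dims))
    realizations = manifold_point + noise
    # Euclidean distance of each realization from the manifold point.
    distances = np.linalg.norm(realizations - manifold_point, axis=1)
    return distances.mean(), distances.std()


for n_dims in (2, 10, 100, 1000):
    mu, sigma = shell_statistics(n_dims)
    print(f"N={n_dims:5d}  mu={mu:7.3f}  sigma={sigma:6.3f}  "
          f"sigma/mu={sigma / mu:6.3f}")
```

As N grows, the ratio sigma/mu shrinks: the realizations concentrate in a thin shell far from the manifold point, and the region close to the point itself is essentially empty.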