Histogram Filter with Adjustment of the Smoothing Parameter Based on the Minimization of the Chi-Square Test
Building adequate models of objects under statistical study, when a measurement experiment or data acquisition may be costly, requires fast and “correct” identification (recognition) of the probability distribution density (PDD) from simple histogram estimates. The requirement of fast identification can be treated as equivalent to working with a small, limited amount of data. This article proposes a theoretically substantiated method for constructing a histogram filter (HF): a linear combination of the data counts in adjacent intervals with constant weight coefficients, all of which can be expressed through a single coefficient k, the smoothing parameter. The smoothing coefficient is estimated by minimizing a modified chi-square test. A theorem given in the article establishes that, after the HF is applied, the mathematical expectation of the chi-square test decreases by a factor of k compared with the standard expectation of the criterion with a unit inclusion function. The smoothing coefficient depends in a complex way on the number of data points, on the parameters of the identified PDD (the Fisher information coefficients of the first and second order), and on the parameters of the HF (the number and width of the grouping intervals). The article shows that the relationship between the number of data points and the number and width of the grouping intervals is nonlinear and can be solved only numerically. Simulation examples of the HF illustrate its effectiveness in identifying the PDD and the expediency of applying it in scientific and applied statistical research.
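The general idea of a histogram filter tuned by a chi-square criterion can be illustrated with a minimal Python sketch. It is not the article's construction. The three-point weights (a, 1 - 2a, a), the reflection at the edge bins, the grid search, the plain Pearson chi-square statistic, and the standard normal reference density are all assumptions made for illustration. The abstract does not state how the weights depend on k or what form the modified chi-square test takes, so the hypothetical parameter `a` merely stands in for the smoothing parameter.

```python
import math


def smooth_histogram(counts, a):
    """Three-point filter: each bin keeps (1 - 2a) of its count and
    receives a share `a` of each neighbour's count (assumed weighting)."""
    if not 0.0 <= a <= 0.5:
        raise ValueError("a must lie in [0, 0.5]")
    n = len(counts)
    smoothed = []
    for i in range(n):
        # Reflect at the edges so that the total count is preserved.
        left = counts[i - 1] if i > 0 else counts[i]
        right = counts[i + 1] if i < n - 1 else counts[i]
        smoothed.append(a * left + (1 - 2 * a) * counts[i] + a * right)
    return smoothed


def chi_square(observed, expected):
    """Plain Pearson statistic; the article's modified test may differ."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def normal_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


def expected_counts(edges, total, cdf):
    """Expected bin counts under a hypothesised distribution."""
    return [total * (cdf(hi) - cdf(lo)) for lo, hi in zip(edges, edges[1:])]


def best_weight(counts, expected, steps=50):
    """Grid search for the weight `a` in [0, 0.5] minimising chi-square."""
    candidates = [0.5 * j / steps for j in range(steps + 1)]
    return min(
        candidates,
        key=lambda a: chi_square(smooth_histogram(counts, a), expected),
    )


# Illustrative data only: a small sample grouped into 8 unit-width bins.
counts = [1, 6, 4, 12, 9, 5, 1, 2]
edges = [-4 + i for i in range(9)]
expected = expected_counts(edges, sum(counts), normal_cdf)

a = best_weight(counts, expected)
print("selected weight:", a)
print("chi-square before:", chi_square(counts, expected))
print("chi-square after: ", chi_square(smooth_histogram(counts, a), expected))
```

Because a = 0 reproduces the unsmoothed histogram and is among the candidates, the selected weight can never give a larger statistic than the raw counts. The grid search here is only a stand-in for the numerical solution mentioned in the abstract.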