Chi-p distribution: characterization of the goodness of the fitting using Lp norms
Journal of Statistical Distributions and Applications volume 1, Article number: 4 (2014)
This paper (1) derives the Chi-p distribution, i.e., the analog of the Chi-square distribution for datasets that follow the General Gaussian distribution of shape p, and (2) develops the statistical test for characterizing the goodness of fitting with Lp norms. It is shown that the statistical test has a double role when the fitting method is induced by the Lp norms. For a given shape parameter p, the test is rated based on the estimated p-value, and a convenient characterization of the fitting rate is developed. For an unknown shape parameter, if the fitting is expected to be good, those Lp norms that correspond to unlikely p-values are rejected, with a preference for the norm that maximizes the p-value. The statistical test methodology is followed by an illuminating application.
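1. Introduction
The fitting of a given dataset $\{x_i, f_i \pm \sigma_{f_i}\}$, $i = 1, \ldots, N$, to the values of a statistical model V(x; α) (McCullagh 2002; Adèr 2008) involves finding the optimal parameter value α = α* that minimizes the total square deviations (TSD) between model and data,
$$TSD(\alpha)^2 = \sum_{i=1}^{N} \sigma_{f_i}^{-2}\,\left[f_i - V(x_i;\alpha)\right]^2, \tag{1}$$
where the inverse of the variance of each data measurement weights the summation. The deviations may also be defined using the total absolute deviations (TAD),
$$TAD(\alpha) = \sum_{i=1}^{N} \sigma_{f_i}^{-1}\,\left|f_i - V(x_i;\alpha)\right|. \tag{2}$$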
A class of generalized fitting methods has been considered by Livadiotis (2007), using the metric induced by the p-norms Lp, p ≥ 1, where Lp denotes a complete normed vector space with finite Lebesgue integral. The total deviations (TD) are now defined by
$$TD(\alpha)^p = \sum_{i=1}^{N} \sigma_{f_i}^{-p}\,\left|f_i - V(x_i;\alpha)\right|^p. \tag{3}$$
The least squares method, based on the Euclidean norm (p = 2), and the least absolute deviations method, based on the “Taxicab” norm (p = 1), are special cases of the general fitting methods based on the Lp norms (see Burden and Faires 1993; for more applications of the fitting methods based on Lp, see Sengupta 1984; Livadiotis and Moussas 2007; Livadiotis 2008, 2012; for fitting methods based on other effect sizes, e.g., correlation, see Livadiotis and McComas 2013a).
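As a minimal sketch of how the total deviations can be evaluated for a general p, the function below sums the weighted terms |f_i - V(x_i; α)|^p, assuming each term is weighted by σ_i^(-p) so that p = 2 recovers the inverse-variance weighting of the least squares case. The function name, the constant model, and the data values are hypothetical and serve only as an illustration.

```python
def total_deviations_p(x, f, sigma, model, alpha, p):
    """Return sum_i |f_i - V(x_i; alpha)|^p / sigma_i^p (assumed weighting)."""
    if p < 1:
        raise ValueError("an Lp norm requires p >= 1")
    return sum(abs(fi - model(xi, alpha)) ** p / si ** p
               for xi, fi, si in zip(x, f, sigma))


# Hypothetical data and a constant model V(x; alpha) = alpha
x = [1.0, 2.0, 3.0, 4.0]
f = [1.1, 0.9, 1.3, 0.7]
sigma = [0.1, 0.1, 0.2, 0.2]


def constant_model(xi, alpha):
    return alpha


# p = 2: total square deviations; p = 1: total absolute deviations
tsd2 = total_deviations_p(x, f, sigma, constant_model, 1.0, 2)
tad = total_deviations_p(x, f, sigma, constant_model, 1.0, 1)
```

Minimizing this quantity over α for a chosen p gives the corresponding Lp fit; the two calls above only evaluate it at the trial value α = 1.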
The goodness of the least squares fitting is typically measured using the estimated Chi-square value, that is, the least squares value χ2(α*). This value is then compared with the Chi-square distribution to examine whether it is frequent or not (see next sections). However, this test applies only to datasets that follow the normal distribution. There is no similar test for datasets that follow the General Gaussian distribution of shape p (see Section 2 and Appendix A). Livadiotis (2012) showed the connection between the fitting with Lp norms, as in Eq. (3), and datasets that follow the General Gaussian distributions.
The purpose of this paper is to (1) construct the formulation of the Chi-p distribution, the analog of Chi-square distribution but for datasets that follow the General Gaussian distribution of shape p, and (2) develop the statistical test for characterizing the goodness of the fitting with Lp norms, which corresponds to datasets that follow the General Gaussian distribution of shape p. Therefore, in Section 2, we revisit the Chi-square derivation, and following similar steps, we construct the Chi-p distribution. In Section 3, we develop the statistical test for characterizing the goodness of the fitting with Lp norms, using the Chi-p distribution and the p-value. In Section 4, we provide an application of the statistical test. Finally, in Section 5, we summarize the conclusions. Appendix A briefly describes the General Gaussian distribution, while Appendix B shows the mathematical derivation of the surface of the sphere of higher dimensions in Lp space.
2. Chi-p distribution
We first revisit the derivation of the Chi-square distribution, which is needed to test the goodness of fitting of measurements that follow the normal (Gaussian) distribution. The Chi-square is given by
$$\chi^2 = \sum_{i=1}^{N} z_i^2, \qquad z_i \equiv (x_i - \mu_{x_i})/\sigma_{x_i}, \tag{4}$$
that is, the sum of squares of N independent, normally distributed random variables. The distribution of this sum is given by
$$P(\chi^2; N) = \frac{2^{-N/2}}{\Gamma(N/2)}\,(\chi^2)^{N/2-1}\,e^{-\chi^2/2}. \tag{5}$$
The estimated value of the Chi-square for a fitting is given by the minimum at α = α* of the function χ2(α) = TSD(α)2, as shown in Eq. (1) (least squares). Considering that the Chi-square minimum, χ2(α*), is shared equally among all the M = N-1 degrees of freedom (for N data points), each of them contributes to this minimum by χ2(α*)/M. This is the estimated value of the reduced Chi-square. For multi-parametric fitting (Livadiotis 2007) with n free parameters, the degrees of freedom are M = N-n. In general, the Chi-square distribution in Eq. (5) refers to M degrees of freedom.
For testing the goodness of fitting of measurements that follow the General Gaussian distribution of shape p, x i ~ GG(μ xi , σ xi , p), we need to construct the Chi-p distribution connected with the Lp fitting methods, where the function χp(α) to be minimized is given by Eq. (3). The General Gaussian distribution of shape p (Appendix A) is parameterized by the mean μ, the scale σ (whose square is the Lp-normed variance), and the shape parameter p,
$$P(x; \mu, \sigma, p) = C_p\,\sigma^{-1}\exp\!\left(-\eta_p\left|\frac{x-\mu}{\sigma}\right|^p\right), \tag{6}$$
where C p is the normalization coefficient and η p is the exponential coefficient.
Figure 1 depicts the distribution for various shape parameters p. Note that the normalization coefficient C p is derived by setting $\int_{-\infty}^{\infty} P(x;\mu,\sigma,p)\,dx = 1$, which gives $C_p = p\,\eta_p^{1/p}/[2\Gamma(1/p)]$, while the exponential coefficient η p is derived so that the Lp-normed variance equals σ2. The theory of Lp-normed mean and variance was developed by Livadiotis (2012), which for the case of the General Gaussian distribution (6) leads to the following Propositions:
Proposition 1: The Lp-normed mean of the distribution (6) is $\langle x \rangle_p = \mu$, ∀ p ≥ 1.
Proposition 2: The Lp-normed variance of the distribution (6) is $\sigma_p^2 = \sigma^2$, ∀ p ≥ 1.
The proofs of the two Propositions are shown in Appendix A.
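The following sketch evaluates a General Gaussian density numerically, assuming the form C_p σ^(-1) exp(-η |(x - μ)/σ|^p) with C_p fixed by normalization. The function names are hypothetical, and η is supplied by the caller instead of being computed from p; the value η = 0.4 below is an arbitrary test value, not the paper's η_p.

```python
import math


def gg_pdf(x, mu, sigma, p, eta):
    """Density C_p / sigma * exp(-eta * |(x - mu) / sigma|^p)."""
    # C_p follows from requiring the density to integrate to one
    c_p = p * eta ** (1.0 / p) / (2.0 * math.gamma(1.0 / p))
    return c_p / sigma * math.exp(-eta * abs((x - mu) / sigma) ** p)


def trapezoid(func, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        total += func(a + k * h)
    return total * h


# Normalization check for shape p = 3 (eta = 0.4 is a hypothetical value)
area = trapezoid(lambda x: gg_pdf(x, 1.0, 2.0, 3.0, 0.4), -39.0, 41.0)

# For p = 2 and eta = 1/2 the density is the standard normal one
normal_check = gg_pdf(0.3, 0.0, 1.0, 2.0, 0.5)
```

Passing η explicitly keeps the sketch independent of the closed form of η p, which is determined by the Lp-normed variance condition.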
We continue with the development of the Chi-p distribution. We start with the following Lemma:
Lemma 1: The surface of the N-dimensional sphere of unit radius in Lp space is given by
$$B_{p,N} = \frac{2^N}{p^{N-1}}\,\frac{\Gamma(1/p)^N}{\Gamma(N/p)}. \tag{8}$$
The proof is shown in Appendix B.
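A short numerical check of the Lemma can be sketched as follows, assuming the surface is defined through the volume element dz1 … dzN = Z^(N-1) dZ dΩ, so that integrating exp(-Σ|z_i|^p) over all components gives B_{p,N} = 2^N Γ(1/p)^N / [p^(N-1) Γ(N/p)]. The function name is hypothetical. Under this assumption the familiar Euclidean values 2π (N = 2) and 4π (N = 3) should be recovered for p = 2.

```python
import math


def lp_unit_sphere_surface(p, n):
    """B_{p,N} = 2^N Gamma(1/p)^N / (p^(N-1) Gamma(N/p)), evaluated via logs."""
    log_b = (n * math.log(2.0) + n * math.lgamma(1.0 / p)
             - (n - 1) * math.log(p) - math.lgamma(n / p))
    return math.exp(log_b)


circle = lp_unit_sphere_surface(2.0, 2)   # Euclidean circle: 2*pi
sphere = lp_unit_sphere_surface(2.0, 3)   # Euclidean sphere: 4*pi
taxicab = lp_unit_sphere_surface(1.0, 2)  # L1 "circle" in this convention: 4
```

Working with logarithms of the Gamma functions avoids overflow when N is large.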
The Chi-p is given by the sum of the absolute values, raised to the exponent p, of N independent random variables,
$$X \equiv \chi^p = \sum_{i=1}^{N} |z_i|^p, \qquad z_i \equiv (x_i - \mu_{x_i})/\sigma_{x_i}. \tag{9}$$
Theorem 1: For M degrees of freedom (M = N-n, with N the number of data and n the number of free parameters), the Chi-p distribution is given by
$$P(X; M, p) = \frac{\eta_p^{M/p}}{\Gamma(M/p)}\,X^{M/p-1}\,e^{-\eta_p X}, \tag{10}$$
where the estimated Chi-p value X is given by the minimum at α = α* of the function χp(α) = TD(α)p, as shown in Eq. (3) (least Lp deviations). Figure 2 plots the Chi-p distribution for various values of the shape parameter p (that correspond to various Lp norms).
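Proof of Theorem 1. The distribution of Chi-p can be derived as follows. The normalization of the joint distribution function of all the data is
$$1 = \int \cdots \int \prod_{i=1}^{N} P(x_i)\,dx_i = C_p^N \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\!\left(-\eta_p \sum_{i=1}^{N} |z_i|^p\right) dz_1 \cdots dz_N. \tag{11}$$
By setting $X \equiv Z^p \equiv \sum_{i=1}^{N} |z_i|^p$, we derive
$$1 = C_p^N\,B_{p,N} \int_0^{\infty} e^{-\eta_p Z^p}\,Z^{N-1}\,dZ,$$
where we denote $dz_1 \cdots dz_N = Z^{N-1}\,dZ\,d^{N-1}\Omega_N$, and $B_{p,N} = \int d^{N-1}\Omega_N$ is the surface of the N-dimensional sphere of unit radius in Lp space (Lemma 1), so that
$$1 = \frac{C_p^N\,B_{p,N}}{p} \int_0^{\infty} e^{-\eta_p X}\,X^{N/p-1}\,dX = \frac{\eta_p^{N/p}}{\Gamma(N/p)} \int_0^{\infty} e^{-\eta_p X}\,X^{N/p-1}\,dX,$$
where we have used the identity $C_p^N B_{p,N}/p = \eta_p^{N/p}/\Gamma(N/p)$. Hence, we find
$$P(X; N, p) = \frac{\eta_p^{N/p}}{\Gamma(N/p)}\,X^{N/p-1}\,e^{-\eta_p X}.$$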
In general, for M degrees of freedom, the Chi-p distribution is given by Eq. (10).
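As a quick numerical illustration of Theorem 1, the sketch below evaluates a Chi-p density under the assumption that each normalized deviation z has a density proportional to exp(-η|z|^p), so that X = Σ|z_i|^p follows a Gamma law with shape M/p and rate η. The function names are hypothetical, and η is passed in by the caller instead of being computed from p. For p = 2 and η = 1/2 the expression should reduce to the Chi-square density.

```python
import math


def chi_p_pdf(x, m, p, eta):
    """Gamma density with shape M/p and rate eta, evaluated in log space."""
    if x <= 0.0:
        return 0.0
    a = m / p
    log_pdf = (a * math.log(eta) - math.lgamma(a)
               + (a - 1.0) * math.log(x) - eta * x)
    return math.exp(log_pdf)


def chi_square_pdf(x, m):
    """Textbook Chi-square density with M degrees of freedom."""
    return (x ** (m / 2.0 - 1.0) * math.exp(-x / 2.0)
            / (2.0 ** (m / 2.0) * math.gamma(m / 2.0)))


def chi_p_mode(m, p, eta):
    """Location of the maximum, (M/p - 1)/eta, valid for M > p."""
    return (m / p - 1.0) / eta


same_as_chi_square = abs(chi_p_pdf(7.3, 5, 2.0, 0.5) - chi_square_pdf(7.3, 5))
mode_chi_square = chi_p_mode(10, 2.0, 0.5)  # equals M - 2 = 8
```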
3. Statistical test of a fitting
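In order to estimate the goodness of the fitting, we minimize the Chi-p, χp,
$$\chi^p(\alpha) = \sum_{i=1}^{N} \sigma_{f_i}^{-p}\,\left|f_i - V(x_i;\alpha)\right|^p,$$
similar to the minimization of the Chi-square, χ2, for the case of the Euclidean norm,
$$\chi^2(\alpha) = \sum_{i=1}^{N} \sigma_{f_i}^{-2}\,\left[f_i - V(x_i;\alpha)\right]^2.$$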
We begin with the established method of Chi-square, and then we will proceed to the generalized method of Chi-p.
The goodness of a fitting can be estimated by the reduced Chi-square value, $\chi^2_{red} \equiv \chi^2(\alpha^*)/M$, where M = N-1 indicates again the degrees of freedom. The meaning of $\chi^2_{red}$ is the portion of χ2 that corresponds to each of the degrees of freedom, and this has to be ~1 for a good fitting. We can easily understand this, for example, when the given data have equal error σ f , i.e., $\sigma_{f_i} = \sigma_f$ for all i = 1,…, N. Then, the optimized model value, V(x i ; α*), gives the expected value of the data point f i , so that the variance can be approximated by $\sigma_f^2 \approx \frac{1}{M}\sum_{i=1}^{N}[f_i - V(x_i;\alpha^*)]^2$ (sample variance). Hence, the derived Chi-square becomes $\chi^2 \approx M$, and its reduced value $\chi^2_{red} \approx 1$. Therefore, a fitting can be characterized as "good" when $\chi^2_{red} \sim 1$; otherwise there is an overestimation, $\chi^2_{red} < 1$, or underestimation, $\chi^2_{red} > 1$, of the errors. When the deviations of the data from the model values are small, the fitting is expected to be good. However, this characterization is meaningless if the errors of the data are either (i) much larger than their deviations from the model values, i.e., if σ fi >> |f i - V(x i ; α)|, or (ii) much smaller, i.e., if σ fi << |f i - V(x i ; α)| (e.g., see Figure 3). Then, a perfect matching between data and model is useless when the errors of the data are comparatively large or small.
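Furthermore, a better estimation of the goodness is derived from comparing the calculated χ2 value with the Chi-square distribution, that is, the distribution of all the possible χ2 values for data with normally distributed errors (parameterized by the degrees of freedom M),
$$P(\chi^2; M) = \frac{2^{-M/2}}{\Gamma(M/2)}\,(\chi^2)^{M/2-1}\,e^{-\chi^2/2}$$
(e.g., see Melissinos 1966). The likelihood of having a χ2 value equal to or smaller than the estimated value, $\chi^2_{est}$, is given by the cumulative distribution
$$P(0 \le \chi^2 \le \chi^2_{est}) = \int_0^{\chi^2_{est}} P(\chi^2; M)\,d\chi^2 = \frac{\gamma(M/2,\,\chi^2_{est}/2)}{\Gamma(M/2)},$$
where $\gamma(a, u) \equiv \int_0^u t^{a-1} e^{-t}\,dt$ is the (lower) incomplete Gamma function. In addition, the likelihood of having a χ2 value equal to or larger than the estimated value is given by the complementary cumulative distribution
$$P(\chi^2_{est} \le \chi^2 < \infty) = 1 - \frac{\gamma(M/2,\,\chi^2_{est}/2)}{\Gamma(M/2)}.$$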
The probability of having a result χ2 larger than the estimated value defines the p-value, which equals $P(\chi^2_{est} \le \chi^2 < \infty)$. The larger the p-value, the better the fitting (e.g., Melissinos 1966). However, this p-value test fails when the p-value exceeds 0.5. Indeed, p-values larger than 0.5 correspond to an estimated Chi-square below the median of the distribution. Even larger p-values, up to 1, correspond to even smaller Chi-squares, down to $\chi^2_{est} \to 0$. Thus, an increasing p-value above the threshold of 0.5 cannot lead to a better fitting but to a worse one, similar to the indication of a reduced Chi-square smaller than 1. For this reason, we use the "p-value of the extremes". According to this, the probability of obtaining a result χ2 more extreme than the observed value is given by the p-value that equals the minimum between the two cumulative probabilities, i.e.,
$$p\text{-value} = \min\left[P(0 \le \chi^2 \le \chi^2_{est}),\; P(\chi^2_{est} \le \chi^2 < \infty)\right]$$
(see some applications in Livadiotis and McComas 2013b; Frisch et al. 2013; Funsten et al. 2013). Note that the maximum p-value is 0.5, and this corresponds to an estimated Chi-square equal to the median of the distribution, $\chi^2_{1/2}$. This is larger than the Chi-square that maximizes the distribution, $\chi^2_{max} = M - 2$ (for M ≥ 2). Hence, $\chi^2_{1/2} > \chi^2_{max}$, i.e., the Chi-square that corresponds to p-value = 0.5 is always located to the right of the maximum.
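The "p-value of the extremes" for the Chi-square case can be sketched as below. The regularized incomplete Gamma function is evaluated with its standard power series, which is adequate for moderate arguments; the function names are hypothetical and the sketch is not tuned for very large χ2.

```python
import math


def reg_lower_gamma(a, x, tol=1e-15, max_terms=100000):
    """Regularized lower incomplete gamma P(a, x) from its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    # All terms are positive, so the partial sums increase monotonically
    while term > tol * total and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi_square_p_value_extremes(chi2_est, m):
    """Minimum of the lower and upper tail probabilities."""
    lower = reg_lower_gamma(m / 2.0, chi2_est / 2.0)
    upper = 1.0 - lower
    return min(lower, upper)


# M = 2: the cumulative distribution is 1 - exp(-chi2/2), median at 2 ln 2
p_at_median = chi_square_p_value_extremes(2.0 * math.log(2.0), 2)

# M = 10: tabulated 5% critical values on the upper and lower side
p_upper_tail = chi_square_p_value_extremes(18.307, 10)
p_lower_tail = chi_square_p_value_extremes(3.940, 10)
```

Both a too-large and a too-small estimated Chi-square give a small p-value under this definition, which is the behavior the text motivates.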
The statistical test of the fitting for the evaluation of its goodness comes from the null hypothesis that the given data are described by the fitted statistical model. If the derived p-value is smaller than the significance level of ~0.05, then the hypothesis is typically rejected, and the hypothesis that the data are described by the examined statistical model is characterized as unlikely.
A convenient rating for a statistical test should give a more detailed characterization than “likely” when p-value > 0.05, or “unlikely” when p-value < 0.05. For this reason, it is necessary to ascribe a one-to-one relation between the domain of p-values and the range of rating values T, with the correspondence: 1) impossible; 2) indefinite; 3) certain. Choosing a power-law function with exponent γ, we find γ = log 2.
We can now easily characterize the testing rates by a linear separation of the values of T, as shown in Table 1.
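In the case of data that follow the General Gaussian distribution of shape p, the derived p-value depends on the shape p. Indeed, for the Chi-p distribution of Eq. (10) and the estimated Chi-p value $X_{est}$, we have
$$P(0 \le X \le X_{est}) = \frac{\gamma(M/p,\,\eta_p X_{est})}{\Gamma(M/p)}, \qquad P(X_{est} \le X < \infty) = 1 - \frac{\gamma(M/p,\,\eta_p X_{est})}{\Gamma(M/p)},$$
with $\gamma(a, u) \equiv \int_0^u t^{a-1} e^{-t}\,dt$, and the p-value that equals the minimum between these two probabilities, i.e.,
$$p\text{-value} = \min\left[P(0 \le X \le X_{est}),\; P(X_{est} \le X < \infty)\right].$$
Note that the maximum p-value = 0.5 corresponds to an estimated Chi-p equal to the median of the distribution, $X_{1/2}$. This is larger than the Chi-p that maximizes the distribution, $X_{max} = (M - p)/(p\,\eta_p)$ (for M ≥ p). Hence, again we find
$$X_{1/2} > X_{max}.$$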
The statistical test has a double role in the case of Lp norms. If the shape parameter p is known, then the test can be rated by deriving the p-value and according to Table 1. If the shape parameter is unknown and the fitting is expected to be good, then all the shape values p that correspond to unlikely p-values can be rejected. In fact, the largest p-value corresponds to the most likely shape parameter p of the examined data. These are shown in the following applications.
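The shape-dependent p-value can be sketched in the same way, assuming the Chi-p value X follows a Gamma law with shape M/p and rate η, with η supplied by the caller as a hypothetical input. The sketch also locates the median by bisection, to illustrate that the value with p-value = 0.5 lies to the right of the maximum of the distribution. All function names are hypothetical.

```python
import math


def reg_lower_gamma(a, x, tol=1e-15, max_terms=100000):
    """Regularized lower incomplete gamma P(a, x) from its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while term > tol * total and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi_p_cdf(x, m, p, eta):
    """Cumulative distribution of a Gamma law with shape M/p and rate eta."""
    return reg_lower_gamma(m / p, eta * x)


def chi_p_p_value_extremes(x_est, m, p, eta):
    """Minimum of the lower and upper tail probabilities of X."""
    lower = chi_p_cdf(x_est, m, p, eta)
    return min(lower, 1.0 - lower)


def chi_p_median(m, p, eta, iterations=200):
    """Bisection for the value of X where the cumulative distribution is 0.5."""
    lo, hi = 0.0, 1.0
    while chi_p_cdf(hi, m, p, eta) < 0.5:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if chi_p_cdf(mid, m, p, eta) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


median_chi_square = chi_p_median(10, 2.0, 0.5)   # Chi-square with M = 10
mode_chi_square = (10 / 2.0 - 1.0) / 0.5         # M - 2 = 8
median_shape3 = chi_p_median(6, 3.0, 0.4)        # eta = 0.4 is hypothetical
mode_shape3 = (6 / 3.0 - 1.0) / 0.4
```

Scanning `chi_p_p_value_extremes` over trial shapes p, with the matching η for each shape, would give the curve used to reject unlikely norms.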
4. Applications
Table 2 contains a dataset of observations of the ratio of the umbral area to the whole sunspot area, $\{f_i \pm \sigma_{f_i}\}$, i = 1,…, N, with N = 6 (Edwards 1957). Assuming that each of them follows a General Gaussian distribution about its mean, f i ~ GG(μ i , σ i , p), what is the likelihood that these measurements represent a constant physical quantity? Let this constant be indicated by μ p , which can be derived from the fitting of the data to a constant, and thus it typically depends on the p-norm. Moreover, different values of the p-norm lead to different estimated values of the Chi-p. Thus, the p-value of the null hypothesis (H0) also depends on the p-norm.
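We apply a statistical test to examine whether the data of the sunspot area ratios depend on heliolatitude or not. Therefore, the null hypothesis is that the dataset is described by the statistical model of constant value, i.e., V(x; α) = α. We construct and minimize the Chi-p, given by
$$\chi^p(\alpha) = \sum_{i=1}^{N} \sigma_{f_i}^{-p}\,|f_i - \alpha|^p,$$
so that the Lp-mean value α p = α p (p) is implicitly given by
$$\sum_{i=1}^{N} \sigma_{f_i}^{-p}\,|f_i - \alpha_p|^{p-1}\,\mathrm{sign}(f_i - \alpha_p) = 0,$$
and the estimated Chi-p is
$$\chi^p_{est} = \sum_{i=1}^{N} \sigma_{f_i}^{-p}\,|f_i - \alpha_p|^p.$$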
Figure 4(a) shows the six data points co-plotted with four values of α p , that correspond to p → 1, p → ∞, and the two shape parameter values p1, p2 for which the p-value is equal to 0.05. The whole diagram of α p = α p (p) is shown in Figure 4(b) and the p-value as a function of p is shown in Figure 4(c).
We observe that the function α p is monotonically increasing converging to some constant value for p → ∞. The corresponding mean value, α∞, is given by
The p-value has a minimum at p ~ 2.08 and increases for larger shape values p until it reaches p ~ 5.77, where it becomes p-value ~ 0.5 (not shown in the figure). If the shape p of the dataset is known, e.g., p = 2, then the null hypothesis is rejected, i.e., the sunspot area ratio data depend on the heliolatitude. On the other hand, if the data are expected to be invariant with the heliolatitude, and thus the null hypothesis is expected to be accepted, then all the norms between p1 ~ 1.7 and p2 ~ 2.5 are rejected, and the norm Lp with p ~ 5.77 better characterizes these data points; the respective mean value is α p (5.77) ~ 0.164. Therefore, if we know the shape/norm p that characterizes the data, we can proceed to rate the goodness of the fitting. However, if p is unknown, we can at least detect those values of p for which the null hypothesis is accepted or rejected.
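The Lp-mean of a dataset fitted by a constant can be sketched numerically as follows. The sketch assumes the Chi-p of a constant model is Σ σ_i^(-p) |f_i - α|^p, so that for p > 1 its minimum satisfies Σ σ_i^(-p) |f_i - α|^(p-1) sign(f_i - α) = 0, a condition whose left-hand side decreases monotonically in α and can therefore be solved by bisection. The function names and the data values are hypothetical; they are not the sunspot measurements of Table 2.

```python
import math


def lp_mean(f, sigma, p, iterations=200):
    """Minimizer of sum_i |f_i - a|^p / sigma_i^p over a, for p > 1."""
    if p <= 1:
        raise ValueError("this sketch requires p > 1")

    def signed_sum(a):
        return sum(math.copysign(abs(fi - a) ** (p - 1.0), fi - a) / si ** p
                   for fi, si in zip(f, sigma))

    lo, hi = min(f), max(f)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if signed_sum(mid) > 0.0:
            lo = mid  # the minimizer lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi_p_estimate(f, sigma, p):
    """Chi-p value evaluated at the Lp-mean."""
    a = lp_mean(f, sigma, p)
    return sum(abs(fi - a) ** p / si ** p for fi, si in zip(f, sigma))


# Hypothetical data with equal errors
data = [0.10, 0.12, 0.13, 0.20]
errors = [0.02, 0.02, 0.02, 0.02]

mean_p2 = lp_mean(data, errors, 2.0)     # ordinary mean for equal errors
mean_p60 = lp_mean(data, errors, 60.0)   # approaches the midrange
chi2_est = chi_p_estimate(data, errors, 2.0)
```

For equal errors, large p pulls the Lp-mean toward the midpoint between the smallest and largest data values, which is consistent with the convergence to a constant described for p → ∞.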
One of the most intriguing questions regarding the Lp-normed fitting is how we can determine the characteristic p-norm of the data, that is, the suitable norm that should be used for the fitting of those data (Livadiotis 2007). The maximization of the p-value is one promising method. We demonstrate this as follows. We construct N = 10^4 data, $\{f_i\}$, i = 1,…, N, of a random variable that follows the General Gaussian distribution of shape p, f i ~ GG(μ = 0, σ = 1, p = 3). Figure 5(a) shows that the normalized histogram of these values matches this General Gaussian distribution. The p-value is approximated using the asymptotic behavior of the (complete and incomplete) Gamma functions for large degrees of freedom, M = 9999. Hence, in order to derive the maximum p-value, it is sufficient to maximize
This is shown in Figure 5(b), where the peak is at p ≅ 2.95 ± 0.08. Therefore, the p-value is maximized at the same value of p-norm as the shape of the General Gaussian distribution.
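A synthetic dataset of this kind can be generated with the sketch below. It assumes the density of z = (x - μ)/σ is proportional to exp(-η|z|^p), in which case |z|^p follows a Gamma law with shape 1/p and scale 1/η, and a random sign restores the symmetry. The function name, the seed, and the value η = 0.4 are hypothetical; the paper's η p for p = 3 is not reproduced here.

```python
import random


def sample_general_gaussian(n, mu, sigma, p, eta, rng):
    """Draw n values whose normalized deviation has density ~ exp(-eta |z|^p)."""
    samples = []
    for _ in range(n):
        u = rng.gammavariate(1.0 / p, 1.0 / eta)  # u plays the role of |z|^p
        sign = 1.0 if rng.random() < 0.5 else -1.0
        samples.append(mu + sigma * sign * u ** (1.0 / p))
    return samples


rng = random.Random(2014)  # fixed seed for reproducibility
eta = 0.4                  # hypothetical exponential coefficient
z = sample_general_gaussian(10_000, 0.0, 1.0, 3.0, eta, rng)

# The mean of |z|^p should be close to 1/(p * eta), the mean of the Gamma law
mean_abs_p = sum(abs(v) ** 3.0 for v in z) / len(z)
expected_mean_abs_p = 1.0 / (3.0 * eta)
sample_mean = sum(z) / len(z)
```

Comparing `mean_abs_p` with `expected_mean_abs_p` is a simple consistency check before the sample is used in a fitting test.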
5. Conclusions
This paper (1) presented the derivation of the Chi-p distribution, the analog of the Chi-square distribution for datasets that follow the General Gaussian distribution of shape p, and (2) developed the statistical test for characterizing the goodness of the fitting with Lp norms, which corresponds to datasets that follow the General Gaussian distribution of shape p.
It was shown that the statistical test has a double role in the case of Lp norms: (1) If the shape parameter p is fixed and known, then the test can be rated by deriving the p-value. A convenient characterization of the fitting rate was developed. (2) If the shape parameter is unknown and the fitting is expected to be good for some shape parameter value p, a method for estimating p was given by fitting a General Gaussian distribution of shape p to the data, and then using this estimated shape parameter p in the Chi-p distribution to characterize the goodness of fitting. In particular, all the shape values p that correspond to unlikely p-values can be rejected, while the largest p-value corresponds to the most likely shape parameter p of the examined data. This was verified by an illuminating example where the method of fitting based on Lp norms was applied.
Appendix A: General Gaussian distribution
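According to the theory of Lp-normed mean and variance, developed by Livadiotis (2012), the Lp-normed mean $\langle x \rangle_p$ of the random variable X with probability distribution P(x) is implicitly defined by
$$\int_{-\infty}^{\infty} \left|x - \langle x \rangle_p\right|^{p-1}\,\mathrm{sign}\!\left(x - \langle x \rangle_p\right) P(x)\,dx = 0, \tag{A1}$$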
where sign(u) returns the sign of u. The Lp-normed variance is given by
Next, we derive the Lp-normed mean and variance of the General Gaussian distribution (6), which are Propositions 1 and 2, stated in Section 2.
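Proposition 1: Given the distribution (6), the Lp-normed mean is $\langle x \rangle_p = \mu$, ∀ p ≥ 1.
Proof. We have
$$\int_{-\infty}^{\infty} \left|z - \langle z \rangle_p\right|^{p-1}\,\mathrm{sign}\!\left(z - \langle z \rangle_p\right) \exp\!\left(-\eta_p |z|^p\right) dz = 0, \tag{A3}$$
for $z \equiv (x - \mu)/\sigma$, $\langle z \rangle_p \equiv (\langle x \rangle_p - \mu)/\sigma$. Let us assume that $\langle z \rangle_p = 0$. Then, the left-hand side of Eq. (A3) is
$$\int_{-\infty}^{\infty} |z|^{p-1}\,\mathrm{sign}(z)\,\exp\!\left(-\eta_p |z|^p\right) dz = 0,$$
because the integrand is the product of a symmetric and an antisymmetric function. Then, (A3) is true for $\langle z \rangle_p = 0$, and given the uniqueness of the Lp-normed mean for each p, we end up with Proposition 1. (Note that it is not surprising that the mean, $\langle x \rangle_p = \mu$, is independent of p. Livadiotis (2012) showed that symmetric probability distributions lead to Lp-normed means that are independent of p.)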
Proposition 2: Given the distribution (6), the Lp-normed variance is $\sigma_p^2 = \sigma^2$, ∀ p ≥ 1.
Proof. We have , i.e.,(A5a)
Hence, from (A2) we obtain
Appendix B: Surface of the N-dimensional sphere in Lp space, $B_{p,N}$
This appendix shows the proof of Lemma 1, stated in Section 2.
Lemma 1: The surface of the N-dimensional sphere of unit radius in L p space, Β p,N, is given by Eq.(8). This is involved in the proof of Chi-p distribution (10), as shown below.
Proof of Lemma 1.
Let the integral
where,. The magnitude Z is the only quantity with dimensions the same as each of the components z i . Indeed, if we define c i ≡ z i /ζ, where is the Euclidean magnitude of, then,, i.e., Z and ζ have the same dimensions. (In the previous sections the components z i were dimensionless by definition, i.e.,. However, we can still use this dimension analysis, since the components z i may have dimensions in the generic case). Hence, we write Eq.(B1) as dz1 … dz N = ZN - 1dZ dN - 1Ω N , i.e.,
where; Ω N symbolizes all the angular dependence, and dN - 1Ω N denotes the angular infinitesimal. Since F(Z; Ω N ) = F(Z), we have, or
The normalization gives, or
Another way to show Eq.(B4) is through the integration of all the components,
by substituting and (for z i ≥ 0). The integration range, z i ≥ 0, means for i = 1,…, N-1, and 0 ≤ z N ≤ Z. Similar, we have
Hence, we derive
while, on the other hand, we have
We easily find that
where B(x, y) ≡ Γ(x)Γ(y)/Γ(x + y) is the Beta function. Hence, we have
Since,, finally, we end up with Eq.(B4).
References
Adèr HJ: Modelling (Chapter 12). In Advising on Research Methods: A consultant’s companion. Edited by: Adèr HJ, Mellenbergh GJ, with contributions by Hand DJ. Huizen, The Netherlands: Johannes van Kessel Publishing; 2008:271–304.
Burden RL, Faires JD: Numerical Analysis. Boston, MA: PWS Publishing Company; 1993:437–438.
Edwards AWF: The proportion of umbra in large sunspots, 1878–1954. The Observatory 1957, 77: 69–70.
Frisch PC, Bzowski M, Livadiotis G, McComas DJ, Möbius E, Mueller HR, Pryor WR, Schwadron NA, Sokół JM, Vallerga JV, Ajello JM: Decades-long changes of the interstellar wind through our solar system. Science 2013, 341: 1080. 10.1126/science.1239925
Funsten HO, Frisch PC, Heerikhuisen J, Higdon DM, Janzen P, Larsen BA, Livadiotis G, McComas DJ, Möbius E, Reese CS, Reisenfeld DB, Schwadron NA, Zirnstein E: The circularity of the IBEX Ribbon of enhanced energetic neutral atom flux. Astrophys. J. 2013, 776: 30. 10.1088/0004-637X/776/1/30
Livadiotis G: Approach to general methods for fitting and their sensitivity. Physica A 2007, 375: 518–536. 10.1016/j.physa.2006.09.027
Livadiotis G: Approach to the block entropy modeling and optimization. Physica A 2008, 387: 2471–2494. 10.1016/j.physa.2008.01.002
Livadiotis G: Expectation values and Variance based on Lp norms. Entropy 2012, 14: 2375–2396. 10.3390/e14122375
Livadiotis G, McComas DJ: Fitting method based on correlation maximization: Applications in Astrophysics. J. Geophys. Res. 2013a, 118: 2863–2875.
Livadiotis G, McComas DJ: Evidence of large scale phase space quantization in plasmas. Entropy 2013b, 15: 1116–1132.
Livadiotis G, Moussas X: The sunspot as an autonomous dynamical system: A model for the growth and decay phases of sunspots. Physica A 2007, 379: 436–458. 10.1016/j.physa.2007.02.003
McCullagh P: What is a statistical model? Ann. Stat. 2002, 30: 1225–1310.
Melissinos AC: Experiments in Modern Physics. London, UK: Academic Press Inc; 1966:464–467.
Sengupta A: A rational function approximation of the singular eigenfunction of the monoenergetic neutron transport equation. J. Phys. A 1984, 17: 2743–2758. 10.1088/0305-4470/17/14/018
The authors declare that they have no competing interests.