Bayesian reference analysis for exponential power regression models
© Ferreira and Salazar; licensee Springer. 2014
Received: 17 October 2013
Accepted: 11 April 2014
Published: 17 June 2014
We develop Bayesian reference analyses for linear regression models when the errors follow an exponential power distribution. Specifically, we obtain explicit expressions for reference priors for all six possible orderings of the model parameters and show that, associated with these six parameter orderings, there are only two reference priors. Further, we show that both of these reference priors lead to proper posterior distributions. In addition, we show that the proposed reference Bayesian analyses compare favorably to an analysis based on a competing noninformative prior. Finally, we illustrate these Bayesian reference analyses for exponential power regression models with applications to two datasets. The first application analyzes per capita spending in public schools in the United States. The second application studies the relationship between home video sales and profits at the box office.
62F15; 62F35; 62J05
Keywords: Bayesian inference; Exponential power errors; Frequentist properties; Reference prior; Robustness
A flexible way to deal with outliers in linear regression is to assume that the errors follow an exponential power (EP) distribution. Specifically, assuming an EP distribution decreases the influence of outliers and, as a result, increases the robustness of the analysis (Box and Tiao 1962; Liang et al. 2007; Salazar et al. 2012; West 1984). In addition, the EP distribution includes the Gaussian distribution as a particular case. Further, the EP distribution may have tails either lighter (platykurtic) or heavier (leptokurtic) than the Gaussian. Platykurtic distributions may be a result of truncation, whereas leptokurtic distributions provide protection against outliers. Salazar et al. (2012) have developed three types of Jeffreys priors for linear regression models with independent EP errors. Unfortunately, two of those priors lead to useless improper posterior distributions and only one leads to a proper posterior distribution. Here we develop explicit expressions for reference priors for all six possible orderings of the model parameters.
We show that the six parameter orderings lead to two distinct reference priors. The parameter ordering corresponds to the order of importance of each parameter in the analysis, with the most important parameter appearing first and the least important appearing last (Berger and Bernardo 1992a,1992b). In addition to the two formally obtained reference priors, we propose an approximate reference prior that shares the same tail behavior but is much more straightforward to implement in practice. Finally, we show that the two reference priors lead to useful proper posterior distributions.
To make sure that Bayesian reference procedures do not bias the data analysis in an undesirable manner, it is important to study their frequentist properties. To study the frequentist properties of our proposed procedures, we have performed a Monte Carlo study that shows that our proposed Bayesian reference approaches compare favorably to a posterior analysis based on a competing prior in terms of coverage of credible intervals, relative mean squared error, and mean length of credible intervals. While the relative mean squared error and the mean length of credible intervals should be judged in comparison with those yielded by competing priors, the coverage of credible intervals should be as close as possible to the nominal level.
Coverage of credible intervals close to nominal provides a guarantee on the level of performance of the procedure when used automatically and independently by many researchers in their problems. In our Monte Carlo study, we have found that the Bayesian reference credible intervals that we have obtained have frequentist coverage close to nominal. These good frequentist properties agree with previous literature on Bayesian reference analyses for other models such as, for example, Gaussian random fields (Berger et al. 2001), Markov random fields (Ferreira and De Oliveira 2007), multivariate normal models (Sun and Berger 2007), and elapsed times in continuous-time Markov chains (Ferreira and Suchard 2008).
where p>1, −∞<μ<∞ and σ_p>0. The EP distribution has three parameters: the location parameter μ=E(y), the scale parameter σ_p=[E(|y−μ|^p)]^{1/p}, and the shape parameter p. The scale parameter σ_p can be seen as a variability index that generalizes the standard deviation. Moreover, σ_p is also known as the power deviation of order p (Vianelli 1963). In addition, the kurtosis is κ=Γ(1/p)Γ(5/p)/(Γ(3/p))^2, implying that the shape parameter p determines the thickness of the tails of the EP density. Specifically, the EP distribution is leptokurtic if p<2 (κ>3) and platykurtic if p>2 (κ<3). Finally, the EP distribution has several important special cases such as the Laplace distribution (p=1), the normal distribution (p=2) and, when p→∞, the uniform distribution on the interval (μ−σ_p,μ+σ_p) (e.g., see Box and Tiao 1992).
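The properties listed above can be checked numerically. The sketch below is illustrative only: the density formula in `ep_pdf` is an assumption, namely the common parametrization f(y)=exp(−|y−μ|^p/(p σ_p^p))/(2 p^{1/p} Γ(1+1/p) σ_p), chosen because it makes σ_p equal to [E(|y−μ|^p)]^{1/p} as stated in the text. All function names are hypothetical.

```python
import math

def ep_pdf(y, mu=0.0, sigma=1.0, p=2.0):
    """EP density under the ASSUMED parametrization described in the lead-in."""
    norm = 2.0 * p ** (1.0 / p) * math.gamma(1.0 + 1.0 / p) * sigma
    return math.exp(-abs(y - mu) ** p / (p * sigma ** p)) / norm

def ep_kurtosis(p):
    """Kurtosis kappa = Gamma(1/p) * Gamma(5/p) / Gamma(3/p)^2."""
    return math.gamma(1.0 / p) * math.gamma(5.0 / p) / math.gamma(3.0 / p) ** 2

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def power_deviation(sigma, p, half_width=40.0):
    """Numerically evaluate [E|y - mu|^p]^(1/p) with mu = 0."""
    moment = trapezoid(lambda y: abs(y) ** p * ep_pdf(y, 0.0, sigma, p),
                       -half_width, half_width)
    return moment ** (1.0 / p)

if __name__ == "__main__":
    for p in (1.0, 1.5, 2.0, 4.0):
        # Kurtosis above 3 for p < 2 and below 3 for p > 2;
        # the power deviation should recover the scale used (1.3 here).
        print(p, ep_kurtosis(p), power_deviation(1.3, p))
```

Under this assumed form, p=2 reduces to the normal density with standard deviation σ_p and p=1 to the Laplace density with scale σ_p, which matches the special cases named in the text.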
Only a few Bayesian procedures for the analysis of EP regression models have been published to date. Moreover, there are no published reference priors for EP regression models. Existing literature has considered the use of EP errors in a number of contexts such as, for example, EP errors to robustify linear models (Box and Tiao 1992; Salazar et al. 2012), and mixtures of regression models with EP errors (Achcar and Pereira 1999). In addition, the EP distribution has been used as a prior for a Gaussian model location parameter (Choy and Smith 1997). To implement simulation-based computation for models with EP errors, one may use representations of the EP distribution as a scale mixture of normals (West 1987) or as a scale mixture of uniforms (Walker and Gutiérrez-Peña 1999). As an alternative, Salazar et al. (2012) have developed fast analysis for EP regression models using Laplace approximations and Newton-Cotes integration. Here we use these latter fast computational methods.
The remainder of the paper is organized as follows. Section 2 presents the linear model with exponential power errors and the associated likelihood function. Section 3 derives the two reference priors and shows that both of these priors lead to proper posterior distributions. Section 4.1 presents a simulation study of the frequentist properties of the reference-priors-based Bayesian procedures and those of a competing noninformative prior. Section 4.2 presents applications of Bayesian reference analysis to two datasets. Section 5 concludes with a discussion of major findings and possible future research directions.
2 EP linear model
We use the log-likelihood function to develop reference priors for the EP regression model.
In this section, we obtain explicit expressions for reference priors for all six possible orderings of the parameters of the EP linear model, and show that associated with these six parameter orderings there are only two reference priors. Finally, we show that both of these reference priors lead to proper posterior distributions.
Specifically, we consider here the Bernardo reference priors (Bernardo 1979) that take into account the Kullback-Leibler divergence between the prior distribution and the posterior distribution. In a nutshell, the reference priors proposed by Bernardo maximize the expected value of perfect information about the model parameters (p. 300, Bernardo and Smith 1994). When the parameter space is one-dimensional and asymptotic normality of the posterior distribution holds, the reference prior coincides with Jeffreys prior (Jeffreys 1961). However, when the parameter space is multidimensional, Jeffreys prior is known to lead to Bayesian procedures that may have undesirable frequentist properties, such as frequentist coverage of credible intervals far from the desired nominal level.
For the multidimensional parameter case when the parameters may be partitioned in a block of parameters of interest and another block of nuisance parameters, Bernardo (1979) suggested an approach in three stages. The first stage obtains the conditional distribution of the nuisance parameter conditional on the parameter of interest. The second stage integrates out the nuisance parameter with respect to that conditional distribution to obtain a marginal likelihood. Finally, the third stage applies the reference prior approach to the marginal likelihood to obtain the reference prior for the parameter of interest. This idea can be naturally extended to partitions of the parameter vector with more than two components. The resulting reference prior will then depend on the ordering of the parameter vector components. This multiparameter case has been developed in a series of papers by Berger and Bernardo (1992a,1992b,1992c). Here we use the Berger-Bernardo approach to develop reference priors for the parameters of the EP regression model.
In what follows we find that the reference priors for the EP regression model are related to the independence Jeffreys priors given in Equations (5) and (6). When developing noninformative priors, it is crucial to study whether the resulting posterior distribution is proper. Salazar et al. (2012) have shown that the independence Jeffreys prior given in Equation (6) yields a proper posterior distribution. Unfortunately, both the independence Jeffreys prior given in Equation (5) and the Jeffreys-rule prior π J (p) yield improper posterior distributions.
where Ψ(α)≡Γ′(α)/Γ(α) and Ψ′(α)≡∂ Ψ(α)/∂ α are the digamma and trigamma functions, respectively.
The Fisher information matrix is block diagonal, with one block corresponding to β and another block corresponding to (σ,p). One of the consequences of this structure is that reference priors that consider β, σ, and p as three separate groups will depend on the ordering of the groups only with respect to whether σ or p appears first in the ordering. The following theorem provides reference priors for the parameters of the EP regression model.
See the Appendix. □
One of the two reference priors given in Theorem 1 is new and has not appeared before in the literature. Nevertheless, there are similarities between the reference priors given in Theorem 1 and the independence Jeffreys priors given in Equations (5) and (6). The other reference prior coincides with the independence Jeffreys prior given in Equation (6). Moreover, it is important to point out that the new reference prior is somewhat similar to the independence Jeffreys prior given in Equation (5), differing only by a factor of p^{−1/2}. However, as we show below, this difference is enough to make the independence Jeffreys prior of Equation (5) yield a useless improper posterior distribution while the reference prior yields a useful proper posterior distribution.
Thus, in order to determine whether a prior of the form (4) leads to a proper posterior distribution, one needs to investigate the tail behavior of both the marginal prior and the integrated likelihood for p. The tail behavior of the marginal reference priors for p given in Theorem 1 is given in the following lemma.
The marginal priors for p given in Theorem 1 are continuous functions in [ 1,∞) and are such that and as p→∞.
Direct inspection shows that both marginal priors are continuous functions in [1,∞). Their tail behavior when p→∞ follows from the fact that Ψ′(1+p^{−1})→π^2/6≈1.6449 and Γ(p^{−1})=O(p) as p→∞. □
Theorem 1 and Lemma 1 suggest the definition of an approximate reference prior, inspired by the two reference priors of Theorem 1, that has the same value a=1 for the hyperparameter and shares their tail behavior with respect to p. We define such an approximate reference prior in Definition 1.
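The two limits used in this proof can be checked numerically. The sketch below is not part of the original derivation: it implements the trigamma function with a standard recurrence plus asymptotic series (the Python standard library does not provide one) and evaluates Ψ′(1+1/p) and Γ(1/p)/p for increasing p. The function name `trigamma` and the shift threshold of 6 are choices made here for illustration.

```python
import math

def trigamma(x):
    """Trigamma function Psi'(x) for x > 0 (recurrence + asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    # Recurrence Psi'(x) = Psi'(x + 1) + 1/x^2 shifts the argument upward.
    while x < 6.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # Asymptotic series: 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    series = inv + 0.5 * inv2 + inv * inv2 * (
        1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 * (1.0 / 42.0 - inv2 / 30.0))
    )
    return total + series

if __name__ == "__main__":
    print("pi^2/6 =", math.pi ** 2 / 6.0)
    for p in (10.0, 100.0, 1000.0, 1e6):
        # First column approaches pi^2/6; second stays bounded, so Gamma(1/p) = O(p).
        print(p, trigamma(1.0 + 1.0 / p), math.gamma(1.0 / p) / p)
```

The bounded ratio Γ(1/p)/p reflects the fact that Γ(x) behaves like 1/x as x→0.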
We define an approximate reference prior to be of the form (4) with a=1 and marginal prior for p equal to .
Computation of the approximate reference prior is faster and more straightforward than that of the two reference priors of Theorem 1. In addition, Section 4.1 shows that the frequentist properties of procedures based on the approximate reference prior are similar to those based on the reference priors of Theorem 1. As a consequence, the approximate reference prior may become more widely used than the reference priors of Theorem 1. Therefore, henceforth we drop the term “approximate” and simply refer to it as a reference prior.
The following lemma, which was proved by Salazar et al. (2012), provides the tail behavior for the integrated likelihood for p.
Lemma 2 (Salazar et al. 2012)
Provided that n>k+1−a, the integrated likelihood for p under the class of priors (4) is a continuous function in [1,∞) and is such that L I (p;y)=O(1) as p→∞.
The following proposition establishes that the two reference priors that we have obtained yield proper posterior distributions.
Provided that n>k+1−a, the two reference priors given in Theorem 1 yield proper posterior distributions.
This proposition follows directly from condition (11), and Lemmas 1 and 2. □
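The role of tail behavior in this argument can be illustrated with a small calculation. Because the integrated likelihood is O(1) as p→∞ (Lemma 2), whether the posterior is proper depends on whether the marginal prior for p has an integrable right tail. The sketch below uses hypothetical power-law tails p^{−c}; the exponents 1 and 1.5 are illustrative assumptions and are not the rates stated in Lemma 1. It shows how a factor of p^{−1/2} can separate a divergent tail integral from a convergent one.

```python
import math

def tail_mass(c, upper):
    """Closed-form integral of p**(-c) over [1, upper]."""
    if c == 1.0:
        return math.log(upper)
    return (upper ** (1.0 - c) - 1.0) / (1.0 - c)

if __name__ == "__main__":
    for upper in (1e2, 1e4, 1e8):
        # With c = 1.0 the integral keeps growing like log(upper);
        # with c = 1.5 it approaches the finite limit 2.
        print(upper, tail_mass(1.0, upper), tail_mass(1.5, upper))
```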
To implement posterior analysis for the parameters of the EP regression model based on the reference priors developed here, we use an approach proposed by Salazar et al. (2012) that combines Laplace approximations and Newton-Cotes integration.
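As a rough illustration of the Newton-Cotes ingredient only, the sketch below normalizes an unnormalized marginal posterior for p on a bounded interval with the composite Simpson rule and then locates the posterior median by bisection. It is not the algorithm of Salazar et al. (2012): the Laplace approximation step is omitted, and the function `toy` is a hypothetical stand-in for an integrated posterior, not a quantity from the paper.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule (a closed Newton-Cotes formula), n even."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def normalize(unnormalized, a, b, n=200):
    """Return a density on [a, b] that integrates to one under Simpson's rule."""
    const = simpson(unnormalized, a, b, n)
    return lambda p: unnormalized(p) / const

def posterior_median(density, a, b, tol=1e-6):
    """Bisection on the CDF, with the CDF evaluated by Simpson's rule."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if simpson(density, a, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Hypothetical unnormalized marginal posterior for p on [1, 10].
    toy = lambda p: p ** -1.5 * math.exp(-0.5 * (p - 2.0) ** 2)
    dens = normalize(toy, 1.0, 10.0)
    print("normalizing check:", simpson(dens, 1.0, 10.0))
    print("posterior median:", posterior_median(dens, 1.0, 10.0))
```

A one-dimensional quadrature of this kind is feasible here because, once β and σ are integrated out, only the scalar shape parameter p remains.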
4 Results and discussion
4.1 Frequentist properties
In this section we perform a simulation study to assess the frequentist properties of Bayesian procedures based on the two reference priors of Theorem 1 and the reference prior of Definition 1. In addition, we compare the performance of these reference priors to that of a competing noninformative prior π U that takes the form (4) with a=1 and π U (p)∝1 for 1<p<10 and π U (p)=0 otherwise. The joint prior π U (θ) leads to a proper posterior distribution; however, as we see below, the uniform prior π U (p) is a naïve way to express lack of information about p. The Bayesian procedures we consider are the posterior modes and posterior medians for point estimation, and the 95% highest posterior density (HPD) credible intervals for interval estimation. Finally, we consider three frequentist measures of quality. For evaluating the quality of point estimation, we consider the square root of the frequentist relative mean squared error. For evaluating the performance of interval estimation, we consider two frequentist measures: the frequentist coverage and the mean length of the credible intervals.
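The three frequentist measures can be computed from replicated analyses as in the following sketch. The exact definition of the relative root mean squared error is an assumption here (root MSE divided by the absolute true value), and the function and key names are hypothetical.

```python
import math

def frequentist_summary(true_value, estimates, intervals):
    """Summarize point and interval estimates over simulated datasets.

    estimates: one point estimate per simulated dataset.
    intervals: one (lower, upper) credible interval per simulated dataset.
    """
    mse = sum((est - true_value) ** 2 for est in estimates) / len(estimates)
    # ASSUMED definition: root MSE relative to the magnitude of the true value.
    rel_rmse = math.sqrt(mse) / abs(true_value)
    # Fraction of intervals that contain the true parameter value.
    coverage = sum(lo <= true_value <= hi for lo, hi in intervals) / len(intervals)
    mean_length = sum(hi - lo for lo, hi in intervals) / len(intervals)
    return {"rel_rmse": rel_rmse, "coverage": coverage, "mean_length": mean_length}

if __name__ == "__main__":
    # Tiny made-up example with two replicates, for illustration only.
    print(frequentist_summary(2.0, [1.0, 3.0], [(1.0, 3.0), (2.5, 4.0)]))
```

For a 95% credible interval, a coverage value near 0.95 over many replicates is the target, while relative RMSE and mean length are meaningful mainly in comparison across priors.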
We have considered several combinations of sample sizes and parameters. Specifically, we have considered three sample sizes: n=30, n=50 and n=100. Moreover, we have considered a grid of values for p on the interval from 1 to 3. Further, for each simulated dataset we have used k=2, x_i=(1,x_{1i}), x_{1i}∼N(2,1), β=(1.5,−3), and σ=1. Finally, for each combination of parameter values and sample sizes, we have simulated 1,500 datasets to estimate the frequentist properties of the several procedures.
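One dataset from this design could be generated as in the sketch below. The sampler is an assumption rather than the authors' stated procedure: it supposes the EP density is proportional to exp(−|e|^p/(p σ^p)), in which case |e|^p/(p σ^p) follows a Gamma(1/p, 1) distribution, so an error can be drawn by transforming a gamma variate and attaching a random sign. Function names are hypothetical.

```python
import random

def ep_error(p, sigma, rng):
    """Draw one EP error: |e| = sigma * (p * G)**(1/p), G ~ Gamma(shape=1/p, scale=1)."""
    g = rng.gammavariate(1.0 / p, 1.0)
    magnitude = sigma * (p * g) ** (1.0 / p)
    return magnitude if rng.random() < 0.5 else -magnitude

def simulate_dataset(n, p, beta=(1.5, -3.0), sigma=1.0, seed=None):
    """Simulate (x1, y) with k = 2, x_i = (1, x_1i), x_1i ~ N(2, 1)."""
    rng = random.Random(seed)
    x1 = [rng.gauss(2.0, 1.0) for _ in range(n)]
    y = [beta[0] + beta[1] * x + ep_error(p, sigma, rng) for x in x1]
    return x1, y

if __name__ == "__main__":
    for n in (30, 50, 100):
        x1, y = simulate_dataset(n, p=1.5, seed=n)
        print(n, len(x1), len(y))
```

Repeating `simulate_dataset` 1,500 times for each combination of n and p, with a different seed each time, would reproduce the structure of the study described above.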
Second, we compare the RMSE performance of the different priors. For each type of point estimator considered here, in terms of RMSE the three reference priors provide qualitatively similar results, with two of them being slightly better for smaller values of p and the third being slightly better for larger values of p. In addition, the difference in performance of the three reference priors becomes smaller as the sample size increases. In contrast, the performance of the reference priors differs dramatically from that of the π U prior. For each class of estimators of p and for all values of p considered, the reference priors lead to smaller RMSE than the π U prior. For the estimation of σ, the results are mixed: for small sample sizes, the reference priors lead to smaller RMSE when p is small, whereas π U leads to better results when p is larger. For larger sample sizes, however, the reference-prior-based posterior medians have smaller RMSE for all considered values of p.
In summary, the three reference priors lead to procedures that have similar frequentist properties. In addition, when compared to the competing noninformative prior π U , the three reference priors lead to overall superior results. Finally, the reference prior of Definition 1 has a simpler functional form and is more straightforward to implement. Therefore, in cases when there is no prior information for the analysis of EP linear regression models, we recommend the use of the reference prior of Definition 1.
This section illustrates the use of the Bayesian reference analysis we propose for exponential power regression models with applications to two real-world datasets. The first dataset illustrates leptokurtic errors and the second dataset illustrates platykurtic errors. Because the results based on two of the reference priors are extremely similar, we show only the results for one of them, for the remaining reference prior, and for π U .
In both applications, we use the same truncation point at p=10 used for π U (p) in Section 4.1 and assume π U (p)∝1 for 1<p<10 and π U (p)=0 otherwise. We have chosen the truncation point at p=10 because datasets generated with p=10 or with p close to 10 have similar statistical behavior. Hence, to distinguish whether a process follows an EP distribution with p=10 or, say, p=10.1 we would need an extremely large dataset. Moreover, the choice of truncation should be made before the analyst looks at the data. For example, for the first application below, after looking at the scatterplot one may think about truncating the prior to values of p that correspond to leptokurtic distributions, that is, 1<p<2. However, doing that would mean using the data twice in Bayes' theorem: once through the prior, and again through the likelihood. Usually, such double use of the data leads to underestimation of the uncertainty. Therefore, we prefer to decide the truncation of the prior before looking at the data.
4.2.1 School spending
We analyze the relationship between per capita spending in public schools and per capita income by state in the United States. This dataset has been previously analyzed by Greene (1997), Cribari-Neto et al. (2000), and Fonseca et al. (2008). Specifically, Greene (1997) and Cribari-Neto et al. (2000) proposed analyses based on heuristic approaches to the so-called problem of heteroscedasticity of unknown form. In contrast, Fonseca et al. (2008) have analyzed this dataset in the context of linear regression models with Student-t errors. Fonseca et al. (2008) found that when errors with heavy-tailed distributions are assumed, a linear model is superior to a quadratic model. Here, we take an approach similar to that of Fonseca et al. (2008) in that we assume a linear model with errors that may have a heavy-tailed distribution. However, we assume that the errors follow an exponential power distribution.
School spending dataset: Posterior summaries based on the noninformative prior π U and two of the reference priors
Figure 4(b) presents the marginal posterior densities for p based on the three reference priors (solid, dashed, and long-dashed lines) and π U (dotted line). In addition, the vertical lines indicate the limits of the 95% HPD credible intervals. The three reference priors lead to similar posterior densities for p, while the π U prior leads to a substantially different posterior density for p. Figure 4(b) illustrates why π U leads to unnecessarily wide credible intervals. That, combined with π U -based credible intervals having coverage lower than nominal, leads us to prefer the data analysis based on the reference priors.
4.2.2 Sold home videos vs. profits at the box office
Videos dataset: Posterior summaries based on the noninformative prior π U and two of the reference priors
The reference analyses for p are also strikingly distinct from the π U -based analysis for p. First, the posterior medians for p based on the two reference priors coincide and are equal to 2.64, while the π U -based posterior median differs tremendously and is equal to 4.36. Second, the 95% credible intervals for p based on the two reference priors are similar and equal to (1.00,7.01) and (1.00,7.18), respectively, while the π U -based interval for p differs tremendously from the reference CIs and is equal to (1.36,9.64). Hence, the π U -based CI is more than 30% wider than the reference CIs. This undesirable feature of π U -based CIs agrees with the results from the simulation study presented in Section 4.1.
Finally, Figure 5(b) presents the marginal posterior densities for p based on the two reference priors and π U . This figure sheds light on the reason for the striking difference between the reference-prior-based CIs and the π U -based CI. The problem with the π U -based analysis is that the right tail of the marginal posterior density for p decays too slowly. As a result, for the home video dataset the π U -based CI depends dramatically on the right-side truncation of the prior, which in this manuscript has been fixed at 10. Figure 5(b) makes it clear that a larger truncation point would have a large impact on the resulting π U -based CI for p. This dataset clearly illustrates the superiority of the Bayesian reference analyses.
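The width comparison can be verified directly from the interval endpoints reported above. The short sketch below only repeats that arithmetic; the variable names are illustrative.

```python
def width(interval):
    """Length of a (lower, upper) interval."""
    lower, upper = interval
    return upper - lower

# 95% credible intervals for p as reported in the text.
reference_cis = [(1.00, 7.01), (1.00, 7.18)]
uniform_ci = (1.36, 9.64)

for ci in reference_cis:
    excess = width(uniform_ci) / width(ci) - 1.0
    print(f"reference CI {ci}: uniform-prior CI is {100 * excess:.1f}% wider")
```

The widths are 6.01 and 6.18 for the reference intervals against 8.28 for the π U -based interval, so the latter is roughly 38% and 34% wider, in line with the "more than 30%" stated in the text.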
We have developed Bayesian reference analysis for linear models with exponential power errors. Specifically, we have developed three reference priors that lead to useful proper posterior distributions. In addition, we have shown through a simulation study that these priors yield procedures that have better frequentist properties than procedures resulting from a competing noninformative prior. Finally, we have illustrated our Bayesian reference analysis methodology with two real-world applications that highlight the flexibility of the exponential power distribution to accommodate both cases when there are outliers in the dataset and cases when the errors follow a platykurtic distribution.
The fact that the reference priors we have obtained for the EP regression model lead to proper posterior distributions is of substantial theoretical interest. The propriety of these reference posterior distributions contrasts with the impropriety of the posterior distribution associated with the Jeffreys-rule prior found by Salazar et al. (2012). Moreover, Salazar et al. (2012) found two independence Jeffreys priors, one of which leads to an improper posterior distribution whereas the other leads to a proper posterior distribution. We have found that the independence Jeffreys prior that yields a proper posterior distribution coincides with one of our reference priors. Further, the independence Jeffreys prior that yields a useless improper posterior distribution differs only by a factor of p^{−1/2} from our other reference prior. However, this difference is enough to make our reference prior yield a useful proper posterior distribution.
Our results motivate many possible directions for future research. First, an open question is whether there exist general conditions under which reference priors yield proper posterior distributions. In addition, the existence of general conditions for posterior propriety may be investigated for Jeffreys-rule and independence Jeffreys priors. The search for general conditions for posterior propriety may benefit from our present work on EP regression and previous literature on examples of impropriety of posterior distributions for distinct objective Bayes priors (Berger et al. 2001; Ferreira and De Oliveira 2007; Salazar et al. 2012; Wasserman 2000).
We have considered the frequentist properties of the proposed Bayesian approaches via a simulation study. In particular, we have shown that credible intervals based on the three reference priors have similar frequentist properties with coverage close to nominal for p and σ. This is a reflection of the fact that for any prior satisfying some regularity conditions the frequentist coverage of credible intervals and the nominal level agree up to O(n^{−1/2}) (for a discussion and conditions, see Ghosh et al. 2006). A prior that leads to a more stringent agreement of order O(n^{−1}) is called a first-order probability matching prior. Such priors have to be derived with a specific parameter of interest in mind, and their derivation is far from trivial. Therefore, promising directions for future research for the EP regression model would be the derivation of priors that lead to Bayesian predictions that have approximate frequentist validity (Datta et al. 2000b) and the derivation of first-order probability matching priors (Datta and Ghosh 1995; Datta et al. 2000a).
Proof of Theorem 1. To prove Theorem 1, we follow the methodology to obtain reference priors proposed by Berger and Bernardo (1992a). In particular, we assume that the reader is familiar with both the notation and the methodology of Berger and Bernardo (1992a). This proof is divided into two parts. In the first part, we obtain the reference prior for the orderings (β,σ,p), (σ,β,p), and (σ,p,β). Because the proofs are analogous for each of these three orderings, in the first part we obtain the reference prior for the ordering (σ,β,p). In the second part, we obtain the reference prior for the orderings (β,p,σ), (p,β,σ), and (p,σ,β). Because the proofs are analogous for each of these three orderings, in the second part we obtain the reference prior for the ordering (p,β,σ).
Part 1. Consider the ordering θ=(σ,β,p).
Let θ(1)=σ, θ(2)=β, and θ(3)=p. In addition, let θ[1]=θ(1)=σ, θ[2]=(θ(1),θ(2))=(σ,β), and θ[3]=(θ(1),θ(2),θ(3))=(σ,β,p). Moreover, let θ[∼1]=(θ(2),θ(3))=(β,p) and θ[∼2]=(θ(3))=p. Further, consider the following compact sets: for σ, ; for β, ; for p, .
does not depend on θ=(σ,β,p).
which is of the form (4).
Part 2. Consider the ordering θ=(p,β,σ).
Let θ(1)=p, θ(2)=β, and θ(3)=σ. In addition, let θ[1]=θ(1)=p, θ[2]=(θ(1),θ(2))=(p,β), and θ[3]=(θ(1),θ(2),θ(3))=(p,β,σ). Moreover, let θ[∼1]=(θ(2),θ(3))=(β,σ) and θ[∼2]=(θ(3))=σ. Further, consider the following compact sets: for p, ; for β, ; for σ, .
which is of the form (4).
The work of Ferreira was supported in part by National Science Foundation Grant DMS-0907064. The authors gratefully acknowledge the constructive comments and suggestions made by three anonymous referees that led to a substantially improved article.
- Achcar JA, Pereira GA: Use of exponential power distributions for mixture models in the presence of covariates. J. Appl. Stat 26(6):669–679. 1999
- Berger JO, Bernardo JM: On the development of the reference prior method. In Bayesian Statistics 4. Edited by: Bernardo JM, Berger JO, Dawid AP, Smith AFM. London: Oxford University Press; 1992a
- Berger JO, Bernardo JM: Ordered group reference priors with applications to a multinomial problem. Biometrika 79: 25–37. 1992b
- Berger JO, Bernardo JM: Reference priors in a variance components problem. In Bayesian Analysis in Statistics and Econometrics. Edited by: Goel PK, Iyengar NS. Berlin: Springer; 1992c
- Berger JO, de Oliveira V, Sansó B: Objective Bayesian analysis of spatially correlated data. J. Am. Stat. Assoc 96(456):1361–1374. 2001
- Bernardo JM: Reference posterior distribution for Bayes inference. J. Roy. Stat. Soc. B 41: 113–147. 1979
- Bernardo JM, Smith AFM: Bayesian Theory. Wiley, New York; 1994
- Box GEP, Tiao GC: A further look at robustness via Bayes’s theorem. Biometrika 49: 419–432. 1962
- Box GEP, Tiao GC: Bayesian Inference in Statistical Analysis. Wiley-Interscience, Hoboken; 1992
- Choy STB, Smith AFM: On robust analysis of a normal location parameter. J. Roy. Stat. Soc. B 59(2):463–474. 1997
- Cribari-Neto F, Ferrari SLP, Cordeiro GM: Improved heteroscedasticity-consistent covariance matrix estimators. Biometrika 87: 907–918. 2000
- Datta GS, Ghosh JK: Noninformative priors for maximal invariant parameter in group models. Test 4: 95–114. 1995
- Datta GS, Ghosh M, Mukerjee R: Some new results on probability matching priors. Bull. Calcutta Stat. Assoc 50(199–200):179–192. 2000a
- Datta GS, Mukerjee R, Ghosh M, Sweeting TJ: Bayesian prediction with approximate frequentist validity. Ann. Stat 28: 1414–1426. 2000b
- Ferreira MAR, De Oliveira V: Bayesian reference analysis for Gaussian Markov Random Fields. J. Multivariate Anal 98: 789–812. 2007
- Ferreira MAR, Suchard MA: Bayesian analysis of elapsed times in continuous-time Markov chains. Can. J. Stat 36: 355–368. 2008
- Fonseca TCO, Ferreira MAR, Migon HS: Objective Bayesian analysis for the Student-t regression model. Biometrika 95(2):325–333. 2008
- Greene WH: Econometric Analysis. Prentice-Hall, Upper Saddle River; 1997
- Ghosh JK, Delampady M, Samanta T: An Introduction to Bayesian Statistics – Theory and Methods. Springer, New York; 2006
- Jeffreys H: Theory of Probability. Oxford University Press, Oxford; 1961
- Levine DM, Krehbiel TC, Berenson ML: Business Statistics: A First Course. Pearson Prentice Hall, Upper Saddle River; 2006
- Liang F, Liu C, Wang N: A robust sequential Bayesian method for identification of differentially expressed genes. Statistica Sinica 17: 571–597. 2007
- Salazar E, Ferreira MAR, Migon HS: Objective Bayesian analysis for exponential power regression models. Sankhya - Series B 74: 107–125. 2012
- Sun D, Berger JO: Objective Bayesian analysis for the multivariate normal model. In Bayesian Statistics 8. Edited by: Bernardo JM, Bayarri MJ, Berger JO, Dawid AP, Heckerman D, Smith AFM, West M. Oxford: Oxford University Press; 2007
- Vianelli S: La misura della variabilità condizionata in uno schema generale delle curve normali di frequenza. Statistica 23: 447–474. 1963
- Walker SG, Gutiérrez-Peña E: Robustifying Bayesian procedures. In Bayesian Statistics 6. New York: Oxford University Press; 1999
- Wasserman L: Asymptotic inference for mixture models using data-dependent priors. J. Roy. Stat. Soc. B 62: 159–180. 2000
- West M: Outlier models and prior distributions in Bayesian linear regression. J. Roy. Stat. Soc. B 46: 431–439. 1984
- West M: On scale mixtures of normal distributions. Biometrika 74: 646–648. 1987
- Zhu D, Zinde-Walsh V: Properties and estimation of asymmetric exponential power distribution. J. Econometrics 148: 86–99. 2009
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.