Simple robust parameter estimation for the Birnbaum-Saunders distribution
- Min Wang^{1},
- Chanseok Park^{2} and
- Xiaoqian Sun^{3}
https://doi.org/10.1186/s40488-015-0038-4
© Wang et al. 2015
Received: 3 September 2015
Accepted: 16 November 2015
Published: 1 December 2015
Abstract
We study the problem of robust estimation for the two-parameter Birnbaum-Saunders distribution. It is well known that the maximum likelihood estimator (MLE) is efficient when the underlying model is true, but it is also quite sensitive to the data contamination that is often encountered in practice. In this paper, we propose several estimators which have simple closed forms and are also robust to data contamination. We study the breakdown points and asymptotic properties of the proposed estimators. These estimators are then applied to both simulated and real datasets. Numerical results show that the proposed estimators are attractive alternatives to the MLE in that they are quite robust to data contamination and also highly efficient when the underlying model is true.
Introduction
The two-parameter Birnbaum-Saunders (BS) distribution, introduced by Birnbaum and Saunders (1969a), has the cumulative distribution function (cdf) \(F(t;\alpha,\beta)=\Phi \left \{ \frac {1}{\alpha }\left [\left (\frac {t}{\beta }\right)^{1/2}-\left (\frac {\beta }{t}\right)^{1/2}\right ]\right \}\) for t>0, α>0, and β>0, where Φ(·) is the cdf of the standard normal distribution, and the parameters α and β are the shape and scale parameters, respectively. This distribution has been widely applied to a variety of quality and reliability engineering problems. For example, Bhattacharyya and Fries (1982) developed fatigue failure models and also discussed the intrinsic relation between the inverse Gaussian distribution and the BS distribution. Using the inverse Gaussian approximation to the BS distribution, Park and Padgett (2006) developed various new cumulative damage models and degradation models. Lio and Park (2008) developed a bootstrap control chart based on the BS distribution.
Estimation of the BS parameters is of great interest to researchers and has recently received much attention in the literature. Birnbaum and Saunders (1969b) first discussed the MLEs of α and β. Engelhardt et al. (1981) derived the asymptotic joint distribution of the MLEs and showed that the MLEs are asymptotically independent. Achcar (1993) studied Bayesian estimators based on Jeffreys’ prior and the reference prior and adopted Laplace’s approximations to the posterior marginal distributions of the parameters of interest. Ng et al. (2003) proposed the method-of-moments estimator (MME) of the two parameters. All of the above-mentioned approaches, except for the MME, have no explicit closed-form solutions, so numerical methods are required to obtain these estimators.
Data quality is extremely important for parameter estimation, and complete data without any contamination are always preferred for accurate estimation. Unfortunately, in the engineering sciences, there is no guarantee that the collected data exactly follow the assumed model. In other words, reliability engineers often face data that deviate from the assumed model, especially when studying failure data. Commonly used estimators such as the MLEs are very sensitive to the data contamination and model departure that are encountered in many practical situations. Small deviations may have a large impact on these estimators, and a single outlier can even make them break down. This phenomenon has motivated many authors to study robust estimation for various distributions; see, for example, Agostinelli et al. (2014), Boudt et al. (2011), and Lawson et al. (1997), among others. Of particular note is that most of this work focuses on robust estimation for the Weibull distribution, whereas work on robust estimation for the BS distribution is scant, even though the BS distribution is prevalent in the engineering sciences as an effective model for fatigue data.
It is worth mentioning that Dupuis and Mills (1998) developed robust estimation procedures based on an optimal bias-robust estimator (OBRE) for the BS distribution, but the OBRE has no explicit form and may suffer from convergence problems. These observations give rise to the need for alternative estimators that are easy for practitioners to calculate and are also quite robust against a certain amount of data contamination. Recently, Wang et al. (2013) developed a new method for robustly estimating the parameters of the BS distribution, but their method is somewhat complicated for practitioners.
In this paper, we propose several alternative estimators for the BS distribution and study the asymptotic properties of the proposed estimators and their breakdown points. These estimators are explicit closed-form expressions of the sample observations and are thus easy to calculate without any iterative computation. Several of the proposed estimators have a very high breakdown point. Here, the breakdown point, a criterion often used to measure the robustness of an estimator, is defined as the proportion of incorrect observations (i.e., arbitrarily small or large observations) that the estimators of α and β can handle before giving estimated values arbitrarily close to zero or infinity. As an illustration, the median has a breakdown point of 50 %, whereas the mean has a breakdown point of 0 %. The results of numerical studies show that the proposed estimators are attractive alternatives to the MLE in that they are quite robust to data contamination and also highly efficient when the underlying model is true.
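The contrast between the mean and the median can be checked numerically. The following is a minimal sketch, not part of the original study; the helper `contaminate` and the toy sample are hypothetical and serve only to illustrate the notion of a breakdown point.

```python
import statistics

def contaminate(sample, k, bad_value):
    """Return a sorted copy of `sample` whose k largest values are replaced by `bad_value`."""
    ordered = sorted(sample)
    return ordered[: len(ordered) - k] + [bad_value] * k

clean = [float(v) for v in range(1, 12)]    # toy sample 1, 2, ..., 11 (n = 11)

# A single wild observation drags the mean arbitrarily far ...
one_bad = contaminate(clean, 1, 1e9)
print(statistics.mean(clean), statistics.mean(one_bad))
# ... while the median is unchanged.
print(statistics.median(clean), statistics.median(one_bad))

# The median only fails once more than half of the sample is corrupted.
five_bad = contaminate(clean, 5, 1e9)       # 5 of 11 corrupted: median still 6
six_bad = contaminate(clean, 6, 1e9)        # 6 of 11 corrupted: median is the bad value
print(statistics.median(five_bad), statistics.median(six_bad))
```

With n = 11, the median tolerates five corrupted values but not six, in line with a breakdown point approaching 50 % as n grows.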
The remainder of this paper is organized as follows. In Section 2, we first present a summary of the BS distribution and the MLEs, and then provide several alternative estimators in explicit closed-form expressions. We also investigate the asymptotic properties of these estimators along with their breakdown points. Section 3 contains an extensive Monte Carlo simulation comparing the behavior of the proposed estimators with that of the MLEs. In Section 4, we illustrate the practical application of all the estimators under consideration using a real dataset. Finally, some concluding remarks and discussions are given in Section 5.
Parameter estimation
It is well known that the BS distribution has many attractive properties. (i) The scale parameter β is the median of the distribution, that is, F(β)=Φ(0)=0.5; (ii) it is positively skewed, with the degree of skewness decreasing as α decreases; (iii) for any constant κ>0, it follows that κ T∼BS(α, κ β); (iv) its reciprocal property (Saunders 1974) holds, that is, T ^{−1}∼BS(α, β ^{−1}); and (v) if we make the transformation T=β[1+2X ^{2}+2X(1+X ^{2})^{1/2}], then X is normally distributed with mean zero and variance α ^{2}/4.
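Property (v) gives a direct way to simulate BS variates, and properties (i) and (iv) can then be checked empirically. The sketch below is illustrative only; the function name `rbs`, the seed, and the parameter values are assumptions made for this example.

```python
import math
import random
import statistics

def rbs(n, alpha, beta, rng):
    """Draw n BS(alpha, beta) variates via property (v), with X ~ N(0, alpha^2 / 4)."""
    out = []
    for _ in range(n):
        x = rng.gauss(0.0, alpha / 2.0)     # standard deviation alpha / 2
        out.append(beta * (1.0 + 2.0 * x * x + 2.0 * x * math.sqrt(1.0 + x * x)))
    return out

rng = random.Random(2015)                   # arbitrary seed for reproducibility
sample = rbs(20000, alpha=0.5, beta=2.0, rng=rng)

# Property (i): the sample median should be close to beta = 2.
med = statistics.median(sample)
# Property (iv): 1/T ~ BS(alpha, 1/beta), so the median of 1/T should be close to 0.5.
med_recip = statistics.median(1.0 / t for t in sample)
print(med, med_recip)
```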
Because the uniqueness of the solution for Eq. (2) is guaranteed in the interval (r, s), a one-dimensional root search is adopted in this paper to compute the MLEs instead of their methods; see, for example, Lio and Park (2008).
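A one-dimensional root search of this kind can be sketched as follows. The likelihood equation is not reproduced in this text, so the code assumes the form commonly given in the literature (e.g., Ng et al. 2003): with s the arithmetic mean, r the harmonic mean, and K(β) the harmonic mean of β+t_i, the MLE of β solves β²−β[2r+K(β)]+r[s+K(β)]=0 on (r, s), and the MLE of α is [s/β+β/r−2]^{1/2}. Plain bisection is used here as one possible root search; it is not necessarily the routine used by the authors.

```python
import math

def bs_mle(t, tol=1e-12):
    """MLEs (alpha, beta) of the BS parameters via bisection on the interval (r, s)."""
    n = len(t)
    s = sum(t) / n                               # arithmetic mean
    r = n / sum(1.0 / ti for ti in t)            # harmonic mean

    def K(b):                                    # harmonic mean of (b + t_i)
        return n / sum(1.0 / (b + ti) for ti in t)

    def g(b):                                    # assumed likelihood equation for beta
        return b * b - b * (2.0 * r + K(b)) + r * (s + K(b))

    lo, hi = r, s                                # g(r) > 0 > g(s) whenever r < s
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    # Guard against a tiny negative argument caused by rounding.
    alpha = math.sqrt(max(0.0, s / beta + beta / r - 2.0))
    return alpha, beta
```

Because the sign of g changes exactly once on (r, s), bisection converges to the unique root without needing a starting value.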
Although the MLEs enjoy several optimal properties, they have no explicit expressions and are highly sensitive to model departure that often occurs in practice. These observations motivate us to develop alternative estimators. Specifically, we propose several alternative estimators in the following two subsections. One is dedicated to developing two estimators of β; the other is to develop two estimators of α. The proposed estimators are explicit closed-form expressions in terms of the sample observations and can thus be easily computed in practical situations. Furthermore, it can be shown that some of them are asymptotically normally distributed and have the highest breakdown point of 50 %.
The estimators of the parameter β
This estimator also has a breakdown point of 29 %. Note that the estimator \(\hat \beta ^{\text {HL}}\) approximately follows the log-normal distribution and that the estimator \(\hat \beta ^{\text {HL}}\) is biased, but consistent for β.
In what follows, we will adopt the above two estimators of β to develop the estimator of α due to their simplicity and computational ease as well as high breakdown points.
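Both estimators of β are simple to compute. The defining formulas are not reproduced in this text, so the sketch below rests on assumptions: the median estimator is taken to be the sample median (motivated by property (i)), and the HL estimator is taken to be the Hodges-Lehmann location estimate of log T, exponentiated, using pairwise averages over i ≤ j. The exact pairing convention in the paper may differ.

```python
import math
import statistics

def beta_median(t):
    """Median estimator of beta: the sample median (beta is the BS median)."""
    return statistics.median(t)

def beta_hl(t):
    """Hodges-Lehmann-type estimator of beta computed on the log scale.

    Assumption: pairwise (Walsh) averages are taken over all pairs i <= j.
    """
    logs = [math.log(ti) for ti in t]
    n = len(logs)
    walsh = [0.5 * (logs[i] + logs[j]) for i in range(n) for j in range(i, n)]
    return math.exp(statistics.median(walsh))

toy = [1.0, 2.0, 4.0, 8.0, 1000.0]          # hypothetical sample with one gross outlier
print(beta_median(toy), beta_hl(toy))
```

Working on the log scale makes the HL-type estimate respect the reciprocal property: applying it to the reciprocals of the data returns the reciprocal of the estimate.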
The estimators for the parameter α
Here, we propose two robust estimators for the shape parameter α.
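The defining formulas are not reproduced in this text. As an illustration only, the sketch below shows one natural interquartile-range construction that follows from property (v): since \(Z=(T/\beta)^{1/2}-(\beta /T)^{1/2}\) is normal with mean zero and variance α ^{2}, the interquartile range of Z divided by 2Φ^{−1}(0.75)≈1.349 estimates α. The function name `alpha_iqr` and the choice of sample quartiles (linear interpolation between order statistics) are assumptions.

```python
import math
import statistics

def alpha_iqr(t, beta):
    """IQR-type estimate of alpha, given an estimate `beta` of the scale parameter."""
    # Under the BS model, z_i is approximately N(0, alpha^2) when beta is close to the truth.
    z = [math.sqrt(ti / beta) - math.sqrt(beta / ti) for ti in t]
    q1, _, q3 = statistics.quantiles(z, n=4, method="inclusive")
    # The IQR of N(0, alpha^2) equals 2 * Phi^{-1}(0.75) * alpha.
    return (q3 - q1) / (2.0 * statistics.NormalDist().inv_cdf(0.75))
```

Because the estimate depends on the data only through the ratios t_i/β, it is unchanged when the data and the scale estimate are multiplied by the same constant.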
Monte Carlo simulations
To illustrate the proposed methods, we conduct two Monte Carlo simulation studies: one without contamination and one involving several kinds of contamination.
Numerical results without contamination
We carry out Monte Carlo simulations to compare the performance of the estimators under consideration. We take the sample sizes n=10,50,100 and the shape parameters α=0.5,1.0,2.0. Since β is the scale parameter, its value is kept fixed at β=1.0 without loss of generality.
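A simulation of this kind can be sketched as follows for the median estimator of β. This is an illustration under assumptions rather than the code used for the reported results: the number of replications (2000), the seed, and the function names are arbitrary choices, and samples are generated through the normal transformation in property (v).

```python
import math
import random
import statistics

def rbs(n, alpha, beta, rng):
    """Draw n BS(alpha, beta) variates using X ~ N(0, alpha^2 / 4)."""
    xs = (rng.gauss(0.0, alpha / 2.0) for _ in range(n))
    return [beta * (1.0 + 2.0 * x * x + 2.0 * x * math.sqrt(1.0 + x * x)) for x in xs]

def bias_rmse(n, alpha, beta=1.0, reps=2000, seed=1):
    """Average bias and RMSE of the sample median as an estimator of beta."""
    rng = random.Random(seed)
    errors = [statistics.median(rbs(n, alpha, beta, rng)) - beta for _ in range(reps)]
    bias = statistics.fmean(errors)
    rmse = math.sqrt(statistics.fmean(e * e for e in errors))
    return bias, rmse

for n in (10, 50, 100):
    print(n, bias_rmse(n, alpha=0.5))
```

Other estimators can be compared in the same loop by replacing `statistics.median` with the estimator of interest.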
- (i) The average bias and RMSE of all the estimators significantly decrease as n increases. As expected, for large sample sizes, the performance of the proposed estimators and that of the MLEs are very close in terms of the average bias and RMSE.

Table 1 Average bias and RMSE (in parentheses) of estimates for α and β

n | α | \(\hat \alpha ^{\text {MLE}}\) | \(\hat \alpha ^{\text {IQR}}_{1}\) | \(\hat \alpha ^{\text {RC}}_{1}\) | \(\hat \beta ^{\text {MLE}}\) | \(\hat \beta ^{\mathrm {M}}\) |
---|---|---|---|---|---|---|
10 | 0.5 | −0.039 | −0.065 | −0.000 | −0.012 | −0.022 |
 | | (0.117) | (0.172) | (0.150) | (0.156) | (0.192) |
 | 1.0 | −0.087 | −0.136 | −0.017 | −0.040 | −0.080 |
 | | (0.235) | (0.343) | (0.298) | (0.299) | (0.412) |
 | 2.0 | −0.186 | −0.273 | −0.088 | −0.105 | −0.331 |
 | | (0.481) | (0.726) | (0.597) | (0.534) | (1.065) |
50 | 0.5 | −0.007 | −0.015 | −0.000 | −0.002 | −0.003 |
 | | (0.051) | (0.081) | (0.059) | (0.068) | (0.086) |
 | 1.0 | −0.017 | −0.029 | −0.003 | −0.008 | −0.017 |
 | | (0.101) | (0.163) | (0.116) | (0.128) | (0.179) |
 | 2.0 | −0.035 | −0.060 | −0.021 | −0.018 | −0.060 |
 | | (0.206) | (0.334) | (0.238) | (0.195) | (0.379) |
100 | 0.5 | −0.004 | −0.007 | −0.001 | −0.001 | −0.002 |
 | | (0.036) | (0.058) | (0.041) | (0.048) | (0.063) |
 | 1.0 | −0.008 | −0.014 | −0.001 | −0.004 | −0.007 |
 | | (0.070) | (0.116) | (0.080) | (0.088) | (0.124) |
 | 2.0 | −0.019 | −0.028 | −0.011 | −0.010 | −0.033 |
 | | (0.144) | (0.234) | (0.163) | (0.135) | (0.260) |

Table 2 Average bias and RMSE (in parentheses) of estimates for α and β

n | α | \(\hat \alpha ^{\text {MLE}}\) | \(\hat \alpha ^{\text {IQR}}_{2}\) | \(\hat \alpha ^{\text {RC}}_{2}\) | \(\hat \beta ^{\text {MLE}}\) | \(\hat \beta ^{\text {HL}}\) |
---|---|---|---|---|---|---|
10 | 0.5 | −0.039 | −0.066 | −0.000 | −0.012 | −0.014 |
 | | (0.117) | (0.171) | (0.149) | (0.156) | (0.164) |
 | 1.0 | −0.087 | −0.141 | −0.017 | −0.040 | −0.048 |
 | | (0.235) | (0.340) | (0.297) | (0.299) | (0.332) |
 | 2.0 | −0.186 | −0.314 | −0.078 | −0.105 | −0.165 |
 | | (0.481) | (0.700) | (0.595) | (0.534) | (0.714) |
50 | 0.5 | −0.007 | −0.015 | −0.000 | −0.002 | −0.002 |
 | | (0.051) | (0.081) | (0.059) | (0.068) | (0.071) |
 | 1.0 | −0.017 | −0.031 | −0.003 | −0.008 | −0.010 |
 | | (0.101) | (0.162) | (0.116) | (0.128) | (0.139) |
 | 2.0 | −0.035 | −0.071 | −0.018 | −0.018 | −0.029 |
 | | (0.206) | (0.330) | (0.238) | (0.195) | (0.256) |
100 | 0.5 | −0.004 | −0.004 | −0.000 | −0.001 | −0.001 |
 | | (0.036) | (0.058) | (0.041) | (0.048) | (0.051) |
 | 1.0 | −0.008 | −0.015 | −0.001 | −0.004 | −0.004 |
 | | (0.070) | (0.115) | (0.080) | (0.088) | (0.096) |
 | 2.0 | −0.019 | −0.034 | −0.009 | −0.010 | −0.016 |
 | | (0.144) | (0.233) | (0.163) | (0.135) | (0.176) |
- (ii) The RC estimator \(\hat \alpha ^{\text {RC}}_{j}\) for j=1,2 outperforms the others in terms of the average bias in all the cases considered. Note also that estimation of α using the HL estimator \(\hat \beta ^{\text {HL}}\) is superior to estimation using the median estimator \(\hat \beta ^{\mathrm {M}}\) in terms of the RMSE.
- (iii) For the cases without contamination, the estimator \(\hat \beta ^{\text {MLE}}\) performs the best, as expected. We observe that the HL estimator \(\hat \beta ^{\text {HL}}\) performs better than the median estimator \(\hat \beta ^{\mathrm {M}}\). All the estimators of β become closer together as n increases.
Numerical results with contamination
- Model 1
A model with no contamination.
- Model 2
A model with 5 % of severe contamination; the upper 5 % of order statistics are multiplied by 5.
- Model 3
A model with 5 % of severe contamination; the lower 5 % of order statistics are multiplied by 1/5.
- Model 4
A model with 5 % of more extreme contamination from a point mass distribution at 50.
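The four models can be mimicked by a small helper that perturbs a simulated sample. This is a sketch under assumptions: the function name `contaminate` is hypothetical, and for Model 4 the text does not say which observations are replaced by the point mass, so the largest 5 % are replaced here.

```python
def contaminate(sample, model, frac=0.05):
    """Return a sorted, contaminated copy of `sample` under Models 1-4."""
    x = sorted(sample)
    n = len(x)
    k = int(round(frac * n))                 # number of contaminated observations
    if model == 1:                           # no contamination
        return x
    if model == 2:                           # upper 5 % of order statistics times 5
        return x[: n - k] + [5.0 * v for v in x[n - k:]]
    if model == 3:                           # lower 5 % of order statistics times 1/5
        return [v / 5.0 for v in x[:k]] + x[k:]
    if model == 4:                           # point mass at 50 (assumed: replaces the largest 5 %)
        return x[: n - k] + [50.0] * k
    raise ValueError("model must be 1, 2, 3 or 4")
```

For a sample of size n=100, each contaminated model alters exactly five observations.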
Table 3 Average bias and RMSE (in parentheses) of estimates for α and β with n=100 under the four models
Estimator of α | Estimator of β | ||||||
---|---|---|---|---|---|---|---|
\(\hat \alpha ^{\text {MLE}}\) | \(\hat \alpha ^{\text {IQR}}_{1}\) | \(\hat \alpha ^{\text {RC}}_{1}\) | \(\hat \alpha ^{\text {IQR}}_{2}\) | \(\hat \alpha ^{\text {RC}}_{2}\) | \(\hat \beta ^{\text {MLE}}\) | \(\tilde \beta ^{\mathrm {M}}\) | \(\hat \beta ^{\text {HL}}\) |
Model 1 | |||||||
−0.0038 | −0.0068 | −0.0001 | −0.0040 | −0.0001 | 0.0013 | 0.0024 | 0.0014 |
(0.0353) | (0.0583) | (0.0405) | (0.0445) | (0.0405) | (0.0483) | (0.0625) | (0.0504) |
Model 2 | |||||||
0.3424 | −0.0068 | 0.0075 | 0.9484 | 0.0075 | 0.2512 | 0.0024 | 0.0040 |
(0.3452) | (0.9570) | (0.3742) | (0.0583) | (0.0426) | (0.2593) | (0.0625) | (0.0511) |
Model 3 | |||||||
0.3423 | −0.0068 | 0.0075 | 0.0139 | 0.0075 | −0.1983 | 0.0024 | −0.0011 |
(0.3451) | (0.0583) | (0.0426) | (0.0448) | (0.0426) | (0.2026) | (0.0625) | (0.0507) |
Model 4 | |||||||
0.8931 | 0.0253 | 0.0523 | 2.4594 | 0.0523 | 1.0944 | 0.0364 | 0.0501 |
(0.8936) | (0.0659) | (0.0693) | (2.4607) | (0.0693) | (1.0957) | (0.0751) | (0.0736) |
- (i) For Model 1, that is, when there is no contamination, the RC estimator \(\hat \alpha ^{\text {RC}}_{j}\) for j=1,2 performs the best for estimating α in terms of the average bias; the MLE \(\hat \beta ^{\text {MLE}}\) is the best one for estimating β, but the HL estimator \(\hat \beta ^{\text {HL}}\) behaves very similarly.
- (ii) For Models 2, 3, and 4, that is, when contamination is present in the dataset, we observe that contamination has a large influence on the average bias and RMSE of the non-robust estimators, including the MLEs, especially in the presence of extreme outliers as in Model 4, whereas it has a smaller impact on the proposed estimators.
- (iii) For the scale parameter estimation, the HL estimator \(\hat \beta ^{\text {HL}}\) outperforms the median estimator in terms of the RMSE, although both are quite robust against data contamination.
Which of the two robust estimators, the HL estimator or the median estimator, is preferable for the parameter β in the analysis of real lifetime data? Numerical results show that for all the cases considered in this paper, the HL estimator \(\hat \beta ^{\text {HL}}\) outperforms the median estimator \(\hat \beta ^{\mathrm {M}}\) in terms of the RMSE. Additionally, the estimator of α based on \(\hat \beta ^{\text {HL}}\) also slightly outperforms the one using \(\hat \beta ^{\mathrm {M}}\) in most cases. We therefore recommend the HL estimator for β. It should be mentioned that simulations with several other values of the parameter α and different sample sizes have also been conducted; the conclusions are quite similar and are thus omitted here for brevity.
An illustrative example
Table 4 Fatigue lifetime data by Birnbaum and Saunders (1969b)
70 | 90 | 96 | 97 | 99 | 100 | 103 | 104 | 104 | 105 | 107 | 108 | 108 | 108 | 109 | 109 | 112 |
112 | 113 | 114 | 114 | 114 | 116 | 119 | 120 | 120 | 120 | 121 | 121 | 123 | 124 | 124 | 124 | 124 |
124 | 128 | 128 | 129 | 129 | 130 | 130 | 130 | 131 | 131 | 131 | 131 | 131 | 132 | 132 | 132 | 133 |
134 | 134 | 134 | 134 | 134 | 136 | 136 | 137 | 138 | 138 | 138 | 139 | 139 | 141 | 141 | 142 | 142 |
142 | 142 | 142 | 142 | 144 | 144 | 145 | 146 | 148 | 148 | 149 | 151 | 151 | 152 | 155 | 156 | 157 |
157 | 157 | 157 | 158 | 159 | 162 | 163 | 163 | 164 | 166 | 166 | 168 | 170 | 174 | 196 | 212 |
Table 5 Comparison between the developed estimators and the MLE through fatigue lifetime data by Birnbaum and Saunders (1969b)
Method | α (Table 4 data) | β (Table 4 data) | α (misrecorded data) | β (misrecorded data) |
---|---|---|---|---|
\((\hat \alpha ^{\text {MLE}}, ~\hat \beta ^{\text {MLE}})\) | 0.1704 | 131.8188 | 0.2415 | 134.7689 |
\((\hat \alpha ^{\text {IQR}}_{1}, ~\hat \beta ^{\mathrm {M}})\) | 0.1454 | 133.0000 | 0.1555 | 134.0000 |
\((\hat \alpha ^{\text {RC}}_{1}, ~\hat \beta ^{\mathrm {M}})\) | 0.1601 | 133.0000 | 0.1677 | 134.0000 |
\((\hat \alpha ^{\text {IQR}}_{2}, ~\hat \beta ^{\text {HL}})\) | 0.1454 | 132.6047 | 0.1555 | 132.8834 |
\((\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}})\) | 0.1601 | 132.6047 | 0.1677 | 132.8834 |
To evaluate the robustness of the proposed methods, we follow the same scenario as Dupuis and Mills (1998) and assume that the 51st observation t _{51} was misrecorded as 633 instead of 133. It is desirable that the estimated shape and scale parameters be very similar under the two scenarios, because we already know that the value recorded for t _{51} is a recording error. However, Table 5 shows that the MLEs are heavily distorted by this single outlier, resulting in \(\hat \alpha ^{\text {MLE}} = 0.2415\) and \(\hat \beta ^{\text {MLE}} = 134.7689\), which are far from the MLEs with t _{51}=133, whereas the proposed robust estimators \(\left (\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) and \(\left (\hat \alpha ^{\text {IQR}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) still provide more reasonable results, which are quite close to the estimated values with t _{51}=133.
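The effect of the misrecorded observation on the median-based estimates can be reproduced in a few lines. The sketch below is illustrative: it takes the sample median as the estimate of β and assumes an interquartile-range estimate of α built from \(Z=(T/\beta)^{1/2}-(\beta /T)^{1/2}\), which is normal with variance α ^{2} under the model; the function name `alpha_iqr` and the quartile convention are assumptions.

```python
import math
import statistics

# Fatigue lifetime data of Birnbaum and Saunders (1969b), n = 101, sorted.
clean = [70, 90, 96, 97, 99, 100, 103, 104, 104, 105, 107, 108, 108, 108, 109, 109, 112,
         112, 113, 114, 114, 114, 116, 119, 120, 120, 120, 121, 121, 123, 124, 124, 124, 124,
         124, 128, 128, 129, 129, 130, 130, 130, 131, 131, 131, 131, 131, 132, 132, 132, 133,
         134, 134, 134, 134, 134, 136, 136, 137, 138, 138, 138, 139, 139, 141, 141, 142, 142,
         142, 142, 142, 142, 144, 144, 145, 146, 148, 148, 149, 151, 151, 152, 155, 156, 157,
         157, 157, 157, 158, 159, 162, 163, 163, 164, 166, 166, 168, 170, 174, 196, 212]
bad = list(clean)
bad[50] = 633                                # t_51 misrecorded as 633 instead of 133

def alpha_iqr(t, beta):
    """Assumed IQR-type estimate of alpha given a scale estimate beta."""
    z = [math.sqrt(ti / beta) - math.sqrt(beta / ti) for ti in t]
    q1, _, q3 = statistics.quantiles(z, n=4, method="inclusive")
    return (q3 - q1) / (2.0 * statistics.NormalDist().inv_cdf(0.75))

results = {}
for name, t in (("Table 4 data", clean), ("misrecorded", bad)):
    beta = statistics.median(t)              # median estimator of beta
    results[name] = (beta, alpha_iqr(t, beta))
    print(name, beta, round(results[name][1], 4), round(statistics.mean(t), 2))
```

The sample mean is printed alongside for contrast: it shifts by about 5 when the outlier is introduced, while the median moves only from 133 to 134.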
Concluding remarks
In this paper, we have developed two families of estimators for the BS distribution, which are quite robust to data contamination. Unlike the MLEs, these estimators have simple closed-form expressions and higher breakdown points. For estimation of β, we prefer the HL estimator \(\hat \beta ^{\text {HL}}\), because numerical results show that it is more accurate than the median estimator \(\hat \beta ^{\mathrm {M}}\). Of all the considered estimators for α using the estimator \(\hat \beta ^{\text {HL}}\), we recommend the RC estimator \(\hat \alpha ^{\text {RC}}_{2}\), since it offers a good trade-off between efficiency and robustness. It is worth mentioning that the other proposed estimators of α are also attractive alternatives to the MLE in that they are highly efficient when the underlying model is true.
In summary, we prefer the RC and HL estimators \(\left (\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) for estimating (α, β), because this pair has been shown to be simple, very effective, and quite robust against the model departure that often occurs in practical situations. Note that censored data occur commonly in field data from reliability tests, so a possible extension of the proposed estimators to censored data will be investigated in the future.
Declarations
Acknowledgements
The authors thank the Editor Carl Lee and the two anonymous reviewers for their comments which have improved the appearance of this paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Achcar, JA: Inferences for the Birnbaum-Saunders fatigue life model using Bayesian methods. Comput. Statist. Data Anal. 15, 367–380 (1993).
- Agostinelli, C, Marazzi, A, Yohai, VJ: Robust estimators of the generalized loggamma distribution. Technometrics. 56, 92–101 (2014).
- Bhattacharyya, G, Fries, A: Fatigue failure models − Birnbaum-Saunders vs. inverse Gaussian. IEEE Trans. Reliab. 31, 439–441 (1982).
- Bickel, PJ, Lehmann, EL: Descriptive statistics for non-parametric models III: Dispersion. Ann. Statist. 4, 1139–1158 (1976).
- Birnbaum, ZW, Saunders, SC: A new family of life distributions. J. Appl. Probability. 6, 319–327 (1969a).
- Birnbaum, ZW, Saunders, SC: Estimation for a family of life distributions with applications to fatigue. J. Appl. Probability. 6, 328–347 (1969b).
- Boudt, K, Caliskan, D, Croux, C: Robust explicit estimators of Weibull parameters. Metrika. 73, 187–209 (2011).
- Dupuis, D, Mills, J: Robust estimation of the Birnbaum-Saunders distribution. IEEE Trans. Reliab. 47, 88–95 (1998).
- Engelhardt, M, Bain, LJ, Wright, FT: Inferences on the parameters of the Birnbaum-Saunders fatigue life distribution based on maximum likelihood estimation. Technometrics. 23, 251–256 (1981).
- Ghosh, JK: A new proof of the Bahadur representation of quantiles and an application. Ann. Math. Statist. 42, 1957–1961 (1971).
- Hodges, JL Jr, Lehmann, EL: Estimates of location based on rank tests. Ann. Math. Statist. 34, 598–611 (1963).
- Lawson, C, Keats, J, Montgomery, D: Comparison of robust and least-squares regression in computer-generated probability plots. IEEE Trans. Reliab. 46, 108–115 (1997).
- Lio, YL, Park, C: A bootstrap control chart for Birnbaum-Saunders percentiles. Qual. Reliability Eng Int. 24, 585–600 (2008).
- Ng, HKT, Kundu, D, Balakrishnan, N: Modified moment estimation for the two-parameter Birnbaum-Saunders distribution. Comput. Statist. Data Anal. 43, 283–298 (2003).
- Park, C, Padgett, W: Stochastic degradation models with several accelerating variables. IEEE Trans. Reliab. 55, 379–390 (2006).
- R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2011). ISBN 3-900051-07-0.
- Rieck, JR, Nedelman, JR: A log-linear model for the Birnbaum-Saunders distribution. Technometrics. 33, 51–60 (1991).
- Rousseeuw, PJ, Croux, C: Alternatives to the median absolute deviation. J. Amer. Statist. Assoc. 88, 1273–1283 (1993).
- Saunders, SC: A family of random variables closed under reciprocation. J. Amer. Statist. Assoc. 69, 533–539 (1974).
- Shamos, MI: Geometry and statistics: problems at the interface. In Algorithms and complexity (Proc. Sympos., Carnegie-Mellon Univ., Pittsburgh, Pa., 1976). Academic Press, New York (1976).
- Wang, M, Zhao, J, Sun, X, Park, C: Robust explicit estimation of the two-parameter Birnbaum-Saunders distribution. J Appl. Stat. 40, 2259–2274 (2013).