Open Access

Simple robust parameter estimation for the Birnbaum-Saunders distribution

Journal of Statistical Distributions and Applications 2015, 2:14

DOI: 10.1186/s40488-015-0038-4

Received: 3 September 2015

Accepted: 16 November 2015

Published: 1 December 2015

Abstract

We study the problem of robust estimation for the two-parameter Birnbaum-Saunders distribution. It is well known that the maximum likelihood estimator (MLE) is efficient when the underlying model is true but at the same time it is quite sensitive to data contamination that is often encountered in practice. In this paper, we propose several estimators which have simple closed forms and are also robust to data contamination. We study the breakdown points and asymptotic properties of the proposed estimators. These estimators are then applied to both simulated and real datasets. Numerical results show that the proposed estimators are attractive alternatives to the MLE in that they are quite robust to data contamination and also highly efficient when the underlying model is true.

Keywords

Hodges-Lehmann estimator; Rousseeuw-Croux estimator; Robustness; Breakdown point; Data contamination

Mathematics subject classification

62F10 62F12 62F35

Introduction

The two-parameter Birnbaum-Saunders (BS) distribution was originally proposed by Birnbaum and Saunders (1969a) as a failure time distribution describing the total time until the damage caused by the development and growth of a dominant crack reaches a critical level that causes fracture or failure. A random variable T is said to follow the BS distribution with parameters α and β if its cumulative distribution function (cdf) is given by
$$\begin{array}{@{}rcl@{}} F(t \mid \alpha, \beta) = \Phi\left[\frac{1}{\alpha}\left(\sqrt{\frac{t}{\beta}} - \sqrt{\frac{\beta}{t}}\right)\right], \quad 0 < t < \infty, \quad \alpha, ~\beta >0, \end{array} $$
(1)

where Φ(·) is the cdf of the standard normal distribution, and the parameters α and β are the shape and scale parameters, respectively. This distribution has been widely applied to a variety of quality and reliability engineering problems. For example, Bhattacharyya and Fries (1982) developed fatigue failure models and also discussed the intrinsic relation between the inverse Gaussian distribution and the BS distribution. Using the inverse Gaussian approximation to the BS distribution, Park and Padgett (2006) developed various new cumulative damage models and degradation models. Lio and Park (2008) developed a bootstrap control chart based on the BS distribution.

Estimation of the BS parameters is of great interest to researchers and has recently received much attention in the literature. Birnbaum and Saunders (1969b) first discussed the MLEs of α and β. Engelhardt et al. (1981) derived the asymptotic joint distribution of the MLEs and showed that the MLEs are asymptotically independent. Achcar (1993) studied the Bayesian estimators based on Jeffreys’ prior and the reference prior and adopted Laplace’s approximations to the posterior marginal distribution of the parameters of interest. Ng et al. (2003) proposed the modified moment estimator (MME) of the two parameters. All the above-mentioned approaches, except for the MME, have no explicit closed-form solutions, so numerical methods are often required to obtain these estimators.

The quality of data is extremely important for parameter estimation, and complete data without any contamination are always preferred in order to achieve high accuracy in parameter estimation. Unfortunately, in the engineering sciences, there is no guarantee that the collected data exactly follow the assumed model. In other words, reliability engineers often face data with deviations from the assumed model, especially when studying failure data. Note that the commonly used estimators such as the MLEs are very sensitive to data contamination and model departure that are often encountered in many practical situations. Small deviations may have a large impact on these estimators, and a single outlier can even make them break down. This phenomenon has motivated many authors to study robust estimation for various distributions; see, for example, Agostinelli et al. (2014), Boudt et al. (2011), Lawson et al. (1997), among others. Of particular note is that many researchers focus mainly on robust estimation for the Weibull distribution and that robust estimation for the BS distribution has received scant attention, even though the BS distribution is prevalent in the engineering sciences as an effective model for studying fatigue data.

It deserves mentioning that Dupuis and Mills (1998) developed robust estimation procedures based on an optimal bias-robust estimator (OBRE) for the BS distribution; however, the OBRE has no explicit form and may suffer from convergence problems. These observations give rise to the need for alternative estimators that are easy for practitioners to calculate and are also quite robust against a certain amount of data contamination. Recently, Wang et al. (2013) developed a new method for robustly estimating the parameters of the BS distribution, but their method is somewhat complicated for practitioners.

In this paper, we propose several alternative estimators for the BS distribution and study the asymptotic properties of the proposed estimators and their breakdown points. These estimators are explicit closed-form expressions of the sample observations and are thus easy to calculate without involving any computational complexity. Several of the proposed estimators have a very high breakdown point. Here, the breakdown point, a criterion often used to measure the robustness of an estimator, is defined as the proportion of incorrect observations (i.e., arbitrarily small or large observations) that the estimators of α and β can handle before giving estimated values arbitrarily close to zero or infinity. As an illustration, the median has a breakdown point of 50 %, whereas the mean has a breakdown point of 0 %. The results of numerical studies show that the proposed estimators are attractive alternatives to the MLE in that they are quite robust to data contamination and also highly efficient when the underlying model is true.
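The breakdown-point illustration for the mean and the median can be checked numerically. The following is a minimal sketch on made-up numbers; the helper name `corrupt` and the sample values are hypothetical and serve only to show the effect of replacing k observations by an arbitrarily large value.

```python
from statistics import mean, median


def corrupt(sample, k, value):
    """Return a copy of `sample` whose last k entries are replaced by `value`."""
    if k == 0:
        return list(sample)
    return list(sample[:-k]) + [value] * k


clean = [9.0, 10.0, 10.5, 11.0, 12.0]  # made-up lifetimes, not from the paper

for k in (0, 1, 2, 3):
    bad = corrupt(clean, k, 1e9)
    # The mean is dragged away by a single outlier; the median only moves
    # once more than half of the observations have been replaced.
    print(k, mean(bad), median(bad))
```

With five observations, the median is unchanged for k = 1 and k = 2 and fails only at k = 3, whereas the mean is already useless at k = 1.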

The remainder of this paper is organized as follows. In Section 2, we first present a summary of the BS distribution and the MLEs, and then provide several alternative estimators in explicit closed-form expressions. We also investigate the asymptotic properties of these estimators along with their breakdown points. Section 3 contains an extensive Monte Carlo simulation to compare the behavior of the proposed estimators with that of the MLEs. In Section 4, we illustrate the practical application of all the estimators under consideration using a real dataset. Finally, some concluding remarks and discussions are given in Section 5.

Parameter estimation

If a random variable T follows the BS distribution with the cdf in (1), then T has the probability density function (pdf) given by
$$f(t \mid \alpha, ~\beta) = \frac{1}{2\sqrt{2\pi}\alpha\beta}\left[\left(\frac{\beta}{t}\right)^{1/2}+ \left(\frac{\beta}{t}\right)^{3/2}\right]\exp\left\{ -\frac{1}{2\alpha^{2}}\left(\frac{t}{\beta} + \frac{\beta}{t}-2\right)\right\}. $$

It is well known that this distribution has many attractive properties: (i) the scale parameter β is the median of the distribution, that is, F(β)=Φ(0)=0.5; (ii) it is positively skewed, and the skewness decreases as α decreases; (iii) for any constant κ>0, it follows that \(\kappa T \sim \text{BS}(\alpha, \kappa \beta)\); (iv) the reciprocal property (Saunders 1974) holds, that is, \(T^{-1} \sim \text{BS}(\alpha, \beta^{-1})\); and (v) if we make the transformation \(T = \beta\left[1 + 2X^{2} + 2X(1+X^{2})^{1/2}\right]\), then X is normally distributed with mean zero and variance \(\alpha^{2}/4\).
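As a quick numerical sanity check of the cdf in (1), of the density obtained by differentiating it (with normalising constant \(1/(2\sqrt{2\pi}\alpha\beta)\)), and of properties (i), (iii) and (iv), one might use a sketch along the following lines. The function names `bs_cdf` and `bs_pdf` are hypothetical, and only the Python standard library is assumed.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for Phi


def bs_cdf(t, alpha, beta):
    """cdf (1) of the Birnbaum-Saunders distribution."""
    return STD_NORMAL.cdf((sqrt(t / beta) - sqrt(beta / t)) / alpha)


def bs_pdf(t, alpha, beta):
    """Density obtained by differentiating the cdf (1) with respect to t."""
    u = beta / t
    const = 1.0 / (2.0 * sqrt(2.0 * pi) * alpha * beta)
    return const * (u ** 0.5 + u ** 1.5) * exp(
        -(t / beta + beta / t - 2.0) / (2.0 * alpha ** 2)
    )


alpha, beta, t, kappa = 0.5, 2.0, 1.3, 3.0

# (i) beta is the median.
print(bs_cdf(beta, alpha, beta))
# (iii) scaling: P(kappa*T <= kappa*t) under BS(alpha, kappa*beta) equals F(t).
print(bs_cdf(kappa * t, alpha, kappa * beta), bs_cdf(t, alpha, beta))
# (iv) reciprocal property: P(1/T <= t) equals the BS(alpha, 1/beta) cdf at t.
print(1.0 - bs_cdf(1.0 / t, alpha, beta), bs_cdf(t, alpha, 1.0 / beta))
# The pdf agrees with a central finite difference of the cdf.
h = 1e-5
print((bs_cdf(t + h, alpha, beta) - bs_cdf(t - h, alpha, beta)) / (2 * h),
      bs_pdf(t, alpha, beta))
```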

We first provide a brief review of the MLEs for the BS distribution. Let \(\mathcal {T} = \{t_{1}, t_{2}, \cdots, t_{n}\}\) be a random sample of size n from the BS distribution. The sample arithmetic and harmonic means are given by
$$s = \frac{1}{n}\sum_{i=1}^{n} t_{i} \quad \text{and} \quad r = \left[\frac{1}{n}\sum_{i=1}^{n} t_{i}^{-1}\right]^{-1}, $$
respectively. Let h(·) be the harmonic mean function given by
$$h(x) = \left[\frac{1}{n}\sum_{i=1}^{n} (x +t_{i})^{-1}\right]^{-1}. $$
Then the MLE of β, denoted by \(\hat \beta ^{\text {MLE}}\), can be obtained as the unique positive root of the following equation
$$\begin{array}{@{}rcl@{}} \beta^{2} - \beta\!\left[2r + h(\beta)\right] + r\left[s + h(\beta)\right] = 0. \end{array} $$
(2)
Once \(\hat \beta ^{\text {MLE}}\) is obtained, the MLE of α, denoted by \(\hat \alpha ^{\text {MLE}}\), is given by
$$\hat\alpha^{\text{MLE}} = \left[\frac{s}{\hat \beta^{\text{MLE}}} + \frac{\hat \beta^{\text{MLE}}}{r} - 2\right]^{1/2}. $$

Because Eq. (2) is guaranteed to have a unique solution in the interval (r, s), we compute the MLEs in this paper by a one-dimensional root search over that interval rather than by the methods used in the earlier literature; see, for example, Lio and Park (2008).
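The one-dimensional root search can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain bisection on the interval (r, s), where the left-hand side of Eq. (2) equals r(s − r) > 0 at β = r and is negative at β = s, and the function name `bs_mle` is hypothetical.

```python
from math import sqrt


def bs_mle(t, iterations=200):
    """MLEs (alpha, beta) of the BS distribution via bisection on Eq. (2)."""
    n = len(t)
    s = sum(t) / n                      # arithmetic mean
    r = n / sum(1.0 / x for x in t)     # harmonic mean

    def h(x):
        # Harmonic mean function h(x) of the shifted sample x + t_i.
        return n / sum(1.0 / (x + ti) for ti in t)

    def g(b):
        # Left-hand side of Eq. (2).
        return b * b - b * (2.0 * r + h(b)) + r * (s + h(b))

    lo, hi = r, s                       # g(lo) >= 0 >= g(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    # max(..., 0) guards against tiny negative round-off when r == s.
    alpha = sqrt(max(s / beta + beta / r - 2.0, 0.0))
    return alpha, beta


# Toy sample that is invariant under t -> 4/t, so the root of Eq. (2) is 2.
print(bs_mle([1.0, 2.0, 4.0]))
```

For the toy sample {1, 2, 4}, direct substitution shows that β = 2 solves Eq. (2) exactly, which gives a convenient check of the search.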

Although the MLEs enjoy several optimal properties, they have no explicit expressions and are highly sensitive to model departure that often occurs in practice. These observations motivate us to develop alternative estimators. Specifically, we propose several alternative estimators in the following two subsections. One is dedicated to developing two estimators of β; the other is to develop two estimators of α. The proposed estimators are explicit closed-form expressions in terms of the sample observations and can thus be easily computed in practical situations. Furthermore, it can be shown that some of them are asymptotically normally distributed and have the highest breakdown point of 50 %.

The estimators of the parameter β

Since the parameter β is the median of the BS distribution which satisfies F(β)=Φ(0)=0.5, the median is a natural choice as an estimator of β. We denote the sample median estimator of β by
$$\hat\beta^{\mathrm{M}} = \text{median}\left\{t_{1}, t_{2}, \cdots, t_{n}\right\}. $$
The above median estimator is quite robust and has the highest breakdown point of 50 %. From the asymptotic normality of the median estimator, it can be shown that
$$\sqrt{n}\left(\hat\beta^{\mathrm{M}} - \beta\right) \stackrel{D}{\longrightarrow} N \left(0, ~\frac{\pi (\alpha\beta)^{2}}{2}\right). $$
Rieck and Nedelman (1991) investigated the relationship between the BS distribution and the sinh-normal distribution and showed that if a random variable \(T \sim \text{BS}(\alpha, \beta)\), then Y= log(T) has the sinh-normal distribution with the cdf and pdf given by
$$ F(y \mid \mu, \alpha) = \Phi\left[\frac{2}{\alpha}\sinh\left(\frac{y-\mu}{2}\right)\right] $$
(3)
and
$$ f(y\mid \mu, \alpha) = \frac{1}{\alpha\sqrt{2\pi}}\cosh\left(\frac{y-\mu}{2}\right)\exp\left[-\frac{2}{\alpha^{2}}\sinh^{2}\left(\frac{y-\mu}{2}\right)\right], $$
(4)
respectively. This distribution has a shape parameter α and a location parameter μ= log(β), which is just the natural logarithm of the scale parameter of the BS distribution. One appealing advantage of using the above relationship is that we transform the asymmetric BS distribution to the symmetric sinh-normal distribution. It is well known that in the location framework, the Hodges-Lehmann (HL) estimator of Hodges and Lehmann (1963) is more efficient than the sample median for symmetric distributions. Thus, it appears to be more beneficial to adopt the HL estimator for the parameter μ under the sinh-normal distribution, instead of the sample median of the original BS distribution. The HL estimator is defined as the median of the n(n−1)/2 pairwise averages of the observations and can be written as
$$\begin{array}{@{}rcl@{}} \tilde \mu = \text{median}_{i < j}\left\{\frac{y_{i} + y_{j}}{2}\right\}, \end{array} $$
(5)
which can easily be calculated using the hl.loc(·) function in the ICSNP package in R; see R Development Core Team (2011). Also, this estimator has a 29 % breakdown point, which means that it remains consistent even if about 29 % of the data have been contaminated. In addition, as stated by Rousseeuw and Croux (1993), ‘the Hodges-Lehmann estimator might be viewed as a “smooth” version of the median’. It can be shown that this estimator is asymptotically normally distributed, namely,
$$\sqrt{n}\left(\tilde \mu ~-~ \mu\right) \stackrel{D}{\longrightarrow} N\left(0,~~\frac{1}{12\left[\int_{-\infty}^{\infty} f(x)^{2}\,dx\right]^{2}}\right), $$
where f(x) is the pdf of the sinh-normal distribution in (4). Also, it is immediate from the above that the estimator \(\tilde \mu \) converges in probability to μ= log(β), that is, \(\tilde \mu \stackrel {p}{\longrightarrow } \mu \).
Based on the HL estimator in (5), we obtain the back-transformed estimator of the BS parameter β, denoted by \(\hat \beta ^{\text {HL}}\), which is simply given by
$$\begin{array}{@{}rcl@{}} \hat\beta^{\text{HL}} = \exp(\tilde \mu) = \exp\left[\text{median}_{i < j}\left(\frac{y_{i} + y_{j}}{2}\right)\right]. \end{array} $$
(6)

This estimator also has a breakdown point of 29 %. Note that the estimator \(\hat \beta ^{\text {HL}}\) approximately follows the log-normal distribution and that the estimator \(\hat \beta ^{\text {HL}}\) is biased, but consistent for β.
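The two estimators of β can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names `beta_median` and `beta_hl` are hypothetical, and the pairwise averages are taken over i < j as in (5). It stores all n(n−1)/2 averages, which is fine for moderate n.

```python
from itertools import combinations
from math import exp, log
from statistics import median


def beta_median(t):
    """Sample median estimator of beta."""
    return median(t)


def beta_hl(t):
    """Back-transformed Hodges-Lehmann estimator (6) of beta."""
    y = [log(x) for x in t]
    # Pairwise averages (y_i + y_j) / 2 over i < j, as in (5).
    averages = [(a + b) / 2.0 for a, b in combinations(y, 2)]
    return exp(median(averages))


sample = [1.0, 2.0, 4.0]  # toy values, not from the paper
print(beta_median(sample), beta_hl(sample))
```

Because the estimator works on the log scale, multiplying every observation by a constant κ multiplies `beta_hl` by κ, in line with property (iii) of the BS distribution.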

In what follows, we will adopt the above two estimators of β to develop the estimator of α due to their simplicity and computational ease as well as high breakdown points.

The estimators for the parameter α

Here, we propose two robust estimators for the shape parameter α.

Let \(\{t_{1}, t_{2}, \cdots, t_{n}\}\) be a random sample from the BS distribution with the cdf in (1). Note that \(\Phi^{-1}(0.75) - \Phi^{-1}(0.25) = 1.34898\) is the distance between the two quartiles of the standard normal distribution, that is, its interquartile range (IQR). The sample IQR is equal to the difference between the third sample quartile (Q3) and the first sample quartile (Q1). To achieve consistency under the normal distribution, we consider the following estimator of α given by
$$\hat\alpha^{\text{IQR}}_{j} = \frac{\text{IQR}(y_{j})}{1.34898}, \quad j =1, 2. $$
The estimator \(\hat \alpha ^{\text {IQR}}_{j}\) has a simple closed form with a breakdown point of 25 %, indicating that it remains consistent even if about 25 % of the data have been contaminated. Furthermore, by using Bahadur’s representation theorem (Ghosh 1971), we have
$$\sqrt{n}\left(\hat\alpha^{\text{IQR}}_{j} ~-~ \alpha \right) \stackrel{D}{\longrightarrow} N\left(0, ~~c\alpha^{2}\right), \quad j =1, 2, $$
where c is a constant approximately equal to \(2.48/1.349^{2} \approx 1.363\). The quartile estimator \(\hat \alpha ^{\text {IQR}}_{j}\) converges in probability to the parameter α, that is, \(\hat \alpha ^{\text {IQR}}_{j} ~\stackrel {p}{\longrightarrow }~\alpha \) for j=1,2.
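A sketch of the IQR-based estimator is given below. It rests on an assumption that the text does not spell out: \(y_{j}\) is read here as the transformed sample \(\sqrt{t_{i}/\hat\beta_{j}} - \sqrt{\hat\beta_{j}/t_{i}}\), which by (1) is normal with mean zero and standard deviation α when \(\hat\beta_{j} = \beta\), with \(\hat\beta_{1} = \hat\beta^{\mathrm{M}}\) and \(\hat\beta_{2} = \hat\beta^{\text{HL}}\) as suggested by the pairing in Tables 1 and 2. The function name `alpha_iqr` and the quartile rule (the "inclusive" method of Python's `statistics.quantiles`) are likewise choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist, median, quantiles

# IQR of the standard normal distribution, about 1.34898.
IQR_NORMAL = NormalDist().inv_cdf(0.75) - NormalDist().inv_cdf(0.25)


def alpha_iqr(t, beta_hat):
    """IQR-based estimate of alpha, given an estimate beta_hat of beta.

    Assumption: the transformed values y_i below play the role of y_j in
    the text; under the BS model with beta_hat = beta they are N(0, alpha^2).
    """
    y = [sqrt(x / beta_hat) - sqrt(beta_hat / x) for x in t]
    q1, _, q3 = quantiles(y, n=4, method="inclusive")
    return (q3 - q1) / IQR_NORMAL


sample = [0.6, 0.8, 0.9, 1.0, 1.1, 1.3, 1.7]  # toy values, not from the paper
print(alpha_iqr(sample, median(sample)))
```

Other quartile conventions give slightly different values in small samples, so results need not match a particular software package to the last digit.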
We observe from Eq. (3) that by letting Y= log(T), we change the estimation problem of the shape and scale parameters to that of the location and scale parameters. We advocate the use of the Rousseeuw and Croux (1993) estimator (RC for short) for the estimation of α. The RC estimator is given by
$$ \hat\alpha^{\text{RC}}_{j} = b~ \left\{| \log t_{i} - \log t_{j}|; ~i<j\right\}_{(k)}, $$
where b is a constant factor equal to \(1/(\sqrt {2}\Phi ^{-1}(5/8)) \approx 2.2191\), chosen to achieve consistency for the standard deviation in the case of the normal distribution, \(\{\cdot\}_{(k)}\) denotes the kth order statistic of the pairwise distances, and \(k = {h \choose 2} \approx {n \choose 2} /4\) with \(h = \lfloor n/2 \rfloor + 1\) being roughly half the number of observations. The RC estimator is an explicit expression and can be easily calculated using the Qn(·) function in the robustbase package in R. In addition, it has the highest breakdown point of 50 % and a high efficiency of about 82 % under the normal distribution. An analogous scale estimator, the Shamos estimator proposed by Shamos (1976) and studied by Bickel and Lehmann (1976), can also be adopted for α. However, the Shamos estimator has only a 29 % breakdown point, although its efficiency of 86 % is slightly higher than that of the RC estimator. Readers interested in full details of the Shamos estimator should refer to these papers.
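The RC formula can be illustrated with a naive computation that sorts all pairwise distances. The sketch below applies the displayed formula to the log-lifetimes exactly as printed; it is an assumption of this example that no further finite-sample correction is applied (library implementations such as Qn in robustbase may include one), and the function name `alpha_rc` is hypothetical.

```python
from itertools import combinations
from math import comb, e, log, sqrt
from statistics import NormalDist

# Consistency factor b = 1 / (sqrt(2) * Phi^{-1}(5/8)).
B = 1.0 / (sqrt(2.0) * NormalDist().inv_cdf(5.0 / 8.0))


def alpha_rc(t):
    """RC-type estimate: b times the k-th smallest |log t_i - log t_j|, i < j."""
    n = len(t)
    h = n // 2 + 1          # roughly half the number of observations
    k = comb(h, 2)          # index of the order statistic
    y = [log(x) for x in t]
    distances = sorted(abs(a - b) for a, b in combinations(y, 2))
    return B * distances[k - 1]


# Toy sample with log-values 0, 1, 3: distances are 1, 2, 3 and k = 1.
print(B, alpha_rc([1.0, e, e ** 3]))
```

This version needs O(n²) memory and time; it is meant to make the definition concrete rather than to be efficient. Since only differences of logarithms enter, the estimate does not change when all observations are multiplied by a constant.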

Monte Carlo simulations

To illustrate the proposed methods, we conduct two Monte Carlo simulation studies: one without contamination and the other involving different kinds of contamination.

Numerical results without contamination

We carry out Monte Carlo simulations to compare the performances of the estimators under consideration. We take the sample size n=10,50,100, and the shape parameter α=0.5,1.0,2.0. Since β is the scale parameter, its value was kept fixed at β=1.0, without loss of any generality.

Inverting the cdf of the BS distribution in (1), that is, solving \(F(T \mid \alpha, \beta) = p\) for T, we obtain the following inverse cdf of the form
$$T = F^{-1}(p) = \frac{1}{4}\left(\alpha\sqrt{\beta}\Phi^{-1}(p) + \sqrt{\alpha^{2}\beta\left(\Phi^{-1}(p)\right)^{2}+4\beta}\right)^{2}. $$
Hence, random numbers following the BS distribution can be generated by using a direct inverse method, that is, \(T = F^{-1}(U)\) with \(U \sim \text{Uniform}(0, 1)\). Each case is replicated 10,000 times. In each simulation, we compute the average bias of each estimator under consideration along with the corresponding square root of the mean square error (RMSE) given by
$$\text{RMSE}_{\alpha} = \sqrt{\frac{1}{M}\sum_{i=1}^{M}\left(\hat\alpha_{i} - \alpha_{T}\right)^{2}} \quad \text{and} \quad \text{RMSE}_{\beta} = \sqrt{\frac{1}{M}\sum_{i=1}^{M}\left(\hat\beta_{i} - \beta_{T}\right)^{2}}, $$
where M is the number of replications and \(\alpha_{T}\) and \(\beta_{T}\) denote the true parameter values.
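The inverse-cdf sampler and the bias and RMSE summaries can be sketched as follows. This is a small illustration with the sample median as the estimator of β; the function names, the seed, and the reduced number of replications (2,000 rather than 10,000) are choices made for this example only.

```python
import random
from math import sqrt
from statistics import NormalDist, median

STD_NORMAL = NormalDist()


def bs_quantile(p, alpha, beta):
    """Inverse cdf of the BS distribution at probability p in (0, 1)."""
    z = STD_NORMAL.inv_cdf(p)
    root = sqrt(alpha ** 2 * beta * z ** 2 + 4.0 * beta)
    return 0.25 * (alpha * sqrt(beta) * z + root) ** 2


def bs_sample(n, alpha, beta, rng):
    """Draw n BS variates by the direct inverse method T = F^{-1}(U)."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # random() lies in [0, 1); skip the endpoint 0
            out.append(bs_quantile(u, alpha, beta))
    return out


def bias_rmse(estimates, truth):
    """Average bias and root mean square error of a list of estimates."""
    m = len(estimates)
    bias = sum(estimates) / m - truth
    rmse = sqrt(sum((est - truth) ** 2 for est in estimates) / m)
    return bias, rmse


def simulate_median(reps, n, alpha, beta, seed=1):
    """Monte Carlo bias and RMSE of the sample median as an estimator of beta."""
    rng = random.Random(seed)
    estimates = [median(bs_sample(n, alpha, beta, rng)) for _ in range(reps)]
    return bias_rmse(estimates, beta)


print(simulate_median(2000, 50, 0.5, 1.0))
```

The RMSE obtained this way can be compared with the asymptotic standard deviation \(\sqrt{\pi/2}\,\alpha\beta/\sqrt{n}\) of the median estimator given in Section 2.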
The results based on \(\hat \beta ^{\mathrm {M}}\) and \(\hat \beta ^{\text {HL}}\) are presented in Tables 1 and 2, respectively. Several conclusions from the numerical results can be drawn as follows.
(i) The average bias and RMSE of all the estimators decrease substantially as n increases. As expected, for large sample sizes, the performance of the proposed estimators and that of the MLEs are very close in terms of the average bias and RMSE.

(ii) The RC estimator \(\hat \alpha ^{\text {RC}}_{j}\) for j=1,2 outperforms the others in terms of the average bias for all the considered cases. Note also that estimation of α using the HL estimator \(\hat \beta ^{\text {HL}}\) is superior to that using the median estimator \(\hat \beta ^{\mathrm {M}}\) in terms of the RMSE.

(iii) For the cases without contamination, the estimator \(\hat \beta ^{\text {MLE}}\) performs the best as expected. We observe that the performance of the HL estimator \(\hat \beta ^{\text {HL}}\) is better than that of the median estimator \(\hat \beta ^{\mathrm {M}}\). All the estimators of β become closer together as n increases.

Table 1

Average bias and RMSE (in parentheses) of estimates for α and β

The first three estimator columns are estimators of α; the last two are estimators of β.

| n | α | \(\hat \alpha ^{\text {MLE}}\) | \(\hat \alpha ^{\text {IQR}}_{1}\) | \(\hat \alpha ^{\text {RC}}_{1}\) | \(\hat \beta ^{\text {MLE}}\) | \(\hat \beta ^{\mathrm {M}}\) |
|---|---|---|---|---|---|---|
| 10 | 0.5 | −0.039 | −0.065 | −0.000 | −0.012 | −0.022 |
| | | (0.117) | (0.172) | (0.150) | (0.156) | (0.192) |
| | 1.0 | −0.087 | −0.136 | −0.017 | −0.040 | −0.080 |
| | | (0.235) | (0.343) | (0.298) | (0.299) | (0.412) |
| | 2.0 | −0.186 | −0.273 | −0.088 | −0.105 | −0.331 |
| | | (0.481) | (0.726) | (0.597) | (0.534) | (1.065) |
| 50 | 0.5 | −0.007 | −0.015 | −0.000 | −0.002 | −0.003 |
| | | (0.051) | (0.081) | (0.059) | (0.068) | (0.086) |
| | 1.0 | −0.017 | −0.029 | −0.003 | −0.008 | −0.017 |
| | | (0.101) | (0.163) | (0.116) | (0.128) | (0.179) |
| | 2.0 | −0.035 | −0.060 | −0.021 | −0.018 | −0.060 |
| | | (0.206) | (0.334) | (0.238) | (0.195) | (0.379) |
| 100 | 0.5 | −0.004 | −0.007 | −0.001 | −0.001 | −0.002 |
| | | (0.036) | (0.058) | (0.041) | (0.048) | (0.063) |
| | 1.0 | −0.008 | −0.014 | −0.001 | −0.004 | −0.007 |
| | | (0.070) | (0.116) | (0.080) | (0.088) | (0.124) |
| | 2.0 | −0.019 | −0.028 | −0.011 | −0.010 | −0.033 |
| | | (0.144) | (0.234) | (0.163) | (0.135) | (0.260) |

Table 2

Average bias and RMSE (in parentheses) of estimates for α and β

The first three estimator columns are estimators of α; the last two are estimators of β.

| n | α | \(\hat \alpha ^{\text {MLE}}\) | \(\hat \alpha ^{\text {IQR}}_{2}\) | \(\hat \alpha ^{\text {RC}}_{2}\) | \(\hat \beta ^{\text {MLE}}\) | \(\hat \beta ^{\text {HL}}\) |
|---|---|---|---|---|---|---|
| 10 | 0.5 | −0.039 | −0.066 | −0.000 | −0.012 | −0.014 |
| | | (0.117) | (0.171) | (0.149) | (0.156) | (0.164) |
| | 1.0 | −0.087 | −0.141 | −0.017 | −0.040 | −0.048 |
| | | (0.235) | (0.340) | (0.297) | (0.299) | (0.332) |
| | 2.0 | −0.186 | −0.314 | −0.078 | −0.105 | −0.165 |
| | | (0.481) | (0.700) | (0.595) | (0.534) | (0.714) |
| 50 | 0.5 | −0.007 | −0.015 | −0.000 | −0.002 | −0.002 |
| | | (0.051) | (0.081) | (0.059) | (0.068) | (0.071) |
| | 1.0 | −0.017 | −0.031 | −0.003 | −0.008 | −0.010 |
| | | (0.101) | (0.162) | (0.116) | (0.128) | (0.139) |
| | 2.0 | −0.035 | −0.071 | −0.018 | −0.018 | −0.029 |
| | | (0.206) | (0.330) | (0.238) | (0.195) | (0.256) |
| 100 | 0.5 | −0.004 | −0.004 | −0.000 | −0.001 | −0.001 |
| | | (0.036) | (0.058) | (0.041) | (0.048) | (0.051) |
| | 1.0 | −0.008 | −0.015 | −0.001 | −0.004 | −0.004 |
| | | (0.070) | (0.115) | (0.080) | (0.088) | (0.096) |
| | 2.0 | −0.019 | −0.034 | −0.009 | −0.010 | −0.016 |
| | | (0.144) | (0.233) | (0.163) | (0.135) | (0.176) |

     

Numerical results with contamination

In practice, there is no guarantee that the collected data exactly follow the BS distribution, especially when considering fatigue data. Specifically, data contamination may occur for several reasons, such as measurement errors or departure from the true model. Accordingly, it becomes necessary to investigate the performance of the estimators under a scenario in which the data are contaminated with outliers. Following Dupuis and Mills (1998), we compare the performance of the proposed estimators with that of the MLEs based on the following four possible models.
Model 1: A model with no contamination.

Model 2: A model with 5 % severe contamination; the upper 5 % of the order statistics are multiplied by 5.

Model 3: A model with 5 % severe contamination; the lower 5 % of the order statistics are multiplied by 1/5.

Model 4: A model with 5 % more extreme contamination from a point mass distribution at 50.
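The four contamination models can be mimicked by a small helper that modifies a clean sample. This is a sketch under stated assumptions: the function name `contaminate` is hypothetical, the number of contaminated points is taken as 5 % of the sample size rounded to the nearest integer, and for Model 4 the first 5 % of the observations in their original (random) order are replaced by the value 50, which is one possible reading of "contamination from a point mass at 50".

```python
from statistics import median


def contaminate(sample, model, frac=0.05):
    """Return a contaminated copy of `sample` according to Models 1 to 4."""
    x = list(sample)
    m = int(round(frac * len(x)))  # number of contaminated observations
    if model == 1 or m == 0:
        return x
    if model == 4:
        # Assumption: replace the first m observations (in sampling order).
        return [50.0] * m + x[m:]
    x.sort()
    if model == 2:
        # Upper 5 % of the order statistics multiplied by 5.
        return x[:-m] + [5.0 * v for v in x[-m:]]
    if model == 3:
        # Lower 5 % of the order statistics multiplied by 1/5.
        return [v / 5.0 for v in x[:m]] + x[m:]
    raise ValueError("model must be 1, 2, 3 or 4")


clean = [i / 100.0 for i in range(1, 101)]  # toy sample, not BS data
for model in (1, 2, 3, 4):
    bad = contaminate(clean, model)
    print(model, min(bad), max(bad), median(bad))
```

Models 2 and 3 leave the sample median untouched, because only the extreme order statistics are rescaled, which is consistent with the robustness of the median estimator discussed in Section 2.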

     
The reference distribution is the BS distribution with the parameters α=0.5 and β=1. We generate 10,000 samples of size n=100 according to the above four scenarios and then calculate the average bias and RMSE of each estimator. The results are shown in Table 3. Some conclusions can be drawn as follows.
Table 3

Average bias and RMSE (in parentheses) of estimates for α and β with n=100 under the four models

The first five estimator columns are estimators of α; the last three are estimators of β.

| | \(\hat \alpha ^{\text {MLE}}\) | \(\hat \alpha ^{\text {IQR}}_{1}\) | \(\hat \alpha ^{\text {RC}}_{1}\) | \(\hat \alpha ^{\text {IQR}}_{2}\) | \(\hat \alpha ^{\text {RC}}_{2}\) | \(\hat \beta ^{\text {MLE}}\) | \(\hat \beta ^{\mathrm {M}}\) | \(\hat \beta ^{\text {HL}}\) |
|---|---|---|---|---|---|---|---|---|
| Model 1 | −0.0038 | −0.0068 | −0.0001 | −0.0040 | −0.0001 | 0.0013 | 0.0024 | 0.0014 |
| | (0.0353) | (0.0583) | (0.0405) | (0.0445) | (0.0405) | (0.0483) | (0.0625) | (0.0504) |
| Model 2 | 0.3424 | −0.0068 | 0.0075 | 0.9484 | 0.0075 | 0.2512 | 0.0024 | 0.0040 |
| | (0.3452) | (0.9570) | (0.3742) | (0.0583) | (0.0426) | (0.2593) | (0.0625) | (0.0511) |
| Model 3 | 0.3423 | −0.0068 | 0.0075 | 0.0139 | 0.0075 | −0.1983 | 0.0024 | −0.0011 |
| | (0.3451) | (0.0583) | (0.0426) | (0.0448) | (0.0426) | (0.2026) | (0.0625) | (0.0507) |
| Model 4 | 0.8931 | 0.0253 | 0.0523 | 2.4594 | 0.0523 | 1.0944 | 0.0364 | 0.0501 |
| | (0.8936) | (0.0659) | (0.0693) | (2.4607) | (0.0693) | (1.0957) | (0.0751) | (0.0736) |

(i) For Model 1, that is, when there is no contamination, the RC estimator \(\hat \alpha ^{\text {RC}}_{j}\) for j=1,2 performs the best for estimating α in terms of the average bias; the MLE \(\hat \beta ^{\text {MLE}}\) is the best one for estimating β, but the HL estimator \(\hat \beta ^{\text {HL}}\) behaves very similarly.

(ii) For Models 2, 3, and 4, that is, when contamination is present in the dataset, we observe that contamination has a large influence on the average bias and RMSE of the non-robust estimators including the MLEs, especially in the presence of extreme outliers such as in the fourth scenario, whereas it has a smaller impact on the proposed estimators.

(iii) For the scale parameter estimation, the HL estimator \(\hat \beta ^{\text {HL}}\) outperforms the median estimator in terms of the RMSE, although both are quite robust against data contamination.

     

Which of the two robust estimators, the HL estimator or the median estimator, is preferable for the parameter β in the analysis of real lifetime data? Numerical results show that for all the cases considered in this paper, the HL estimator \(\hat \beta ^{\text {HL}}\) outperforms the median estimator \(\hat \beta ^{\mathrm {M}}\) in terms of the RMSE. Additionally, the estimator of α developed based on \(\hat \beta ^{\text {HL}}\) also slightly outperforms the one using \(\hat \beta ^{\mathrm {M}}\) in most cases. We thus recommend the HL estimator for β. It should be mentioned that simulations with several other values of the parameter α and different sample sizes have also been conducted; the conclusions are quite similar and are thus not provided here for brevity.

An illustrative example

We illustrate the practical application of the proposed estimators using a real data example. The dataset from Birnbaum and Saunders (1969b) consists of the fatigue lifetimes of 6061-T6 aluminum coupons cut parallel to the direction of rolling and oscillated at 18 cycles per second. It contains 101 observations with maximum stress per cycle 31,000 psi and is presented in Table 4.
Table 4

Fatigue lifetime data by Birnbaum and Saunders (1969b)

70  90  96  97  99  100  103  104  104  105
107  108  108  108  109  109  112  112  113  114
114  114  116  119  120  120  120  121  121  123
124  124  124  124  124  128  128  129  129  130
130  130  131  131  131  131  131  132  132  132
133  134  134  134  134  134  136  136  137  138
138  138  139  139  141  141  142  142  142  142
142  142  144  144  145  146  148  148  149  151
151  152  155  156  157  157  157  157  158  159
162  163  163  164  166  166  168  170  174  196
212

 
The parameter estimates of α and β by all the methods under consideration are presented in Table 5. As mentioned in Section 3, we prefer the HL estimator \(\hat \beta ^{\text {HL}}\) for β, and thus we analyze only the results based on this estimator for simplicity. It can be seen from Table 5 that in the case of no contamination, most of the proposed estimators are in good agreement with the MLEs, whereas the estimators \(\left (\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) differ slightly from them.
Table 5

Comparison between the developed estimators and the MLE through fatigue lifetime data by Birnbaum and Saunders (1969b)

| Method | α (Table 4 data) | β (Table 4 data) | α (misrecorded data) | β (misrecorded data) |
|---|---|---|---|---|
| \((\hat \alpha ^{\text {MLE}}, ~\hat \beta ^{\text {MLE}})\) | 0.1704 | 131.8188 | 0.2415 | 134.7689 |
| \((\hat \alpha ^{\text {IQR}}_{1}, ~\hat \beta ^{\mathrm {M}})\) | 0.1454 | 133.0000 | 0.1555 | 134.0000 |
| \((\hat \alpha ^{\text {RC}}_{1}, ~\hat \beta ^{\mathrm {M}})\) | 0.1601 | 133.0000 | 0.1677 | 134.0000 |
| \((\hat \alpha ^{\text {IQR}}_{2}, ~\hat \beta ^{\text {HL}})\) | 0.1454 | 132.6047 | 0.1555 | 132.8834 |
| \((\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}})\) | 0.1601 | 132.6047 | 0.1677 | 132.8834 |

To evaluate the robustness of the proposed methods, we follow the same scenario as Dupuis and Mills (1998) and assume that the 51st observation \(t_{51}\) was misrecorded as 633 instead of 133. It is desirable that the estimated shape and scale parameters be very similar under the two scenarios, because we already know that the observation \(t_{51}\) is a recording error. However, Table 5 shows that the MLEs are heavily distorted by this single outlier, resulting in \(\hat \alpha ^{\text {MLE}} = 0.2415\) and \(\hat \beta ^{\text {MLE}} = 134.7689\), far from the MLEs with \(t_{51}=133\), whereas the proposed robust estimators \(\left (\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) and \(\left (\hat \alpha ^{\text {IQR}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) still provide more reasonable results, which are quite close to the estimated values with \(t_{51}=133\).
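The single-outlier experiment can be reproduced approximately with the standard library alone. The sketch below is an illustration under stated assumptions rather than the authors' code: the MLE is found by bisection of Eq. (2) on (r, s), the HL estimator uses pairwise averages over i < j, and the names `DATA`, `bs_mle`, `beta_hl` and `with_t51` are hypothetical.

```python
from itertools import combinations
from math import exp, log, sqrt
from statistics import median

# Fatigue lifetime data of Table 4 (101 observations).
DATA = [
    70, 90, 96, 97, 99, 100, 103, 104, 104, 105,
    107, 108, 108, 108, 109, 109, 112, 112, 113, 114,
    114, 114, 116, 119, 120, 120, 120, 121, 121, 123,
    124, 124, 124, 124, 124, 128, 128, 129, 129, 130,
    130, 130, 131, 131, 131, 131, 131, 132, 132, 132,
    133, 134, 134, 134, 134, 134, 136, 136, 137, 138,
    138, 138, 139, 139, 141, 141, 142, 142, 142, 142,
    142, 142, 144, 144, 145, 146, 148, 148, 149, 151,
    151, 152, 155, 156, 157, 157, 157, 157, 158, 159,
    162, 163, 163, 164, 166, 166, 168, 170, 174, 196,
    212,
]


def bs_mle(t, iterations=200):
    """MLEs (alpha, beta): bisection of Eq. (2) on the interval (r, s)."""
    n = len(t)
    s = sum(t) / n
    r = n / sum(1.0 / x for x in t)

    def g(b):
        hb = n / sum(1.0 / (b + ti) for ti in t)
        return b * b - b * (2.0 * r + hb) + r * (s + hb)

    lo, hi = r, s
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0.0 else (lo, mid)
    beta = 0.5 * (lo + hi)
    return sqrt(max(s / beta + beta / r - 2.0, 0.0)), beta


def beta_hl(t):
    """Back-transformed Hodges-Lehmann estimator of beta."""
    y = [log(x) for x in t]
    return exp(median((a + b) / 2.0 for a, b in combinations(y, 2)))


def with_t51(value):
    """Copy of DATA with the 51st observation replaced by `value`."""
    x = list(DATA)
    x[50] = value
    return x


for label, sample in (("t51 = 133", DATA), ("t51 = 633", with_t51(633))):
    alpha_mle, beta_mle = bs_mle(sample)
    print(label, round(alpha_mle, 4), round(beta_mle, 4),
          median(sample), round(beta_hl(sample), 4))
```

The printed values can be compared with the MLE, median and HL entries of Table 5; small differences in the last digits are possible if a different convention for the pairwise averages is used.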

In Fig. 1, we plot the estimated parameters for the same data, except that the 51st observation \(t_{51}\) is replaced by a range of values between 1 and 700. We observe that changing the value of \(t_{51}\) has a large impact on the behavior of the MLEs, whereas it has little influence on the proposed robust estimators. As expected, the proposed estimators \(\left (\hat \alpha ^{\text {IQR}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) and \(\left (\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) have some built-in protection against a certain amount of deviation due to data contamination or measurement errors. In conclusion, the performance of all the proposed estimators is quite satisfactory.
Fig. 1

Estimates for the BS parameters for the fatigue life data in Birnbaum and Saunders (1969b), where the 51st observation is replaced by \(t_{51} = 1, 2, \ldots, 700\)

Concluding remarks

In this paper, we have developed two families of estimators for the BS distribution, which are quite robust to data contamination. Unlike the MLEs, these estimators have simple closed-form expressions and higher breakdown points. For estimation of β, we prefer the HL estimator \(\hat \beta ^{\text {HL}}\), because numerical results show that it is more accurate than the median estimator \(\hat \beta ^{\mathrm {M}}\). Of all the considered estimators for α using the estimator \(\hat \beta ^{\text {HL}}\), we recommend the RC estimator \(\hat \alpha ^{\text {RC}}_{2}\), since it offers a good trade-off between efficiency and robustness. It deserves to be mentioned that the other proposed estimators of α are also attractive alternatives to the MLE in that they are highly efficient when the underlying model is true.

In summary, we prefer the RC and HL estimators \(\left (\hat \alpha ^{\text {RC}}_{2}, ~\hat \beta ^{\text {HL}}\right)\) for estimating (α, β), because they have been shown to be simple, very effective, and quite robust against model departure that often occurs in many practical situations. Note that censored data occur commonly in field data from reliability tests, so a possible extension of the proposed estimators to censored data will be investigated in the future.

Declarations

Acknowledgements

The authors thank the Editor Carl Lee and the two anonymous reviewers for their comments which have improved the appearance of this paper.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

(1)
Department of Mathematical Sciences, Michigan Technological University
(2)
Department of Industrial Engineering, Pusan National University
(3)
Department of Mathematical Sciences, Clemson University

References

  1. Achcar, JA: Inferences for the Birnbaum-Saunders fatigue life model using Bayesian methods. Comput. Statist. Data Anal. 15, 367–380 (1993).
  2. Agostinelli, C, Marazzi, A, Yohai, VJ: Robust estimators of the generalized loggamma distribution. Technometrics. 56, 92–101 (2014).
  3. Bhattacharyya, G, Fries, A: Fatigue failure models - Birnbaum-Saunders vs. inverse Gaussian. Reliability IEEE Trans. 31, 439–441 (1982).
  4. Bickel, PJ, Lehmann, EL: Descriptive statistics for non-parametric models III: Dispersion. Ann. Statist. 4, 1139–1158 (1976).
  5. Birnbaum, ZW, Saunders, SC: A new family of life distributions. J. Appl. Probability. 6, 319–327 (1969a).
  6. Birnbaum, ZW, Saunders, SC: Estimation for a family of life distributions with applications to fatigue. J. Appl. Probability. 6, 328–347 (1969b).
  7. Boudt, K, Caliskan, D, Croux, C: Robust explicit estimators of Weibull parameters. Metrika. 73, 187–209 (2011).
  8. Dupuis, D, Mills, J: Robust estimation of the Birnbaum-Saunders distribution. IEEE Trans. Reliab. 47, 88–95 (1998).
  9. Engelhardt, M, Bain, LJ, Wright, FT: Inferences on the parameters of the Birnbaum-Saunders fatigue life distribution based on maximum likelihood estimation. Technometrics. 23, 251–256 (1981).
  10. Ghosh, JK: A new proof of the Bahadur representation of quantiles and an application. Ann. Math. Statist. 42, 1957–1961 (1971).
  11. Hodges, JL Jr, Lehmann, EL: Estimates of location based on rank tests. Ann. Math. Statist. 34, 598–611 (1963).
  12. Lawson, C, Keats, J, Montgomery, D: Comparison of robust and least-squares regression in computer-generated probability plots. Reliability IEEE Trans. 46, 108–115 (1997).
  13. Lio, YL, Park, C: A bootstrap control chart for Birnbaum-Saunders percentiles. Qual. Reliability Eng Int. 24, 585–600 (2008).
  14. Ng, HKT, Kundu, D, Balakrishnan, N: Modified moment estimation for the two-parameter Birnbaum-Saunders distribution. Comput. Statist. Data Anal. 43, 283–298 (2003).
  15. Park, C, Padgett, W: Stochastic degradation models with several accelerating variables. Reliability, IEEE Trans. 55, 379–390 (2006).
  16. R Development Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2011). ISBN 3-900051-07-0.
  17. Rieck, JR, Nedelman, JR: A log-linear model for the Birnbaum-Saunders distribution. Technometrics. 33, 51–60 (1991).
  18. Rousseeuw, PJ, Croux, C: Alternatives to the median absolute deviation. J. Amer. Statist. Assoc. 88, 1273–1283 (1993).
  19. Saunders, SC: A family of random variables closed under reciprocation. J. Amer. Statist. Assoc. 69, 533–539 (1974).
  20. Shamos, MI: Geometry and statistics: problems at the interface. In Algorithms and complexity (Proc. Sympos., Carnegie-Mellon Univ., Pittsburgh, Pa., 1976). Academic Press, New York (1976).
  21. Wang, M, Zhao, J, Sun, X, Park, C: Robust explicit estimation of the two-parameter Birnbaum-Saunders distribution. J Appl. Stat. 40, 2259–2274 (2013).

Copyright

© Wang et al. 2015