Tolerance intervals in statistical software and robustness under model misspecification

A tolerance interval is a statistical interval that covers at least 100ρ% of the population of interest with 100(1−α)% confidence, where ρ and α are pre-specified values in (0, 1). In many scientific fields, such as pharmaceutical sciences, manufacturing processes, clinical sciences, and environmental sciences, tolerance intervals are used for statistical inference and quality control. Despite their usefulness, procedures to compute tolerance intervals are not commonly implemented in statistical software packages. This paper provides a comparative study of the computational procedures for tolerance intervals in some commonly used statistical software packages, including JMP, Minitab, NCSS, Python, R, and SAS. We also investigate the effect of misspecifying the underlying probability model on the performance of tolerance intervals. Using a Monte Carlo simulation study, we examine the performance of tolerance intervals both when the assumed distribution is the same as the true underlying distribution and when it is different. We also propose a robust model selection approach to obtain tolerance intervals that are relatively insensitive to model misspecification. We show that the proposed approach performs well when the underlying distribution is unknown but candidate distributions are available.


Introduction
There are three types of statistical intervals commonly used in practice: confidence intervals, prediction intervals, and tolerance intervals. A confidence interval provides a range of values that is likely to include an unknown parameter with a specified degree of confidence, 100(1 − α)%, based upon a random sample. A prediction interval is an interval within which a single future observation, or multiple future observations, from a population will fall with a specified degree of confidence, 100(1 − α)%. A tolerance interval covers at least a specified proportion, ρ (0 < ρ < 1), of the population with a specified degree of confidence, 100(1 − α)% with 0 < α < 1 (Hahn and Meeker 1991). It can be interpreted as follows: we are 100(1 − α)% confident that at least 100ρ% of the population will be within the interval. Such an interval is denoted as a [100(1 − α)%]/[100ρ%] tolerance interval. For example, a quality engineer at a light bulb manufacturer needs to evaluate the life spans of light bulbs. The engineer randomly collects a sample of 100 light bulbs and records the times to failure. The engineer wants to calculate a 95%/99% lower tolerance bound, which is the burn time that at least 99% of all light bulbs exceed with 95% confidence. Suppose the lower tolerance bound based on a normal distribution is 1085.947; the engineer can then claim that at least 99% of all the light bulbs exceed approximately 1086 hours of burn time with 95% confidence (Minitab 18 Statistical Software 2017). Tolerance intervals are of particular interest in setting limits on the process capability for a product manufactured in large quantities (Hahn and Meeker 1991). Therefore, the tolerance interval is widely used in statistical quality control. Despite the usefulness of tolerance intervals, the computation of tolerance intervals based on different distributional assumptions is not commonly implemented in statistical software packages.
We found that only a few commonly used statistical software packages, such as Minitab (Minitab 18 Statistical Software 2017), R (R Core Team 2020) and SAS (SAS Institute Inc 2014), provide computational procedures for tolerance intervals. The objective of this paper is twofold. First, we aim to compare different commonly used statistical software packages that offer procedures to compute tolerance intervals. Second, we evaluate the performance of tolerance intervals under model uncertainty and propose a robust model selection approach to compute the tolerance intervals.
The rest of this paper is organized as follows. In Section 2, we provide the notation for tolerance intervals and introduce the computation procedures available in commonly used statistical software packages. In Section 3, we evaluate the performance of tolerance intervals under model misspecification. In Section 4, we propose a model selection approach when the underlying probability model is unknown but some candidate models are available. Finally, in Section 5, some concluding remarks and future research directions are provided.

Basics of tolerance intervals
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a probability model with probability density function (PDF) $f(x; \theta)$ and cumulative distribution function (CDF) $F(x; \theta)$, where $\theta$ is the vector of parameters. We denote the observed values of $X_1, X_2, \ldots, X_n$ as $x_1, x_2, \ldots, x_n$. In the case that the population mean $\mu$ and population standard deviation $\sigma$ are unknown, these parameters are estimated by the sample mean and sample standard deviation,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad \text{and} \quad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2},$$

respectively. For example, for normally distributed data, a [100(1 − α)%]/[100ρ%] tolerance interval has the form $\bar{x} \pm ks$, where $k$ is the tolerance factor, $(1 - \alpha) \in (0, 1)$ is the confidence level and $\rho \in (0, 1)$ is the population proportion of interest. Usually, the exact value of $k$ for given values of $\alpha$ and $\rho$ is not easy to compute (with the one-sided normal setting being an exception); therefore, most tolerance intervals are calculated based on approximation methods (Young 2010). We define

$$C(L, U; \theta) = F(U; \theta) - F(L; \theta)$$

as the coverage of a two-sided interval $[L, U]$, where $L$ and $U$ are statistics computed from the sample. Then, a [100(1 − α)%]/[100ρ%] two-sided tolerance interval $[L, U]$ satisfies

$$\Pr[C(L, U; \theta) \geq \rho] = 1 - \alpha. \tag{1}$$

Similarly, a [100(1 − α)%]/[100ρ%] upper one-sided tolerance interval $[L, \infty)$ satisfies

$$\Pr[1 - F(L; \theta) \geq \rho] = 1 - \alpha, \tag{2}$$

and a [100(1 − α)%]/[100ρ%] lower one-sided tolerance interval $(-\infty, U]$ satisfies

$$\Pr[F(U; \theta) \geq \rho] = 1 - \alpha. \tag{3}$$

For specific values of $\alpha$ and $\rho$, the two-sided, upper one-sided, and lower one-sided tolerance intervals can be obtained by finding the values of $L$ and $U$ that satisfy Eqs. (1), (2), and (3), respectively, for a specified underlying distribution.
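To make the notion of coverage concrete, the following minimal Python sketch evaluates the content of a fixed interval [L, U] as F(U) − F(L) under a known CDF. The function name `coverage` and the choice of a standard normal population are illustrative assumptions, not part of any of the software packages discussed in this paper.

```python
from statistics import NormalDist


def coverage(lower, upper, cdf):
    """Content of the interval [lower, upper] under the CDF `cdf`: F(U) - F(L)."""
    return cdf(upper) - cdf(lower)


# Illustrative population: standard normal (an assumption for this example).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Content of two fixed intervals under the standard normal population.
print(round(coverage(-1.96, 1.96, std_normal.cdf), 4))
print(round(coverage(-1.00, 1.00, std_normal.cdf), 4))
```

In a tolerance interval, L and U are random because they are computed from the sample, so the coverage is itself a random variable; the confidence statement concerns the probability that this random coverage is at least ρ.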
To construct the tolerance interval, instead of assuming that the data come from a particular parametric model, one can obtain a nonparametric tolerance interval based on order statistics (see, for example, Section 7.2 of David and Nagaraja (2003)). Specifically, the lower and upper nonparametric [100(1 − α)%]/[100ρ%] tolerance limits are $L = x_{r:n}$ and $U = x_{s:n}$, where $x_{j:n}$ is the $j$-th order statistic of the random sample $x_1, x_2, \ldots, x_n$ and the values of $r$ and $s$ ($r < s$) are chosen to satisfy Eq. (1) (Hahn and Meeker 1991; David and Nagaraja 2003).
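As an illustration of how r and s can be checked against Eq. (1), the sketch below uses the standard result that, for a continuous population, the coverage of [x_{r:n}, x_{s:n}] follows a Beta(s − r, n − s + r + 1) distribution, so the confidence level can be written as a binomial sum. The function names are hypothetical, and the code is a sketch under this assumption rather than the procedure used by any particular package.

```python
from math import comb


def np_confidence(n, r, s, rho):
    """Pr[coverage of (x_{r:n}, x_{s:n}) >= rho] for a continuous population.

    Uses Pr[Beta(s - r, n - s + r + 1) >= rho] = Pr[Binomial(n, rho) <= s - r - 1].
    """
    return sum(comb(n, j) * rho**j * (1.0 - rho) ** (n - j) for j in range(s - r))


def min_sample_size(rho, alpha, n_max=10_000):
    """Smallest n for which (x_{1:n}, x_{n:n}) is a 100(1-alpha)%/100rho% interval."""
    for n in range(2, n_max + 1):
        if np_confidence(n, 1, n, rho) >= 1.0 - alpha:
            return n
    raise ValueError("no sample size up to n_max achieves the requested confidence")


# Sample size needed for the sample range to serve as a 95%/95% interval.
print(min_sample_size(rho=0.95, alpha=0.05))
```

When n is too small for the requested ρ and α, no choice of r and s satisfies Eq. (1), and the interval formed by the sample minimum and maximum falls short of the nominal confidence level.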

Parametric tolerance intervals for some particular distributions
To illustrate the calculation of the tolerance intervals based on different distributions, we consider four symmetric distributions with location and scale parameters: normal (Gaussian), Cauchy, logistic, and Laplace distributions; and three two-parameter skewed (asymmetric) distributions with shape and scale parameters: gamma, Weibull, and lognormal distributions. The functional form of these seven distributions and the corresponding computational formulas for the tolerance intervals based on these distributions are presented in the following. For more details of the computation of tolerance intervals based on different distributions, one may refer to Young (2014).
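(1) Normal distribution: The PDF and CDF of a normal distribution with location parameter $\mu_N$ and scale parameter $\sigma_N$ are, respectively,

$$f_N(x; \mu_N, \sigma_N) = \frac{1}{\sigma_N \sqrt{2\pi}} \exp\left[-\frac{(x - \mu_N)^2}{2\sigma_N^2}\right] \tag{4}$$

and

$$F_N(x; \mu_N, \sigma_N) = \int_{-\infty}^{x} f_N(t; \mu_N, \sigma_N)\, dt, \tag{5}$$

where $-\infty < x < \infty$, $-\infty < \mu_N < \infty$ and $\sigma_N > 0$. Based on a random sample of size $n$, $X_1, X_2, \ldots, X_n$, from the normal distribution with PDF and CDF in Eqs. (4) and (5), respectively, suppose $\hat{\mu}_N$ and $\hat{\sigma}_N$ are the corresponding sample mean and sample standard deviation. Then the lower and upper one-sided [100(1 − α)%]/[100ρ%] tolerance limits are

$$L_N = \hat{\mu}_N - k_{N,1,\alpha,\rho}\,\hat{\sigma}_N \quad \text{and} \quad U_N = \hat{\mu}_N + k_{N,1,\alpha,\rho}\,\hat{\sigma}_N, \tag{6}$$

and the two-sided [100(1 − α)%]/[100ρ%] tolerance interval is

$$[L_N, U_N] = [\hat{\mu}_N - k_{N,2,\alpha,\rho}\,\hat{\sigma}_N,\; \hat{\mu}_N + k_{N,2,\alpha,\rho}\,\hat{\sigma}_N], \tag{7}$$

where

$$k_{N,1,\alpha,\rho} = \frac{t_{n-1;1-\alpha}(z_{\rho}\sqrt{n})}{\sqrt{n}}, \qquad k_{N,2,\alpha,\rho} = z_{(1+\rho)/2} \sqrt{\frac{(1 + n^{-1})(n-1)}{\chi^2_{n-1;\alpha}} \left[1 + \frac{n - 3 - \chi^2_{n-1;\alpha}}{2(n+1)^2}\right]},$$

$z_p$ is the $p$-th percentile of the standard normal distribution, $t_{d;p}(\delta)$ is the $p$-th percentile of the noncentral $t$ distribution with $d$ degrees of freedom and noncentrality parameter $\delta$, and $\chi^2_{d;p}$ is the $p$-th percentile of the chi-square distribution with $d$ degrees of freedom (in each case, the value with cumulative probability $p$). Note that the two-sided tolerance interval with tolerance factor $k_{N,2,\alpha,\rho}$ is an approximation. For other ways to approximate the tolerance factor, one can refer to Section 2.3 of Krishnamoorthy and Mathew (2009).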
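The approximate two-sided normal tolerance factor can be sketched in Python as follows, assuming it has the Howe-type form k = z_{(1+ρ)/2} [ (1 + 1/n)(n − 1)/χ² × (1 + (n − 3 − χ²)/(2(n + 1)²)) ]^{1/2}, with χ² taken as the value of a chi-square variable with n − 1 degrees of freedom at cumulative probability α. The helper names `chi2_cdf`, `chi2_quantile` and `howe_factor` are hypothetical, and the series-plus-bisection chi-square quantile is only a standard-library stand-in for a proper quantile routine.

```python
import math
from statistics import NormalDist


def chi2_cdf(x, df):
    """Chi-square CDF via the power series of the regularized incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_quantile(p, df):
    """Value x with chi2_cdf(x, df) = p, found by bisection."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def howe_factor(n, alpha, rho):
    """Approximate two-sided normal tolerance factor (assumed Howe-type form)."""
    z = NormalDist().inv_cdf((1.0 + rho) / 2.0)
    chi = chi2_quantile(alpha, n - 1)
    base = (1.0 + 1.0 / n) * (n - 1) / chi
    correction = 1.0 + (n - 3 - chi) / (2.0 * (n + 1) ** 2)
    return z * math.sqrt(base * correction)


# Two-sided 95%/95% factor for several sample sizes; the interval is mean +/- k * sd.
for n in (10, 25, 50, 100):
    print(n, round(howe_factor(n, alpha=0.05, rho=0.95), 4))
```

The factor decreases toward the normal quantile z_{(1+ρ)/2} as n grows, which reflects the shrinking uncertainty in the estimated mean and standard deviation.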

Note that tolerance intervals under logistic distribution cannot be calculated if
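(4) Laplace distribution: The PDF and CDF of a Laplace distribution with location parameter $\mu_P$ and scale parameter $\sigma_P$ are, respectively,

$$f_P(x; \mu_P, \sigma_P) = \frac{1}{2\sigma_P} \exp\left(-\frac{|x - \mu_P|}{\sigma_P}\right)$$

and

$$F_P(x; \mu_P, \sigma_P) = \begin{cases} \dfrac{1}{2} \exp\left(\dfrac{x - \mu_P}{\sigma_P}\right), & x < \mu_P, \\[2ex] 1 - \dfrac{1}{2} \exp\left(-\dfrac{x - \mu_P}{\sigma_P}\right), & x \geq \mu_P, \end{cases}$$

where $-\infty < x < \infty$, $-\infty < \mu_P < \infty$ and $\sigma_P > 0$.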
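(5) Gamma distribution: The PDF and CDF of the gamma distribution with parameters $\theta_G$ and $\beta_G$ are, respectively,

$$f_G(x; \theta_G, \beta_G) = \frac{x^{\theta_G - 1} e^{-x/\beta_G}}{\Gamma(\theta_G)\, \beta_G^{\theta_G}} \tag{14}$$

and

$$F_G(x; \theta_G, \beta_G) = \frac{1}{\Gamma(\theta_G)} \int_0^{x/\beta_G} t^{\theta_G - 1} e^{-t}\, dt, \tag{15}$$

where $x > 0$, $\theta_G > 0$ is the shape parameter, $\beta_G > 0$ is the scale parameter, and $\Gamma(a) = \int_0^{\infty} t^{a-1} e^{-t}\, dt$ is the gamma function. For the gamma distribution, the tolerance intervals can be obtained through the normal tolerance interval by considering a transformation of the random variable (Krishnamoorthy et al. 2008). Suppose $X$ is a gamma random variable with PDF and CDF in Eqs. (14) and (15); then $X^{1/3}$ can be approximated by a normal distribution with mean $\mu_N$ and variance $\sigma_N^2$ defined as

$$\mu_N = (\theta_G \beta_G)^{1/3} \left(1 - \frac{1}{9\theta_G}\right) \quad \text{and} \quad \sigma_N^2 = \frac{\beta_G^{2/3}}{9\, \theta_G^{1/3}}. \tag{16}$$

Based on a random sample $X_1, X_2, \ldots, X_n$ from the gamma distribution, we first obtain the maximum likelihood estimates of the parameters $\theta_G$ and $\beta_G$, denoted as $\hat{\theta}_G$ and $\hat{\beta}_G$, respectively. Then, we substitute $\hat{\theta}_G$ and $\hat{\beta}_G$ for $\theta_G$ and $\beta_G$ in Eq. (16) to obtain $\hat{\mu}_N$ and $\hat{\sigma}_N^2$. After that, the one-sided and two-sided tolerance intervals for the normal distribution (the lower and upper limits are denoted as $L_N$ and $U_N$, respectively) can be obtained from Eqs. (6) and (7), respectively, based on $\hat{\mu}_N$ and $\hat{\sigma}_N^2$. The lower and upper [100(1 − α)%]/[100ρ%] tolerance limits based on the gamma distribution can be obtained as

$$L_G = L_N^3 \quad \text{and} \quad U_G = U_N^3.$$

(6) Weibull distribution: The PDF and CDF of the Weibull distribution with parameters $\beta_W$ and $\theta_W$ are, respectively,

$$f_W(x; \theta_W, \beta_W) = \frac{\theta_W}{\beta_W} \left(\frac{x}{\beta_W}\right)^{\theta_W - 1} \exp\left[-\left(\frac{x}{\beta_W}\right)^{\theta_W}\right]$$

and

$$F_W(x; \theta_W, \beta_W) = 1 - \exp\left[-\left(\frac{x}{\beta_W}\right)^{\theta_W}\right],$$

where $x > 0$, $\theta_W > 0$ is the shape parameter and $\beta_W > 0$ is the scale parameter.
Based on a random sample $X_1, X_2, \ldots, X_n$ from the Weibull distribution, we first obtain the maximum likelihood estimates of the parameters $\theta_W$ and $\beta_W$, denoted as $\hat{\theta}_W$ and $\hat{\beta}_W$, respectively. Then, the lower and upper one-sided tolerance limits, $L_W$ and $U_W$, are obtained from expressions in $\hat{\theta}_W$, $\hat{\beta}_W$ and $\lambda_\rho = \ln(-\ln(\rho))$. A two-sided tolerance interval based on the Weibull distribution can be obtained by replacing $\alpha$ by $\alpha/2$ and $\rho$ by $(\rho + 1)/2$ in the formulas for computing $L_W$ and $U_W$.

(7) Lognormal distribution: The PDF and CDF of the lognormal distribution with parameters $\mu_{LN}$ and $\sigma_{LN}$ are, respectively,

$$f_{LN}(x; \mu_{LN}, \sigma_{LN}) = \frac{1}{x\, \sigma_{LN} \sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu_{LN})^2}{2\sigma_{LN}^2}\right]$$

and

$$F_{LN}(x; \mu_{LN}, \sigma_{LN}) = \int_0^{x} f_{LN}(t; \mu_{LN}, \sigma_{LN})\, dt,$$

where $x > 0$, $\sigma_{LN} > 0$ is the shape parameter (and is the standard deviation of the log of the random variable), and $\mu_{LN} \in (-\infty, \infty)$ is the mean of the log of the random variable, so that $\exp(\mu_{LN})$ is the scale parameter (and is also the median of the distribution).
Based on a random sample $X_1, X_2, \ldots, X_n$ from the lognormal distribution, we can obtain the maximum likelihood estimates of the parameters $\mu_{LN}$ and $\sigma_{LN}$, denoted as $\hat{\mu}_{LN}$ and $\hat{\sigma}_{LN}$, respectively. Then, the one-sided and two-sided tolerance intervals for the normal distribution (the lower and upper limits are denoted as $L_N$ and $U_N$, respectively) can be obtained from Eqs. (6) and (7), respectively, based on $\hat{\mu}_{LN}$ and $\hat{\sigma}_{LN}^2$. The tolerance intervals based on the lognormal distribution can be computed using the fact that $Y = \ln X$ follows a normal distribution if $X$ follows a lognormal distribution, i.e., the lower and upper [100(1 − α)%]/[100ρ%] tolerance limits based on the lognormal distribution can be obtained as

$$L_{LN} = \exp(L_N) \quad \text{and} \quad U_{LN} = \exp(U_N).$$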
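The transformation-based limits for the gamma and lognormal distributions can be sketched as below. The sketch assumes the cube-root normal approximation with mean (θβ)^{1/3}(1 − 1/(9θ)) and variance β^{2/3}/(9θ^{1/3}) for the gamma case, and it takes the normal-scale limits L_N and U_N as given inputs. All function names are hypothetical, and clamping a negative normal-scale lower limit at zero before cubing is an added assumption for this illustration.

```python
import math


def cube_root_normal_moments(shape, scale):
    """Approximate mean and variance of X**(1/3) for X ~ gamma(shape, scale)."""
    mean = (shape * scale) ** (1.0 / 3.0) * (1.0 - 1.0 / (9.0 * shape))
    var = scale ** (2.0 / 3.0) / (9.0 * shape ** (1.0 / 3.0))
    return mean, var


def gamma_limits_from_normal(lower_n, upper_n):
    """Back-transform normal-scale limits for X**(1/3) to the gamma scale."""
    # Assumption for this sketch: a negative lower limit is clamped at zero,
    # since a gamma random variable is positive.
    return max(lower_n, 0.0) ** 3, upper_n ** 3


def lognormal_limits_from_normal(lower_n, upper_n):
    """Back-transform normal-scale limits for ln(X) to the lognormal scale."""
    return math.exp(lower_n), math.exp(upper_n)


# Illustrative values only (not taken from the data sets in this paper).
mean_n, var_n = cube_root_normal_moments(shape=8.0, scale=1.0)
print(round(mean_n, 4), round(var_n, 4))
print(gamma_limits_from_normal(1.2, 2.9))
print(lognormal_limits_from_normal(0.5, 3.0))
```

Because both back-transformations are increasing functions, the coverage and confidence of the normal-scale interval carry over to the original scale, up to the accuracy of the normal approximation in the gamma case.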

Available statistical software packages
Several statistical software packages provide procedures for the computation of tolerance intervals. In this subsection, we discuss several commonly used statistical software packages, including JMP (JMP Version 16, 2021), Minitab (Minitab 18 Statistical Software, 2017), NCSS (NCSS 2021 Statistical Software, 2021), Python (Python Core Team, 2015), R (R Core Team, 2020), and SAS (SAS Institute Inc, 2014), that provide computational procedures to calculate tolerance intervals based on various distributions.
All these six software packages discussed here provide computational procedures of tolerance intervals for normal distribution and nonparametric tolerance intervals. In R (R Core Team, 2020), the package tolerance (Young 2010; provides the with the 'HM' method, i.e., method = "HM"). For lognormal distribution, the formula used in Minitab corresponds to the R function normtol.int in the tolerance package with the 'EXACT' method and setting log.norm = TRUE (i.e., method = "EXACT", log.norm = T), while Python obtains the tolerance intervals based on log-transformation of the tolerance intervals for normal distribution. For the other distributions, however, Minitab and R use different computational formulas to obtain the tolerance intervals. The corresponding references for the formulas used in different software and the equivalence of the resulting tolerance intervals obtained from different software (grouping in parentheses) are summarized in Table 2.

Monte Carlo simulation studies
In this section, Monte Carlo simulation studies are used to evaluate the performance of tolerance intervals under different distributions in terms of the empirical confidence levels and population proportions of interest. Specifically, we evaluate the performance of tolerance intervals by assessing the closeness of the average empirical coverage C(L, U; θ) to ρ and of the empirical probability 1 − Pr[C(L, U; θ) ≥ ρ] to α. We consider both the case that the assumed distribution is the same as the true underlying distribution and the case that the assumed distribution is different from the true underlying distribution. In this simulation study, we generate random samples of size n from the statistical distribution F and compute the one- and two-sided tolerance intervals based on the distribution G, i.e., F is the true underlying distribution and G is the assumed distribution. As the true underlying distribution is usually unknown in practice, we compare the coverage of the tolerance intervals to determine their robustness under different distributions.
Here, we consider a simulation study for symmetric distributions (normal, Cauchy, logistic, and Laplace distributions) and a simulation study for skewed distributions (gamma, Weibull, and lognormal distributions). For symmetric distributions, we consider the standard distributions by setting the location parameter to be 0 and the scale parameter to be 1. For skewed distributions, we consider the parameter settings based on the parameter estimates in a real data example presented in Section 6.2 (see Table 25). Specifically, the following procedure is used in the Monte Carlo simulation study to evaluate the performance of the tolerance intervals for fixed values of α and ρ:
(i) Generate a random sample of size n, (x_1, x_2, ..., x_n), from the true underlying distribution F;
(ii) Compute the tolerance interval using the sample (x_1, x_2, ..., x_n) based on the assumed distribution G. The tolerance interval obtained in the h-th simulation is denoted as [L^(h), U^(h)];
(iii) Obtain the probability that a random variable following distribution F falls between the lower and upper limits, i.e., C(L^(h), U^(h); θ) = F(U^(h); θ) − F(L^(h); θ).
Steps (i)-(iii) are repeated for h = 1, 2, ..., M. The simulation results are based on M = 10000, except for the normal tolerance intervals in R with the 'EXACT' and 'OCT' methods, for which M = 1000 is used because of the long computation time of these exact procedures.
For each setting, the following quantities are computed for comparison purposes: α̂, the proportion of the M simulated tolerance intervals whose coverage C(L^(h), U^(h); θ) is smaller than ρ, i.e., the empirical value of 1 − Pr[C(L, U; θ) ≥ ρ]; ρ̂, the average of the M simulated coverages; and ŝ, the standard deviation of the M simulated coverages. We consider n = 10, 25, 50 and 100, α = 0.01, 0.05, 0.1 and 0.2, and ρ = 0.9, 0.95, 0.99 and 0.995 in both the simulation study for symmetric distributions and the simulation study for skewed distributions. If the tolerance intervals perform as expected, the value of α̂ should be close to the corresponding α, with a smaller value of α̂ being preferred, and the value of ρ̂ should be close to the corresponding ρ, with a larger value of ρ̂ being preferred. Moreover, the tolerance interval that gives a smaller value of ŝ is preferred. To make it easier to assess the performance of different tolerance intervals and to take into account the Monte Carlo simulation errors, in the tables for those simulation results, we highlight in bold those values of α̂ within ±2√(α(1 − α)/M) of α and those values of ρ̂ within ±2√(ρ(1 − ρ)/M) of ρ.

Table 2 References for the formulas used in different software
Normal: (Howe 1969), (Wald and Wolfowitz 1946), (Weissberg and Beatty 1969)
Nonparametric: (Faulkenberry and Daly 1970), (Wilks 1941b), (Robbins 1944), (Krishnamoorthy and Mathew 2009), (Hahn and Meeker 1991), (Bury 1999), (Wald 1943), (Wilks 1941a), (Young and ...
Smallest extreme value: (Lawless 1975), (Bain and Engelhardt 1981), (Coles 2001)
Weibull: (Lawless 1975), (Bain and Engelhardt 1981), (Coles 2001)
Largest extreme value: (Lawless 1975), (Bain and Engelhardt 1981), (Coles 2001)
Logistic: (Bain and Engelhardt 1991), (Balakrishnan 1992), (Hall 1975)
Loglogistic: (Bain and Engelhardt 1991), (Balakrishnan 1992), (Hall 1975)
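The simulation procedure can be sketched in Python as follows, under several simplifying assumptions: the assumed distribution G is normal, the interval has the form x̄ ± ks, and k = 3.379 is used as the commonly tabulated 95%/95% two-sided normal factor for n = 10 (an illustrative constant, not a value reported in this paper). The three summaries are computed as the proportion of coverages below ρ, the mean coverage, and the standard deviation of the coverages, which is one reading of α̂, ρ̂ and ŝ. All function names are hypothetical.

```python
import math
import random
import statistics

STD_NORMAL = statistics.NormalDist()


def draw_normal(rng, n):
    """Sample of size n from the standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def draw_cauchy(rng, n):
    """Sample of size n from the standard Cauchy distribution (inverse CDF)."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def cauchy_cdf(x):
    return 0.5 + math.atan(x) / math.pi


def normal_interval(sample, k):
    """Two-sided interval mean +/- k * sd computed under the assumed normal model G."""
    m = statistics.fmean(sample)
    s = statistics.stdev(sample)
    return m - k * s, m + k * s


def simulate(draw, true_cdf, k, n, rho, M, seed=0):
    """Steps (i)-(iii) repeated M times; returns (alpha_hat, rho_hat, s_hat)."""
    rng = random.Random(seed)
    contents = []
    for _ in range(M):
        sample = draw(rng, n)                               # step (i): sample from F
        lower, upper = normal_interval(sample, k)           # step (ii): interval under G
        contents.append(true_cdf(upper) - true_cdf(lower))  # step (iii): coverage under F
    alpha_hat = sum(c < rho for c in contents) / M
    rho_hat = statistics.fmean(contents)
    s_hat = statistics.stdev(contents)
    return alpha_hat, rho_hat, s_hat


def within_mc_band(estimate, target, M):
    """True if the estimate lies within 2 * sqrt(target * (1 - target) / M) of the target."""
    return abs(estimate - target) <= 2.0 * math.sqrt(target * (1.0 - target) / M)


# F = G = normal versus F = Cauchy with G = normal (model misspecification).
print(simulate(draw_normal, STD_NORMAL.cdf, k=3.379, n=10, rho=0.95, M=2000, seed=1))
print(simulate(draw_cauchy, cauchy_cdf, k=3.379, n=10, rho=0.95, M=2000, seed=1))
```

Under the correct model the first summary should land near the nominal α, whereas the Cauchy case illustrates how the same interval can miss the required coverage far more often than α.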

Simulation results and discussions
The simulation results under different settings when the assumed distribution is the same as the underlying distribution (i.e., F = G) are presented in Tables 3, 4, 5, 6, 7, 8 and 9. When the assumed distribution and the true underlying distribution are the same, we would expect the value of α̂ to be close to α and the value of ρ̂ to be close to ρ. However, we observe from Tables 3, 4, 5, 6, 7, 8 and 9 that when the sample size n is small, α̂ can be larger than α even under the correct model assumption. For example, in Table 4, when the underlying distribution is Cauchy with PDF and CDF in Eqs. (8) and (9), α = 0.05, ρ = 0.9 and n = 10, the value of α̂ is 0.1500. For moderate to large sample sizes (i.e., n = 50 and n = 100), the values of α̂ are close to or even smaller than the values of α in most cases. For ρ̂, we observe that the values are always greater than ρ under the correct model assumption. For the standard deviation ŝ, the value decreases as the sample size n increases. To save space, we only present some representative simulation results under different settings when the assumed distribution is different from the underlying distribution (i.e., F ≠ G) in Tables 10, 11, 12 and 13, and the simulation results for other settings are presented in the Appendix (Tables 26-38). From Tables 10 and 11, we observe that the tolerance intervals computed under Cauchy and logistic distributions are robust to model misspecification when the true underlying distribution is normal: in these tables, the values of α̂ are less than or equal to α and the values of ρ̂ are larger than ρ. However, the simulation results show that the tolerance intervals are not robust under model misspecification in general.
In Table 12, when the underlying true distribution is Cauchy (F: Cauchy) and the tolerance intervals are computed based on assuming a normal distribution (G: Normal), the performance of the tolerance intervals may not be satisfactory in terms of the closeness of α̂ and ρ̂ to α and ρ, respectively. For example, in Table 12, when ρ = 0.99, α = 0.1 and n = 50, the value of α̂ is 0.7921, which is much larger than the desired level α = 0.1, and the value of ρ̂ is 0.9696, which is smaller than the specified proportion ρ = 0.99. Similar observations are obtained based on the results presented in Table 13 and Tables 26-38 in the Appendix for both symmetric and asymmetric distributions.

Table 3 Performance of tolerance intervals based on normal distribution when the true distribution is normal (F = G: Normal)

Based on the simulation results in this section, when the true distribution is different from the assumed distribution, the parametric tolerance intervals can be sensitive to the model misspecification, and the performance of the tolerance intervals can be problematic in terms of the covering proportion ρ and the confidence level 1 − α. Hence, it is desirable to develop an appropriate approach to compute the tolerance interval when the true underlying distribution is unknown. To address the issue of model uncertainty in practice, one plausible solution is to use the nonparametric tolerance interval, which does not require a distributional assumption. However, a simulation study of the performance of the nonparametric tolerance interval under the four symmetric distributions considered here (results are presented in the Appendix, Tables 39-42) shows that the nonparametric tolerance intervals do not perform as well as the parametric tolerance intervals computed under the correct distributional assumption, i.e., G = F, and the values of α̂ and ρ̂ can be far from the pre-specified values.
For the aforementioned reasons, we propose a model selection approach when there are potential candidate distributions under consideration.

Model selection based on maximum likelihood
In this section, we propose a simple model selection approach based on the maximum likelihood for the construction of tolerance intervals under model uncertainty in order to reduce the negative effect of model misspecification. We calculate the maximum likelihood of each candidate distribution and choose the distribution that has the largest likelihood. In other words, when the true distribution is unknown, we choose the candidate distribution under which the observed data are most likely. Then, we calculate the tolerance interval based on the selected distribution. The proposed model selection approach is summarized as follows:
(1) Based on the random sample $x_1, x_2, \ldots, x_n$, compute the value of the maximum log-likelihood for each of the candidate distributions. For a candidate distribution $D$ with PDF $f_D(x; \theta_D)$ and maximum likelihood estimate $\hat{\theta}_D$, the value of the maximum log-likelihood is

$$\hat{\ell}_D = \sum_{i=1}^{n} \ln f_D(x_i; \hat{\theta}_D).$$

For example, for the four symmetric distributions considered here, $\hat{\ell}_D$ is computed with $f_D$ taken as the PDF of the normal, Cauchy, logistic, and Laplace distributions in turn. The maximum log-likelihood based on the asymmetric distributions considered here can be computed in a similar manner.
(2) Select the distribution that gives the largest value of the maximum log-likelihood as the assumed distribution G and compute the tolerance interval based on distribution G.
Since the candidate models considered here have the same number of parameters, we use the values of the maximum likelihood for model selection. When the candidate models have different numbers of parameters, model selection criteria that penalize a model for having more parameters, such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC), can be utilized for model selection.
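The two steps above can be sketched in Python for a reduced candidate set. To keep the example free of numerical optimization, it assumes only two candidates, the normal and Laplace distributions, whose maximum likelihood estimates are available in closed form; the Cauchy and logistic candidates would require an iterative optimizer and are omitted here. The function names and the toy data are hypothetical.

```python
import math
import statistics


def normal_max_loglik(x):
    """Maximum log-likelihood under the normal model (MLE variance uses divisor n)."""
    n = len(x)
    mu = statistics.fmean(x)
    var = sum((xi - mu) ** 2 for xi in x) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def laplace_max_loglik(x):
    """Maximum log-likelihood under the Laplace model (median, mean absolute deviation)."""
    n = len(x)
    med = statistics.median(x)
    b = sum(abs(xi - med) for xi in x) / n
    return -n * (math.log(2.0 * b) + 1.0)


def select_model(x):
    """Step (1): compute each maximum log-likelihood; step (2): pick the largest."""
    logliks = {"normal": normal_max_loglik(x), "laplace": laplace_max_loglik(x)}
    best = max(logliks, key=logliks.get)
    return best, logliks


def aic(loglik, n_params):
    """Akaike's information criterion; smaller is better."""
    return 2.0 * n_params - 2.0 * loglik


# Toy data sets (illustrative only): one evenly spread, one with heavy tails.
print(select_model([-2.0, -1.0, 0.0, 1.0, 2.0]))
print(select_model([0.0, 0.0, 0.0, 0.0, 0.0, 0.1, -0.1, 10.0, -10.0]))
```

With an equal number of parameters in every candidate, ranking by AIC gives the same choice as ranking by the maximum log-likelihood, which is why the simpler criterion suffices in the setting described above.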

Monte Carlo simulation study
In this subsection, we perform a simulation study as described in Section 4 to compute the values of α̂, ρ̂ and ŝ based on the proposed model selection approach using maximum likelihood. For symmetric distributions, we consider the normal, Cauchy, logistic, and Laplace distributions as the candidate distributions. For skewed distributions, we consider the gamma, Weibull, and lognormal distributions as the candidate distributions. The simulation results under different settings are presented in Tables 14, 15, 16, 17, 18, 19 and 20. From these tables, we observe that the performance of the tolerance intervals computed based on the proposed model selection approach is not as good as that of the tolerance intervals computed when the assumed distribution and the true underlying distribution are the same. However, it is better than the performance of the tolerance intervals under model misspecification. For example, for 95%/99% tolerance intervals with sample size n = 50 when the true underlying distribution is Cauchy, the values of α̂, ρ̂ and ŝ are 0.0661, 0.9925 and 0.0016, respectively, when the assumed distribution is Cauchy (see Table 4); they are 0.7823, 0.9708 and 0.0224, respectively, when the assumed distribution is normal (see Table 12); and they are 0.1227, 0.9894 and 0.0129, respectively, when the proposed model selection approach is used (see Table 15). We can see that the proposed model selection approach can effectively reduce the risk of model misspecification in the computation of tolerance intervals.

Table 7 Performance of tolerance intervals based on gamma distribution when the true distribution is gamma (F = G: Gamma)

Moreover, the performance of the tolerance intervals based on the proposed model selection approach can be better than that of the nonparametric tolerance intervals (e.g., for the nonparametric 95%/99% tolerance intervals with sample size n = 50 when the true underlying distribution is Cauchy, the values of α̂, ρ̂ and ŝ are 0.3884, 0.9879 and 0.0145, respectively). However, compared with the nonparametric tolerance interval, the proposed model selection approach requires the specification of some suitable candidate distributions.

Illustrative examples
In this section, two numerical examples are used to compare the computations of tolerance intervals using different software packages and illustrate the proposed model selection approach.

Differences in flood levels data
In this example, we consider 33 differences in flood levels between two stations on the Fox River, which flows through Wisconsin. The data were originally gathered by Gumbel and Mustafi (1967) and were also discussed by Bain and Engelhardt (1973) and Puig and Stephens (2000). The dataset is presented in Table 21. We assume the data come from a normal distribution or a logistic distribution and compute the corresponding parametric tolerance intervals using JMP, Minitab, NCSS, Python, R, and SAS. For the computation in R, the functions normtol.int and logistol.int in the tolerance package with different method options are used to compute the parametric tolerance intervals based on the normal and logistic distributions, respectively, and the function nptol.int is used for the nonparametric tolerance interval (Young 2010). The 95%/95% tolerance intervals computed based on the data in Table 21 from different software packages are presented in Table 22. For tolerance intervals under the normal distribution, we observe that the resulting intervals from JMP, Minitab, SAS, and R with method = "EXACT" are the same, while the resulting intervals from NCSS, Python, and R with method = "HE" are the same. However, the tolerance intervals computed under the logistic distribution are different in Minitab and R.
To illustrate the proposed model selection approach, we consider the normal and logistic distributions as the candidate distributions. The maximum likelihood estimates of the parameters μ_N and σ_N for the normal distribution are 9.3536 and 4.0205, respectively, and the value of the maximum log-likelihood is -92.2417. The maximum likelihood estimates of the parameters μ_L and σ_L for the logistic distribution are 9.4048 and 2.3611, respectively, and the value of the maximum log-likelihood is -93.3586. Since the normal distribution gives the larger value of the maximum log-likelihood (-92.2417 versus -93.3586), we select the normal distribution over the logistic distribution, and hence, we report the tolerance interval computed based on the normal distribution for this data set.

Locomotive controls failure data
To illustrate the computation of tolerance intervals using Minitab, Python, and R for asymmetric distributions (JMP, NCSS, and SAS are not included since they only provide tolerance intervals based on the normal distribution), we consider a lifetime data set for locomotive controls. Nelson (1982) presented the miles to failure of 37 locomotive controls. This data set was also discussed by Krishnamoorthy and Xie (2011) and Yuan et al. (2018). The data set is presented in Table 23.
For illustrative purposes, we consider three commonly used lifetime distributions, the gamma, Weibull, and lognormal distributions, as candidate models for the lifetime of locomotive controls. The 95%/95% tolerance intervals under the gamma, Weibull, and lognormal distributions are presented in Table 24. To apply the proposed model selection approach, we compute the maximum likelihood estimates of the model parameters and the corresponding values of the maximum log-likelihood under the gamma, Weibull, and lognormal distributions, and present the results in Table 25. Since the Weibull distribution gives the largest likelihood among the three candidate models, we select the Weibull distribution and report the tolerance interval based on the Weibull distribution. Based on the results from R, the 95%/95% tolerance interval based on the Weibull distribution is (23.884, 171.782). That is, we are 95% confident that at least 95% of the locomotive controls will have lifetimes between 23884 miles and 171782 miles. If this does not satisfy the requirements of the railroad company, then the reliability of the locomotive controls needs to be improved in the manufacturing process.

Concluding remarks
In this paper, we discuss the computation of tolerance intervals available in commonly used statistical software packages including JMP, Minitab, NCSS, Python, R, and SAS. We evaluate the performance of tolerance intervals using Monte Carlo simulation under model misspecification by considering four symmetric distributions (normal, Cauchy, logistic, and Laplace) and three asymmetric distributions (gamma, Weibull, and lognormal). We observe that the performance of parametric tolerance intervals can be sensitive to model misspecification. Therefore, when the true underlying distribution is unknown and some candidate distributions are available, we propose a simple model selection approach and show that the proposed approach can effectively reduce the negative effect of misspecifying the underlying distribution on the performance of tolerance intervals. The computation of the tolerance intervals using different statistical software packages and the proposed model selection approach are illustrated by two numerical examples. For future research, we can compare the performance of the tolerance intervals obtained from different software packages with complete and incomplete data.

Table 22 The 95%/95% tolerance intervals based on the data in Table 21 computed using different software packages
Table 24 The 95%/95% tolerance intervals for the data set in Table 23 computed
Table 26 Performance of tolerance intervals based on logistic distribution when the true distribution is normal (G: Logistic; F: Normal)