Open Access

Quantifying spread in three-dimensional rotation data: comparison of nonparametric and parametric techniques

Journal of Statistical Distributions and Applications 2015, 2:9

DOI: 10.1186/s40488-015-0032-x

Received: 27 July 2015

Accepted: 28 September 2015

Published: 6 October 2015

Abstract

While there have been recent advances in distributional theory and parametric inference for 3-D rotation data, applications still exist where it is difficult to identify an appropriate distribution for modeling. In these instances, nonparametric inference may be preferred. In this paper, a measure of spread for 3-D rotation data, called the average misorientation angle, is introduced and bootstrapping is developed for this measure. Existing parametric inference methods for estimating spread in 3-D rotations are compared to the bootstrapping procedure through a simulation study. The bootstrapping technique is then used in a materials science application where existing distributions do not appear to provide an adequate fit.

Keywords

3-D rotations; Bootstrapping; Matrix Fisher distribution; UARS distributions

Mathematics Subject Classification (MSC)

62G09; 62G05; 62G15

Introduction

Data in the form of three-dimensional rotations are common in the areas of materials science (e.g., crystal orientations in metals, Demirel et al. 2000; Wilson and Spanos 2001) and human kinematics (Rancourt et al. 2000; Rivest et al. 2008; Haddou et al. 2010). Despite the prevalence of 3-D rotations in such fields, distributional developments for such data have remained rather stagnant over the years. Most works regarding 3-D rotations rely on the matrix Fisher distribution (Khatri and Mardia 1977; Jupp and Mardia 1979; Prentice 1986; Mardia and Jupp 2000; Rancourt et al. 2000) which was introduced by Downs (1972). León et al. (2006) introduced the Cayley distribution for 3-D rotations several years later. Recognizing the limitations of existing distributions for 3-D rotations, Bingham et al. (2009a) developed the Uniform Axis-Random Spin (UARS) class of distributions. While the UARS class provides flexibility in modeling 3-D rotations that was not available before its development, applications may still exist where it is difficult to identify an appropriate member of the UARS class for modeling (see Section 5 for particular examples). In such cases, using nonparametric methods for estimation may be advantageous. This paper focuses specifically on estimation of the spread in 3-D rotations and two distributions for 3-D rotations, the matrix Fisher and von Mises version of the UARS class, are studied primarily. The next section gives an overview of these two distributions.

Overview of distributions for 3-D rotations

The matrix Fisher distribution is the most commonly used distribution for 3-D rotations (Khatri and Mardia 1977; Jupp and Mardia 1979) and the symmetric version of it is a member of the UARS class developed by Bingham et al. (2009a). Therefore, the following discussion of distributions for 3-D rotations is facilitated through the UARS class.

A random rotation O ∈ SO(3), where SO(3) denotes the set of all 3×3 orthogonal rotation matrices, from a UARS distribution with center at matrix S can be written as O = SP, where
$$\mathbf{P}=\mathbf{U}\mathbf{U}^{T}+\left(\mathbf{I}_{3\,\times\,3}-\mathbf{U}\mathbf{U}^{T}\right) \cos r+\left(\begin{array} {ccc}0 & -u_{3} & u_{2} \\ u_{3} & 0 & -u_{1} \\ -u_{2} & u_{1} & 0 \end{array} \right)\sin r \in \mathrm{SO(3)} $$
is obtained by rotating the 3×3 identity matrix, \(\mathbf{I}_{3\times 3}\), about an axis \(\mathbf {U}=(u_{1},u_{2},u_{3})^{T} \in \mathbb {R}^{3}\) by a random angle r ∈ (−π,π]. Here U is uniformly distributed on the unit sphere and r follows some circular distribution that is symmetric about 0 with spread depending on a parameter κ. The parameter κ is called a concentration parameter, with larger values indicating less spread in the rotations.
Suppose the circular distribution for r ∈ (−π,π] has density C(r|κ). Then a matrix density for O ∼ UARS(S,κ) is given by
$$ f(\mathbf{o}|\mathbf{S}, \kappa)= \frac{4\pi}{3-tr(\mathbf{S}^{T}\mathbf{o})} C\left(\arccos \left[2^{-1}(tr(\mathbf{S}^{T} \mathbf{o})-1)\right]| \kappa \right), \ \ \mathbf{o}\in SO(3) $$
(1)

with respect to the Haar measure (Bingham et al. 2009a).

A particular distribution for 3-D rotations is obtained by specifying the circular density C(r|κ). For the matrix Fisher distribution,
$$ C(r|\kappa)={\frac{(1-\cos r)\exp(2\kappa \cos r)}{2\pi \left(I_{0}(2\kappa)-I_{1}(2\kappa)\right)}}, \ r \in (-\pi,\pi] $$
(2)

where \(I_{i}\) denotes the modified Bessel function of the first kind of order i. For large κ, this density is approximately the Maxwell-Boltzmann density with scale parameter \(1/\sqrt {2\kappa }\) (Bingham et al. 2010).
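As a brief sketch of why this approximation is plausible (a standard small-angle argument, not a derivation taken from the cited work): for large κ the mass of (2) concentrates near r=0, where \(1-\cos r \approx r^{2}/2\) and \(\cos r \approx 1 - r^{2}/2\), so that

$$ C(r|\kappa) \propto (1-\cos r)\exp(2\kappa \cos r) \approx \tfrac{1}{2}\, e^{2\kappa}\, r^{2} \exp\left(-\kappa r^{2}\right). $$

A Maxwell-Boltzmann density with scale parameter a is proportional to \(x^{2}\exp\left(-x^{2}/(2a^{2})\right)\) for x>0. Matching the exponents for x=|r| gives \(1/(2a^{2})=\kappa\), that is, \(a=1/\sqrt{2\kappa}\).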

The von Mises circular distribution is one of the most commonly used circular distributions due to the fact that it approaches the normal distribution as κ → ∞ (Fisher 1996). The von Mises distribution has density
$$ C(r|\kappa)=\left[2\pi I_{0}(\kappa) \right]^{-1} \exp \left[\kappa\cos(r) \right],\ r\in \left(-\pi,\pi \right] $$
(3)

and corresponds to the von Mises version of the UARS class (vM-UARS) studied by Bingham et al. (2009a) and Bingham et al. (2009b).
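The construction O = SP with a von Mises spin can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the cited works: the function names `rotation_from_axis_angle` and `sample_vm_uars` are hypothetical, the uniform axis is drawn by normalizing a standard normal vector, and the angle comes from NumPy's von Mises generator.

```python
import numpy as np

def rotation_from_axis_angle(u, r):
    """Rotation by angle r about the unit axis u (the matrix P in the text)."""
    u = np.asarray(u, dtype=float)
    uut = np.outer(u, u)
    cross = np.array([[0.0, -u[2], u[1]],
                      [u[2], 0.0, -u[0]],
                      [-u[1], u[0], 0.0]])
    return uut + (np.eye(3) - uut) * np.cos(r) + cross * np.sin(r)

def sample_vm_uars(n, kappa, S=None, seed=None):
    """Draw n rotations O = S P with a uniform axis and a von Mises spin angle."""
    rng = np.random.default_rng(seed)
    S = np.eye(3) if S is None else np.asarray(S, dtype=float)
    out = np.empty((n, 3, 3))
    for i in range(n):
        u = rng.normal(size=3)
        u /= np.linalg.norm(u)          # uniform direction on the unit sphere
        r = rng.vonmises(0.0, kappa)    # spin angle, symmetric about 0
        out[i] = S @ rotation_from_axis_angle(u, r)
    return out

sample = sample_vm_uars(100, kappa=20.0, seed=1)
print(sample.shape)
```

Sampling from the matrix Fisher version would need a different generator for the angle density in (2); that step is not shown here.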

In the next section, a possible measure of spread for 3-D rotations is introduced and the steps for obtaining a bootstrap confidence interval are outlined. The matrix Fisher and vM-UARS distributions are then used in a simulation study in Section 4 to compare the bootstrapping technique to existing parametric approaches in the literature.

Quantifying spread: the average misorientation angle and bootstrapping

In the literature, quantifying spread is typically tied to a specific distribution through estimation of the spread parameter for that distribution. For example, spread in the matrix Fisher distribution is assessed by estimating the concentration parameter κ. A broadly defined point estimate of spread for 3-D rotations, one that can be used regardless of distribution, is therefore needed. Suppose that \(\mathbf{O}_{1},\ldots,\mathbf{O}_{n} \in SO(3)\). The mean rotation, M, is a commonly used measure of center (Khatri and Mardia 1977; León et al. 2006; Bingham et al. 2009a) which is defined as the rotation that maximizes \(\text {trace} \left (\mathbf {M}^{T} \bar {\mathbf {O}}\right)\), where \(\bar {\mathbf {O}}=\frac {1}{n}\sum _{i=1}^{n} \mathbf {O}_{i}\). The mean rotation M can be found by using M = VW, where \(\bar {\mathbf {O}}=\mathbf {V}\boldsymbol {\Sigma }\mathbf {W}\) is the singular value decomposition of \(\bar {\mathbf {O}}\). Distance between each rotation \(\mathbf{O}_{i}\), i=1,…,n, in a data set and the mean rotation M can be measured by the misorientation angle (in radians) between the two rotations, calculated as
$$ \operatorname{mis}(\mathbf{O}_{i},\mathbf{M})=\arccos\left(\frac{\text{trace}(\mathbf{O}'_{i}\mathbf{M})-1}{2}\right) $$
(4)
where \(\mathbf{O}'_{i}\) is the transpose of \(\mathbf{O}_{i}\). This misorientation angle is the smallest angle of rotation needed to get from \(\mathbf{O}_{i}\) to M via a spin about some axis. Now, the overall spread in the data set \(\mathbf{O}_{1},\ldots,\mathbf{O}_{n}\) can be taken to be the average misorientation angle (AMA), \(\frac {1}{n} \sum _{i=1}^{n} \operatorname {mis}(\mathbf {O}_{i},\mathbf {M})\). For illustration purposes, Fig. 1 shows two different 3-D rotation data sets of size n=100 with (a) AMA = 0.2013 and (b) AMA = 0.0595. Here the 3-D rotations are plotted as points on the sphere, with one observation represented by three points that correspond to three orthogonal axes.
Fig. 1

Plot of two simulated 3-D rotation data sets with AMA of (a) 0.2013 and (b) 0.0595
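The sample AMA built from the mean rotation and (4) can be computed as in the following sketch. The function names are hypothetical, the data are assumed to be stored as an array of shape (n, 3, 3), and the determinant sign correction in `mean_rotation` is an added safeguard that is not discussed in the text. NumPy's `svd` returns the right factor already transposed, which corresponds to W in the notation used here.

```python
import numpy as np

def mean_rotation(rotations):
    """Rotation M maximizing trace(M^T O_bar), obtained from the SVD of O_bar."""
    o_bar = np.mean(rotations, axis=0)
    v, _, w = np.linalg.svd(o_bar)      # o_bar = v @ diag(s) @ w
    # Added safeguard: force det(M) = +1. For concentrated data d equals 1
    # and M reduces to v @ w, as in the text.
    d = np.sign(np.linalg.det(v @ w))
    return v @ np.diag([1.0, 1.0, d]) @ w

def misorientation_angle(o, m):
    """Equation (4): angle in radians between rotations o and m."""
    c = (np.trace(o.T @ m) - 1.0) / 2.0
    return np.arccos(np.clip(c, -1.0, 1.0))  # clip guards against rounding

def average_misorientation_angle(rotations):
    """Sample AMA: mean misorientation angle to the mean rotation."""
    m = mean_rotation(rotations)
    return float(np.mean([misorientation_angle(o, m) for o in rotations]))
```

For two rotations about the same axis by +0.1 and −0.1 radians, the mean rotation is the identity and each misorientation angle is 0.1, so the AMA is 0.1.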

While (4) is used to calculate the AMA from a sample, the population AMA for a given UARS distribution can be obtained by considering the circular density, C(r|κ). Since C(r|κ), r ∈ (−π,π], is symmetric about 0, E(r)=0. However, we can think of all angles, r, as being positive since a spin of r about a vector V is equivalent to a spin of −r about vector −V. Therefore, we can find the population AMA for a particular UARS distribution with given κ and C(r|κ) as
$$ \text{AMA}=\int^{\pi}_{0}2rC(r|\kappa)dr. $$
(5)

In the instances of C(r|κ) in (2) and (3), this does not have a closed form.
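Since no closed form is available, (5) can be evaluated numerically. The sketch below is one possible approach, not necessarily the one used for the results reported in this paper: it applies the trapezoid rule on a fine grid and renormalizes the density on that grid, so the Bessel-function constants in (2) and (3) are never needed. The function names and the grid size are assumptions.

```python
import numpy as np

def population_ama(weight, kappa, n_grid=200_001):
    """Evaluate (5) for a circular density known up to a constant factor."""
    r = np.linspace(0.0, np.pi, n_grid)
    w = weight(r, kappa)
    dr = r[1] - r[0]
    trap = lambda y: dr * (np.sum(y) - 0.5 * (y[0] + y[-1]))  # trapezoid rule
    # By symmetry 2 * int_0^pi C(r) dr = 1, so AMA = int r w dr / int w dr.
    return trap(r * w) / trap(w)

def w_matrix_fisher(r, kappa):
    # Numerator of (2), rescaled by exp(-2 kappa) to avoid overflow.
    return (1.0 - np.cos(r)) * np.exp(2.0 * kappa * (np.cos(r) - 1.0))

def w_von_mises(r, kappa):
    # Numerator of (3), rescaled by exp(-kappa) to avoid overflow.
    return np.exp(kappa * (np.cos(r) - 1.0))

for kappa in (1, 5, 20, 500):
    print(kappa,
          population_ama(w_matrix_fisher, kappa),
          population_ama(w_von_mises, kappa))
```

The printed values can be compared with those reported in Table 1.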

Now that the AMA has been introduced for quantifying the spread in a 3-D rotation data set, bootstrapping techniques are discussed. Although bootstrapping has been used to create confidence regions in a wide variety of settings, including analyzing directional data such as p-dimensional unit vectors (Fisher and Hall 1989), its application to 3-D rotation data is limited. Recent works have focused on estimation of the central rotation (Will and Bingham 2015; Stanfil et al. 2015), but no attention has been given to using bootstrapping for estimating spread. The steps for the bootstrapping technique follow.
  1. Resample from the 3-D rotation data set \(\mathbf{O}_{1},\ldots,\mathbf{O}_{n}\) with replacement.
  2. Calculate the AMA for the bootstrap sample obtained in step 1.
  3. Repeat steps 1 and 2 a large number of times (say, 1000).

After the AMA values are obtained for each bootstrap sample, this set can be used to construct a confidence interval. Under the bootstrap percentile method, a 95 % confidence interval is obtained by using the 2.5th and 97.5th percentiles as confidence bounds. Other methods, such as the central percentile method and bias-corrected method (Efron 1987), exist for obtaining confidence intervals from bootstrap samples, but these techniques showed no improvement over the simpler percentile method for the situations considered in the simulations that were conducted. In the next section, a simulation study is used to compare bootstrapping for the AMA to existing parametric methods for quantifying spread.
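The three bootstrap steps and the percentile interval can be sketched as follows. This is a minimal illustration assuming the data are stored as an array of shape (n, 3, 3); the function names, the compact AMA helper, and the use of `np.quantile` for the percentile bounds are choices made for this example.

```python
import numpy as np

def ama(rotations):
    """Sample AMA of an (n, 3, 3) array of rotation matrices."""
    o_bar = rotations.mean(axis=0)
    v, _, w = np.linalg.svd(o_bar)
    d = np.sign(np.linalg.det(v @ w))            # keep det(M) = +1
    m = v @ np.diag([1.0, 1.0, d]) @ w           # mean rotation
    tr = np.einsum("ijk,jk->i", rotations, m)    # trace(O_i^T M) for each i
    return float(np.mean(np.arccos(np.clip((tr - 1.0) / 2.0, -1.0, 1.0))))

def bootstrap_ama_ci(rotations, n_boot=1000, level=0.95, seed=None):
    """Bootstrap percentile confidence interval for the AMA."""
    rng = np.random.default_rng(seed)
    rotations = np.asarray(rotations, dtype=float)
    n = rotations.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):                      # step 3: repeat
        idx = rng.integers(0, n, size=n)         # step 1: resample with replacement
        stats[b] = ama(rotations[idx])           # step 2: AMA of the bootstrap sample
    alpha = 1.0 - level
    lo, hi = np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)
```

Note that the mean rotation is recomputed inside every bootstrap sample, so the variability of the estimated center is carried into the interval.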

Comparison of nonparametric and parametric methods via simulation

Simulations were conducted by using both the matrix Fisher and vM-UARS distributions. The values of κ used were 1, 5, 20, and 500, which give a broad range of spread from very concentrated (κ=500) to very diffuse (κ=1). Although existing works that use parametric methods to quantify spread focus on the parameter κ (Bingham et al. 2009b; Bingham et al. 2010), the bootstrapping considered here uses the AMA. Therefore, we convert between κ and the AMA. The AMA corresponding to each distribution and choice of κ was found using (5) and is displayed in Table 1.
Table 1

Population AMA for the matrix Fisher and vM-UARS distributions with the given κ

| κ | Matrix Fisher | vM-UARS |
|---|---|---|
| 1 | 1.407704 | 0.999947 |
| 5 | 0.521488 | 0.375360 |
| 20 | 0.254209 | 0.180353 |
| 500 | 0.050463 | 0.035697 |

For all simulations the population central matrix S from (1) was set to the identity matrix, since the choice of center does not influence the results when estimating spread. Sample sizes of n = 10, 30, and 100 were used. For each distribution and (κ,n) pair, 1000 different data sets were simulated. For each data set, 1000 bootstrap replications were used, resulting in a bootstrap percentile confidence interval for the AMA. The proportion of the 1000 intervals that contained the population AMA is given in Tables 2 and 3 as the nonparametric coverage rate. The median width of the 1000 intervals is also given in these tables. In addition to the width being reported in terms of the AMA, the endpoints of each interval were expressed in terms of κ by solving (5) for κ given the AMA, and the median width in terms of κ was then found. This conversion allows direct comparison to parametric methods discussed in the literature. Parametric results from other works are also provided in Tables 2 and 3 for ease of comparison between the nonparametric and parametric methods. The parametric results in Table 2 come from Bingham et al. (2010, Tables 3 and 4, pp. 1324–1325), and the parametric results in Table 3 come from Bingham et al. (2009b, Tables 3 and 4, pp. 616–617). The parametric results in these works were obtained by inversion of the likelihood ratio test (LRT).
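The conversion of interval endpoints from the AMA scale to the κ scale amounts to a one-dimensional root-finding problem, because the population AMA in (5) decreases as κ increases. The sketch below illustrates this for the von Mises density (3) only. The function names, the grid-based integration, the search bracket, and the use of SciPy's `brentq` are assumptions made for this example rather than the procedure actually used in the study.

```python
import numpy as np
from scipy.optimize import brentq

def vm_population_ama(kappa, n_grid=100_001):
    """Population AMA from (5) for the von Mises density (3), by trapezoid rule."""
    r = np.linspace(0.0, np.pi, n_grid)
    w = np.exp(kappa * (np.cos(r) - 1.0))        # density up to a constant
    num = np.sum(r * w) - 0.5 * (r[0] * w[0] + r[-1] * w[-1])
    den = np.sum(w) - 0.5 * (w[0] + w[-1])
    return num / den                             # grid spacing cancels

def kappa_from_ama(ama, lo=1e-6, hi=1e4):
    """Solve vm_population_ama(kappa) = ama for kappa within [lo, hi]."""
    return brentq(lambda k: vm_population_ama(k) - ama, lo, hi)

def ci_width_in_kappa(ama_lo, ama_hi):
    """Width on the kappa scale; the smaller AMA maps to the larger kappa."""
    return kappa_from_ama(ama_lo) - kappa_from_ama(ama_hi)

print(kappa_from_ama(0.999947))   # AMA reported in Table 1 for kappa = 1
```

Because the mapping is decreasing, the lower AMA endpoint becomes the upper κ endpoint, which is why the two are swapped when the width is computed.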
Table 2

Coverage rates and median interval widths for estimating spread of the matrix Fisher distribution via nonparametric and parametric methods using various (κ,n) pairs

| (κ,n) | Nonparametric coverage rate (%) | Nonparametric median width (AMA) | Nonparametric median width (κ) | Parametric coverage rate (%) | Parametric median width (κ) |
|---|---|---|---|---|---|
| (1,10) | 78.3 | 0.7061 | 1.3685 | 92.1 | 1.1621 |
| (1,30) | 89.5 | 0.4663 | 0.6709 | 95.0 | 0.6428 |
| (1,100) | 93.7 | 0.2635 | 0.3528 | 94.9 | 0.3475 |
| (5,10) | 82.2 | 0.2594 | 8.3839 | 91.1 | 5.6748 |
| (5,30) | 91.5 | 0.1584 | 3.3620 | 94.4 | 2.9355 |
| (5,100) | 93.2 | 0.0882 | 1.6469 | 94.9 | 1.5676 |
| (20,10) | 84.2 | 0.1257 | 33.4181 | 90.8 | 23.4370 |
| (20,30) | 89.7 | 0.0756 | 13.8188 | 95.1 | 12.0597 |
| (20,100) | 94.2 | 0.0419 | 6.7897 | 94.2 | 6.4390 |
| (500,10) | 83.7 | 0.0251 | 871.3614 | 92.2 | 591.3957 |
| (500,30) | 92.6 | 0.0151 | 346.8934 | 95.8 | 306.2049 |
| (500,100) | 94.2 | 0.0083 | 171.3630 | 95.0 | 162.3764 |

Table 3

Coverage rates and median interval widths for estimating spread of the vM-UARS distribution via nonparametric and parametric methods using various (κ,n) pairs

| (κ,n) | Nonparametric coverage rate (%) | Nonparametric median width (AMA) | Nonparametric median width (κ) | Parametric coverage rate (%) | Parametric median width (κ) |
|---|---|---|---|---|---|
| (1,10) | 87.6 | 0.7838 | 2.421 | 94.8 | 2.274 |
| (1,30) | 93.9 | 0.5299 | 1.259 | 94.8 | 1.231 |
| (1,100) | 94.4 | 0.2993 | 0.669 | 95.5 | 0.665 |
| (5,10) | 85.3 | 0.2885 | 12.686 | 94.4 | 9.940 |
| (5,30) | 94.5 | 0.1987 | 5.452 | 94.9 | 4.979 |
| (5,100) | 95.7 | 0.1139 | 2.7565 | 95.3 | 2.599 |
| (20,10) | 87.4 | 0.1379 | 54.665 | 94.0 | 42.09 |
| (20,30) | 93.7 | 0.0939 | 22.913 | 95.5 | 21.36 |
| (20,100) | 94.1 | 0.0532 | 11.543 | 94.6 | 11.22 |
| (500,10) | 88.7 | 0.0278 | 1365.507 | 92.6 | 1075.0 |
| (500,30) | 94.0 | 0.0183 | 598.926 | 95.2 | 536.4 |
| (500,100) | 94.9 | 0.0104 | 299.603 | 94.2 | 283.3 |

Table 4

Coverage rates and median interval widths for the AMA when fitting the matrix Fisher distribution to vM-UARS data

| (κ,n) | Coverage rate (%) | Median width (AMA) |
|---|---|---|
| (1,10) | 83.8 | 0.6305 |
| (1,30) | 82.8 | 0.3730 |
| (1,100) | 71.9 | 0.2072 |
| (5,10) | 67.3 | 0.2178 |
| (5,30) | 55.6 | 0.1285 |
| (5,100) | 25.5 | 0.0706 |
| (20,10) | 69.1 | 0.0992 |
| (20,30) | 54.7 | 0.0602 |
| (20,100) | 24.3 | 0.0333 |
| (500,10) | 66.2 | 0.0203 |
| (500,30) | 53.1 | 0.0118 |
| (500,100) | 21.4 | 0.0066 |

When considering the results for the matrix Fisher distribution provided in Table 2, we see that the coverage rates obtained through bootstrapping fail to reach nominal levels for the smaller sample sizes of n=10 and 30, but do well when n=100. The parametric approach produces coverage rates that are slightly too low (around 91–92 %) when n=10, with coverage rates around 95 % for larger samples. When considering the results for the vM-UARS distribution provided in Table 3, we see that the nonparametric rates are too small only when n=10, while the coverage rates for the parametric approach fluctuate around 95 % regardless of sample size. For both the matrix Fisher and vM-UARS distributions, the bootstrap intervals are slightly wider than the LRT intervals for small n, but comparable for larger sample sizes. Overall, the parametric methods outperform the nonparametric bootstrap for small sample sizes when the distributional assumptions are met (i.e., the correct distribution is fit to the data it was simulated from), but both perform as desired for larger samples.

Although the parametric methods perform well when the correct distribution is applied, this is not the case when the data are modeled incorrectly. In addition to obtaining the bootstrap results, the LRT method of Bingham et al. (2010) was used to fit the matrix Fisher distribution to each of the vM-UARS samples. Coverage rates and median widths of the resulting intervals for the AMA are given in Table 4. Coverage rates are not near the desired 95 % in any of the cases, and they fall to roughly 21–26 % when n=100 and κ ≥ 5. Only when the data are extremely spread (κ=1) are the rates moderately large. Therefore, bootstrapping far outperforms the parametric methods when distributional assumptions are not met.

In summary, for situations with small sample sizes, parametric methods are preferred to the bootstrap, provided that goodness-of-fit is investigated ahead of time to ensure the correct model is being used. See Bingham et al. (2009a) for a way to investigate goodness-of-fit for 3-D rotations through Q-Q plots. For larger sample sizes, the nonparametric methods perform just as well as the parametric methods when the correct distribution is used for modeling, with far better performance when an incorrect distribution is applied. Although the extensive simulations reported here focus on only two distributions for 3-D rotations, initial analysis was done using the Cayley distribution of León et al. (2006) and the Wrapped Trivariate Normal distribution of Qiu et al. (2014), with similar findings.

Application to electron backscatter diffraction data

Three-dimensional rotation data commonly arise in materials science when exploring the texture of a specimen of some polycrystalline material, such as metal. Through electron backscatter diffraction (EBSD), a fixed beam of electrons is diffracted off the (crystallographic) lattice planes of the polycrystalline specimen. The resulting images reveal information about texture or crystallographic preferred orientation (Randle 2003). An area of interest with regard to EBSD measurements is precision, as methods used for quantifying EBSD precision in the materials science literature are not standard. Bingham et al. (2009a) study precision in nickel and aluminum specimens by using parametric LRT methods, after showing that the vM-UARS distribution provides an adequate fit to the data sets in hand. However, since the paper by Bingham et al. (2009a) focused more on development of the UARS class than on the EBSD application, the data sets considered make up only a small subset of all EBSD data actually collected. When considering the larger data set of over 4,000 observations from a single scan on the nickel specimen, there are instances in which it is hard to find an adequate distributional fit to some subsets of the data.

When using EBSD, orientations close in proximity are classified as composing a grain when the misorientation angle between them is small, so that a grain is thought of as a homogeneous piece of material that produces observations which generally share a common orientation. Five different grains were isolated from the nickel specimen, giving samples of various sizes. Though crystallographic orientations are not generally a rotation but a coset of SO(3) (i.e., a set of crystal-symmetrically equivalent rotations), the data sets considered here have been preprocessed to eliminate the 24-fold ambiguity. It is also worth noting that the rotations will be considered as independent, when in fact there may be spatially induced dependence.

For each of the five samples, attempts were made to find an adequate distributional fit by examining Q-Q plots. The misorientation angles were extracted from each data set and various circular distributions were fit to the data (i.e., various forms of C(r|κ) in (1) were tried). The theoretical quantiles were plotted against the sample misorientation angles, and these Q-Q plots did not reveal any good distributional matches. Figure 2 gives two of the Q-Q plots for a grain of size n=38. It can be seen that neither the matrix Fisher nor the vM-UARS distribution seems to fit these data well. Note that these were just two of several distributions that were explored.
Fig. 2

Q-Q plots of two fitted distributions against misorientation angles from a grain with n=38
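One way such a Q-Q comparison could be set up for the von Mises form of C(r|κ) is sketched below. The text does not say how κ was fitted for these plots, so the likelihood equation used here, the plotting positions (i − 0.5)/n, and the names `fit_vm_kappa` and `qq_points` are assumptions for illustration only. The sketch treats the misorientation angles as absolute values of draws from (3) and requires that the mean of cos r lie strictly between 0 and 1.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import ive
from scipy.stats import vonmises

def fit_vm_kappa(angles):
    """Estimate kappa by solving I1(kappa)/I0(kappa) = mean(cos r)."""
    r_bar = float(np.mean(np.cos(angles)))
    # ive is the exponentially scaled Bessel function; the scaling cancels
    # in the ratio and avoids overflow for large kappa.
    return brentq(lambda k: ive(1, k) / ive(0, k) - r_bar, 1e-8, 1e8)

def qq_points(angles):
    """Return theoretical quantiles, sorted sample angles, and the fitted kappa."""
    x = np.sort(np.abs(np.asarray(angles, dtype=float)))
    n = x.size
    p = (np.arange(1, n + 1) - 0.5) / n          # plotting positions
    kappa = fit_vm_kappa(x)
    # For x in [0, pi], P(|r| <= x) = 2 F(x) - 1, with F the von Mises CDF
    # centered at 0, so the p-quantile of |r| is F^{-1}((1 + p) / 2).
    theo = vonmises.ppf((1.0 + p) / 2.0, kappa)
    return theo, x, kappa
```

Plotting the first returned array against the second gives the Q-Q plot; systematic departures from the 45-degree line would indicate a poor fit of the assumed angle distribution.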

Given that appropriate distributional forms for the five grains in consideration could not be found, it is best to quantify spread in these data sets through nonparametric methods. The bootstrap procedure of Section 3 was used on each of these data sets, resulting in a confidence interval for the AMA. These intervals are reported in both radians and degrees in Table 5. The within-grain precision estimates found here are comparable to the 0.5°−1° range reported in the literature (Demirel et al. 2000; Wilson and Spanos 2001; Bingham et al. 2009a).
Table 5

Confidence intervals for the AMA in both radians and degrees for five nickel grains of various sizes

| Grain | Sample size | CI for AMA (radians) | CI for AMA (degrees) |
|---|---|---|---|
| 1 | 38 | (0.00638, 0.00796) | (0.36555, 0.45607) |
| 2 | 47 | (0.00640, 0.00845) | (0.36669, 0.48415) |
| 3 | 42 | (0.00741, 0.01328) | (0.42456, 0.76089) |
| 4 | 34 | (0.00547, 0.01298) | (0.31341, 0.74369) |
| 5 | 27 | (0.00886, 0.01595) | (0.50764, 0.91387) |

Conclusion

While there have been recent advances in distributional theory and parametric inference for 3-D rotation data, instances still exist where it may be difficult to find an appropriate model to fit to a data set. As shown in Section 4, coverage rates for confidence intervals for estimating spread can fall below nominal levels when incorrect distributions are used. In such cases, nonparametric methods such as bootstrapping can provide a good alternative. However, for small samples (around n=10), applying the correct parametric procedure is better than using bootstrapping, provided that a good distributional fit can be found.

Declarations

Acknowledgments

This research was supported by NSF grant DMS-1104409.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

(1)
University of Wisconsin-La Crosse

References

  1. Bingham, MA, Nordman, DJ, Vardeman, SB: Modeling and inference for measured crystal orientations and a tractable class of symmetric distributions for rotations in three dimensions. J. Am. Stat. Assoc. 104, 1385–1397 (2009a).
  2. Bingham, MA, Vardeman, SB, Nordman, DJ: Bayes one-sample and one-way random effects analyses for 3-D orientations with application to materials science. Bayesian Anal. 4, 607–630 (2009b).
  3. Bingham, MA, Nordman, DJ, Vardeman, SB: Finite-sample investigation of likelihood and Bayes inference for the symmetric von Mises-Fisher distribution. Comput. Stat. Data Anal. 54, 1317–1327 (2010).
  4. Demirel, MC, El-Dasher, BS, Adams, BL, Rollett, AD: Studies on the accuracy of electron backscatter diffraction measurements. In: Schwartz, AJ, Mukul, K, Adams, BL (eds.) Electron backscatter diffraction in materials science. Kluwer Academic/Plenum Publishers, New York (2000).
  5. Downs, TD: Orientation statistics. Biometrika. 59, 665–676 (1972).
  6. Efron, B: Better bootstrap confidence intervals. J. Am. Stat. Assoc. 82, 171–185 (1987).
  7. Fisher, NI, Hall, P: Bootstrap confidence regions for directional data. J. Am. Stat. Assoc. 84, 996–1002 (1989).
  8. Fisher, NI: Statistical analysis of circular data. Cambridge University Press, New York (1996).
  9. Will, LK, Bingham, M: Bootstrap techniques for measures of center for three-dimensional rotation data (2015). To appear in Involve.
  10. Haddou, M, Rivest, L-P, Pierrynowski, M: A nonlinear mixed effects directional model for the estimation of the rotation axes of the human ankle. Ann. Appl. Stat. 4, 1892–1912 (2010).
  11. Jupp, PE, Mardia, KV: Maximum likelihood estimators for the matrix von Mises-Fisher and Bingham distributions. Ann. Stat. 7, 599–606 (1979).
  12. Khatri, CG, Mardia, KV: The Von Mises-Fisher matrix distribution in orientation statistics. J. R. Stat. Soc. Ser. B. 39, 95–106 (1977).
  13. León, CA, Massé, J-C, Rivest, L-P: A statistical model for random rotations. J. Multivariate Anal. 97, 412–430 (2006).
  14. Mardia, KV, Jupp, PE: Directional Statistics. Wiley, Chichester and New York (2000).
  15. Prentice, MJ: Orientation statistics without parametric assumptions. J. R. Stat. Soc. Ser. B. 48, 214–222 (1986).
  16. Qiu, Y, Nordman, DJ, Vardeman, SB: A wrapped trivariate normal distribution and Bayes inference for 3-D rotations. Stat. Sinica. 24, 897–917 (2014).
  17. Rancourt, D, Rivest, L-P, Asselin, J: Using orientation statistics to investigate variations in human kinematics. J. R. Stat. Soc. Ser. C. 49, 81–94 (2000).
  18. Randle, V: Microtexture determination and its applications. Maney for the Institute of Materials, Minerals and Mining, London (2003).
  19. Rivest, L-P, Baillargeon, S, Pierrynowski, M: A directional model for the estimation of the rotation axes of the ankle joint. J. Am. Stat. Assoc. 103, 1060–1069 (2008).
  20. Stanfil, B, Genschel, U, Hofmann, H, Nordman, D: Nonparametric confidence regions for the central orientation of random rotations. J. Multivariate Anal. 135, 106–116 (2015).
  21. Wilson, AW, Spanos, G: Application of orientation imaging microscopy to study phase transformations in steels. Mater. Charact. 46, 407–418 (2001).

Copyright

© Bingham. 2015