Quantifying spread in three-dimensional rotation data: comparison of nonparametric and parametric techniques
Journal of Statistical Distributions and Applications, volume 2, Article number: 9 (2015)
Abstract
While there have been recent advances in distributional theory and parametric inference for 3D rotation data, applications still exist where it is difficult to identify an appropriate distribution for modeling. In these instances, nonparametric inference may be preferred. In this paper, a measure of spread for 3D rotation data, called the average misorientation angle, is introduced and bootstrapping is developed for this measure. Existing parametric inference methods for estimating spread in 3D rotations are compared to the bootstrapping procedure through a simulation study. The bootstrapping technique is then used in a materials science application where existing distributions do not appear to provide an adequate fit.
Introduction
Data in the form of three-dimensional rotations are common in the areas of materials science (e.g., crystal orientations in metals, Demirel et al. 2000; Wilson and Spanos 2001) and human kinematics (Rancourt et al. 2000; Rivest et al. 2008; Haddou et al. 2010). Despite the prevalence of 3D rotations in such fields, distributional developments for such data have remained rather stagnant over the years. Most works regarding 3D rotations rely on the matrix Fisher distribution (Khatri and Mardia 1977; Jupp and Mardia 1979; Prentice 1986; Mardia and Jupp 2000; Rancourt et al. 2000), which was introduced by Downs (1972). León et al. (2006) later introduced the Cayley distribution for 3D rotations. Recognizing the limitations of existing distributions for 3D rotations, Bingham et al. (2009a) developed the Uniform Axis-Random Spin (UARS) class of distributions. While the UARS class provides flexibility in modeling 3D rotations that was not available before its development, applications may still exist where it is difficult to identify an appropriate member of the UARS class for modeling (see Section 5 for particular examples). In such cases, using nonparametric methods for estimation may be advantageous. This paper focuses specifically on estimation of the spread in 3D rotations, and two distributions for 3D rotations are studied primarily: the matrix Fisher distribution and the von Mises version of the UARS class. The next section gives an overview of these two distributions.
Overview of distributions for 3D rotations
The matrix Fisher distribution is the most commonly used distribution for 3D rotations (Khatri and Mardia 1977; Jupp and Mardia 1979) and the symmetric version of it is a member of the UARS class developed by Bingham et al. (2009a). Therefore, the following discussion of distributions for 3D rotations is facilitated through the UARS class.
A random rotation O ∈ SO(3), where SO(3) denotes the set of all 3×3 rotation matrices, from a UARS distribution with center at matrix S can be written as O = SP, where
$$\mathbf{P} = \mathbf{U}\mathbf{U}^{T} + \left(\mathbf{I}_{3\times 3} - \mathbf{U}\mathbf{U}^{T}\right)\cos r + \begin{pmatrix} 0 & -u_{3} & u_{2} \\ u_{3} & 0 & -u_{1} \\ -u_{2} & u_{1} & 0 \end{pmatrix}\sin r$$
is obtained by rotating the 3×3 identity matrix, $\mathbf{I}_{3\times 3}$, about an axis $\mathbf {U}=(u_{1},u_{2},u_{3})^{T} \in \mathbb {R}^{3}$ by a random angle r ∈ (−π,π]. Here U is uniformly distributed on the unit sphere and r follows some circular distribution that is symmetric about 0 with spread depending on parameter κ. The parameter κ is called a concentration parameter, with larger values indicating less spread in the rotations.
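The construction O = SP can be sketched in a few lines of Python. This is a minimal illustration, assuming Rodrigues' rotation formula for the spin about U; the function names `rotation_from_axis_angle` and `uars_sample`, the placeholder angles, and the seed are all hypothetical choices, not taken from the paper.

```python
import numpy as np

def rotation_from_axis_angle(u, r):
    """Rotate the identity about axis u by angle r (Rodrigues' formula)."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)                 # make sure the axis has unit length
    cross = np.array([[0.0, -u[2], u[1]],
                      [u[2], 0.0, -u[0]],
                      [-u[1], u[0], 0.0]])    # skew-symmetric matrix built from u
    uuT = np.outer(u, u)
    return uuT + (np.eye(3) - uuT) * np.cos(r) + cross * np.sin(r)

def uars_sample(S, angles, rng):
    """Return O_i = S P_i with uniform random axes and the supplied spin angles."""
    angles = np.asarray(angles, dtype=float)
    # Normalized standard normal vectors are uniform on the unit sphere.
    axes = rng.standard_normal((angles.size, 3))
    return np.stack([S @ rotation_from_axis_angle(u, r)
                     for u, r in zip(axes, angles)])

rng = np.random.default_rng(0)
S = rotation_from_axis_angle([1.0, 0.0, 0.0], 0.5)   # an arbitrary center
angles = rng.uniform(-0.3, 0.3, size=5)              # placeholder symmetric angles
sample = uars_sample(S, angles, rng)
```

Any circular distribution that is symmetric about 0 could be substituted for the placeholder angles.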
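Suppose the circular distribution for r ∈ (−π,π] has density C(r|κ). Then a matrix density for O ∼ UARS(S,κ) is given by
$$f(\mathbf{O}|\mathbf{S},\kappa) = \frac{4\pi}{3 - \text{trace}\left(\mathbf{S}^{T}\mathbf{O}\right)}\, C\left(\arccos\left[\frac{\text{trace}\left(\mathbf{S}^{T}\mathbf{O}\right) - 1}{2}\right] \Big|\, \kappa\right) \qquad (1)$$
with respect to the Haar measure (Bingham et al. 2009a).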
A particular distribution for 3D rotations is obtained by specifying the circular density C(r|κ). For the matrix Fisher distribution,
$$C(r|\kappa) = \frac{1}{2\pi\left[I_{0}(2\kappa) - I_{1}(2\kappa)\right]} \exp(2\kappa \cos r)\,(1 - \cos r), \quad r \in (-\pi,\pi], \qquad (2)$$
where $I_{i}$ denotes the modified Bessel function of order i. For large κ, this density is approximately the Maxwell-Boltzmann density with scale parameter $1/\sqrt {2\kappa }$ (Bingham et al. 2010).
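A short sketch of why the Maxwell-Boltzmann form appears: the kernel of the matrix Fisher circular density is $\exp(2\kappa \cos r)(1-\cos r)$, and for large κ the mass concentrates near r = 0, where $\cos r \approx 1 - r^{2}/2$. The scale symbol a below is introduced only for this illustration.

$$\exp(2\kappa \cos r)\,(1-\cos r) \;\approx\; e^{2\kappa}\, e^{-\kappa r^{2}}\, \frac{r^{2}}{2} \;\propto\; r^{2} \exp\left(-\frac{r^{2}}{2a^{2}}\right), \qquad a = \frac{1}{\sqrt{2\kappa}}.$$

The right-hand side is the kernel of a Maxwell-Boltzmann density in |r| with scale a, which matches the scale parameter $1/\sqrt{2\kappa}$ quoted in the text.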
The von Mises circular distribution is one of the most commonly used circular distributions because it approaches the normal distribution as κ→∞ (Fisher 1996). The von Mises distribution has density
$$C(r|\kappa) = \frac{1}{2\pi I_{0}(\kappa)} \exp(\kappa \cos r), \quad r \in (-\pi,\pi], \qquad (3)$$
and corresponds to the von Mises version of the UARS class (vM-UARS) studied by Bingham et al. (2009a) and Bingham et al. (2009b).
In the next section, a possible measure of spread for 3D rotations is introduced and the steps for obtaining a bootstrap confidence interval are outlined. The matrix Fisher and vM-UARS distributions are then used in a simulation study in Section 4 to compare the bootstrapping technique to existing parametric approaches in the literature.
Quantifying spread: the average misorientation angle and bootstrapping
In the literature, quantifying spread is typically tied to a specific distribution through estimation of the spread parameter for that distribution. For example, spread in the matrix Fisher distribution is considered by estimating the spread parameter κ. As such, a broadly defined point estimate for spread in 3D rotations that can be used regardless of distribution is needed. Suppose that $\mathbf{O}_{1},\ldots,\mathbf{O}_{n} \in SO(3)$. The mean rotation, M, is a commonly used measure of center (Khatri and Mardia 1977; León et al. 2006; Bingham et al. 2009a), defined as the rotation that maximizes $\text {trace} \left (\mathbf {M}^{T} \bar {\mathbf {O}}\right)$, where $\bar {\mathbf {O}}=\frac {1}{n}\sum _{i=1}^{n} \mathbf {O}_{i}$. The mean rotation M can be found by using M = VW, where $\bar {\mathbf {O}}=\mathbf {V}\boldsymbol {\Sigma }\mathbf {W}$ is the singular value decomposition of $\bar {\mathbf {O}}$. The distance between each rotation $\mathbf{O}_{i}$, i=1,…,n, in a data set and the mean rotation M can be measured by the misorientation angle (in radians) between the two rotations, calculated as
$$\operatorname{mis}(\mathbf{O}_{i},\mathbf{M}) = \arccos\left[\frac{\text{trace}\left(\mathbf{O}_{i}'\mathbf{M}\right) - 1}{2}\right], \qquad (4)$$
where $\mathbf{O}_{i}'$ is the transpose of $\mathbf{O}_{i}$. This misorientation angle is the smallest angle of rotation needed to get from $\mathbf{O}_{i}$ to M via a spin about some axis. The overall spread in the data set $\mathbf{O}_{1},\ldots,\mathbf{O}_{n}$ can then be taken to be the average misorientation angle (AMA), $\frac {1}{n} \sum _{i=1}^{n} \operatorname {mis}(\mathbf {O}_{i},\mathbf {M})$. For illustration purposes, Fig. 1 shows two different 3D rotation data sets of size n=100 with (a) AMA = 0.2013 and (b) AMA = 0.0595. Here the 3D rotations are plotted as points on the sphere, with one observation represented by three points that correspond to three orthogonal axes.
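The sample AMA can be sketched as follows, assuming the rotations are stored as a NumPy array of shape (n, 3, 3). The function names are hypothetical, and the determinant check in `mean_rotation` is an added safeguard (not described in the text) that keeps the SVD-based estimate inside SO(3).

```python
import numpy as np

def mean_rotation(rotations):
    """Rotation M maximizing trace(M^T O_bar), computed from the SVD of O_bar."""
    O_bar = np.mean(rotations, axis=0)
    V, _, Wt = np.linalg.svd(O_bar)          # O_bar = V @ diag(s) @ Wt
    M = V @ Wt
    if np.linalg.det(M) < 0:                 # safeguard: flip to a proper rotation
        V[:, -1] *= -1.0
        M = V @ Wt
    return M

def misorientation_angle(O, M):
    """Smallest rotation angle (radians) taking O to M."""
    c = (np.trace(O.T @ M) - 1.0) / 2.0
    return float(np.arccos(np.clip(c, -1.0, 1.0)))   # clip guards rounding error

def average_misorientation_angle(rotations):
    """Mean misorientation angle between each rotation and the mean rotation."""
    M = mean_rotation(rotations)
    return float(np.mean([misorientation_angle(O, M) for O in rotations]))

def rot_z(a):
    """Toy helper: rotation about the z-axis by angle a."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

example = np.array([rot_z(a) for a in (-0.2, 0.0, 0.2)])
example_ama = average_misorientation_angle(example)
print(example_ama)   # about 0.4 / 3, because the mean rotation is the identity here
```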
While (4) is used to calculate the AMA from a sample, the population AMA for a given UARS distribution can be obtained by considering the circular density, C(r|κ). Since C(r|κ), r ∈ (−π,π], is symmetric about 0, E(r)=0. However, we can think of all angles, r, as being positive, since a spin of r about a vector V is equivalent to a spin of −r about vector −V. Therefore, we can find the population AMA for a particular UARS distribution with given κ and C(r|κ) as
$$\text{AMA} = E|r| = \int_{-\pi}^{\pi} |r|\, C(r|\kappa)\, dr = 2\int_{0}^{\pi} r\, C(r|\kappa)\, dr. \qquad (5)$$
For the choices of C(r|κ) in (2) and (3), this integral does not have a closed form.
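Because the population AMA has no closed form for these two densities, it can be approximated numerically. The sketch below uses a plain midpoint rule on [0, π] and normalizes the kernel numerically, so no Bessel functions are needed; the kernels are rescaled by exp(−κ) or exp(−2κ) to avoid overflow at large κ. The function name, family labels, and grid size are illustrative assumptions, not the author's implementation.

```python
import math

def density_kernel(r, kappa, family):
    """Unnormalized circular density, rescaled so the exponent is never positive."""
    if family == "matrix_fisher":
        return math.exp(2.0 * kappa * (math.cos(r) - 1.0)) * (1.0 - math.cos(r))
    if family == "von_mises":
        return math.exp(kappa * (math.cos(r) - 1.0))
    raise ValueError("family must be 'matrix_fisher' or 'von_mises'")

def population_ama(kappa, family, n_grid=20000):
    """Approximate E|r| as a ratio of midpoint-rule sums over [0, pi]."""
    h = math.pi / n_grid
    numerator = 0.0
    denominator = 0.0
    for j in range(n_grid):
        r = (j + 0.5) * h                     # midpoint of the j-th cell
        w = density_kernel(r, kappa, family)
        numerator += r * w
        denominator += w
    # By symmetry about 0, E|r| is the mean of r under the density restricted to [0, pi].
    return numerator / denominator

for family in ("matrix_fisher", "von_mises"):
    for kappa in (1, 5, 20, 500):
        print(family, kappa, round(population_ama(kappa, family), 4))
```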
Now that the AMA has been introduced for quantifying the spread in a 3D rotation data set, bootstrapping techniques are discussed. Although bootstrapping has been used to create confidence regions in a wide variety of settings, including analyzing directional data such as p-dimensional unit vectors (Fisher and Hall 1989), its application to 3D rotation data is limited. Recent works have focused on estimation of the central rotation (Will and Bingham 2015; Stanfil et al. 2015), but no attention has been given to using bootstrapping for estimating spread. The steps for the bootstrapping technique follow.

1. Resample from the 3D rotation data set $\mathbf{O}_{1},\ldots,\mathbf{O}_{n}$ with replacement.
2. Calculate the AMA for the bootstrap sample obtained in step 1.
3. Repeat steps 1 and 2 a large number (say 1000) of times.
After the AMA values are obtained for each bootstrap sample, this set can be used to construct a confidence interval. Under the bootstrap percentile method, a 95% confidence interval is obtained by using the 2.5th and 97.5th percentiles as confidence bounds. Other methods, such as the central percentile method and the bias-corrected method (Efron 1987), exist for obtaining confidence intervals from bootstrap samples, but these techniques showed no improvement over the simpler percentile method in the simulations that were conducted. In the next section, a simulation study is used to compare bootstrapping for the AMA to existing parametric methods for quantifying spread.
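The three steps and the percentile interval can be sketched as below. This is a minimal illustration under stated assumptions: rotations are stored as an (n, 3, 3) NumPy array, the names `ama` and `bootstrap_percentile_ci` are hypothetical, and the toy data are rotations about a single axis generated only to exercise the code.

```python
import numpy as np

def ama(rotations):
    """Average misorientation angle about the SVD-based mean rotation."""
    V, _, Wt = np.linalg.svd(rotations.mean(axis=0))
    # The diagonal factor keeps M a proper rotation (an added safeguard).
    M = V @ np.diag([1.0, 1.0, np.linalg.det(V @ Wt)]) @ Wt
    traces = np.einsum("nij,ij->n", rotations, M)      # trace(O_i^T M) for each i
    return float(np.mean(np.arccos(np.clip((traces - 1.0) / 2.0, -1.0, 1.0))))

def bootstrap_percentile_ci(data, statistic, n_boot=1000, level=0.95, rng=None):
    """Percentile bootstrap interval for statistic(data)."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data)
    n = len(data)
    stats = np.empty(n_boot)
    for b in range(n_boot):                 # step 3: repeat many times
        idx = rng.integers(0, n, size=n)    # step 1: resample with replacement
        stats[b] = statistic(data[idx])     # step 2: AMA of the bootstrap sample
    alpha = 1.0 - level
    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)

def rot_z(a):
    """Toy helper: rotation about the z-axis by angle a."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(1)
toy_data = np.array([rot_z(a) for a in rng.normal(0.0, 0.1, size=30)])
lower, upper = bootstrap_percentile_ci(toy_data, ama, n_boot=1000, rng=rng)
print(lower, upper)
```

Passing the statistic as a function keeps the resampling loop separate from the AMA calculation, so another spread measure could be swapped in without changing the bootstrap code.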
Comparison of nonparametric and parametric methods via simulation
Simulations were conducted using both the matrix Fisher and vM-UARS distributions. The values of κ used were 1, 5, 20, and 500, which give a broad range of spread from very concentrated (κ=500) to very spread (κ=1). Although existing works that use parametric methods to quantify spread focus on the parameter κ (Bingham et al. 2009b; Bingham et al. 2010), the bootstrapping considered here uses the AMA. Therefore, we convert between κ and the AMA. The AMA values corresponding to each distribution and choice of κ were found using (5) and are displayed in Table 1.
For all simulations the population central matrix S from (1) was set at the identity matrix, since the center used does not influence the results when estimating spread. Further, sample sizes of n = 10, 30, and 100 were used. For each distribution and (κ,n) pair, 1000 different data sets were simulated. For each data set, 1000 bootstrap replications were used, resulting in a bootstrap percentile confidence interval for the AMA. The proportion of the 1000 intervals that contained the population AMA is given in Tables 2 and 3 as the nonparametric coverage rate. The median width of the 1000 intervals is also given in these tables. In addition to the width being reported in terms of the AMA, the endpoints of each interval were put in terms of κ by solving the function in (5) for κ given the AMA. The median width in terms of κ was then also found. This conversion was done to allow for direct comparison to parametric methods discussed in the literature. Parametric results from other works are also provided in Tables 2 and 3 for ease of comparison between the nonparametric and parametric methods. The parametric results provided in Table 2 come from Bingham et al. (2010, Tables 3 and 4, pages 1324–1325), and the parametric results provided in Table 3 come from Bingham et al. (2009b, Tables 3 and 4, pages 616–617). The parametric results in these works were obtained by inversion of the likelihood ratio test (LRT).
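The conversion of an AMA value back to κ can be sketched with a simple bisection on a numerical version of (5). This is one possible approach, not necessarily the solver used for the reported results; it assumes the population AMA decreases monotonically in κ, and the function names, bracket [0.01, 5000], grid size, and example endpoints are all illustrative.

```python
import math

def population_ama(kappa, family, n_grid=8000):
    """Midpoint-rule approximation of E|r| with overflow-safe rescaled kernels."""
    h = math.pi / n_grid
    num = den = 0.0
    for j in range(n_grid):
        r = (j + 0.5) * h
        if family == "von_mises":
            w = math.exp(kappa * (math.cos(r) - 1.0))
        elif family == "matrix_fisher":
            w = math.exp(2.0 * kappa * (math.cos(r) - 1.0)) * (1.0 - math.cos(r))
        else:
            raise ValueError("unknown family")
        num += r * w
        den += w
    return num / den

def kappa_from_ama(ama, family, lo=1e-2, hi=5e3, tol=1e-8):
    """Solve population_ama(kappa) = ama for kappa by bisection on a log scale."""
    if not population_ama(hi, family) <= ama <= population_ama(lo, family):
        raise ValueError("AMA lies outside the bracketed range of kappa")
    while hi / lo > 1.0 + tol:
        mid = math.sqrt(lo * hi)              # geometric midpoint
        if population_ama(mid, family) > ama:
            lo = mid                          # too much spread: kappa must be larger
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Made-up AMA interval endpoints, used only to show the endpoint swap.
ama_lower, ama_upper = 0.15, 0.25
kappa_interval = (kappa_from_ama(ama_upper, "matrix_fisher"),
                  kappa_from_ama(ama_lower, "matrix_fisher"))
print(kappa_interval)
```

Because larger κ means less spread, the upper AMA endpoint maps to the lower κ endpoint and vice versa.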
When considering the results for the matrix Fisher distribution provided in Table 2, we see that the coverage rates obtained through bootstrapping fail to reach nominal levels for the smaller sample sizes of n=10 and 30, but do well when n=100. The parametric approach produces coverage rates that are slightly too low (around 91–92%) when n=10, with coverage rates around 95% for larger samples. When considering the results for the vM-UARS distribution provided in Table 3, we see that the nonparametric rates are too small only when n=10, while the coverage rates for the parametric approach fluctuate around 95% regardless of sample size. For both the matrix Fisher and vM-UARS distributions, the bootstrap intervals are slightly wider than the LRT intervals for small n, but comparable for larger sample sizes. Overall, the parametric methods outperform the nonparametric bootstrap for small sample sizes when the distributional assumptions are met (i.e., the correct distribution is fit to the data it was simulated from), but both perform as desired for larger samples.
Although the parametric methods perform well when the correct distribution is applied, this is not the case when data are modeled incorrectly. In addition to obtaining the bootstrap results, the LRT method of Bingham et al. (2010) was used to fit the matrix Fisher distribution to each of the vM-UARS samples. Coverage rates and median widths of the intervals for the AMA are given in Table 4. It can be seen that coverage rates are not near the desired 95% for any of the cases, with rates as low as 25% for large n. Only when the data are extremely spread (κ=1) are the rates moderately large. Therefore, bootstrapping far outperforms the parametric methods when distributional assumptions are not met.
In summary, for situations with small sample sizes, parametric methods are preferred to the bootstrap, provided that goodness-of-fit is investigated ahead of time to ensure the correct model is being used. See Bingham et al. (2009a) for a way to investigate goodness-of-fit for 3D rotations through QQ plots. For larger sample sizes, the nonparametric methods perform just as well as the parametric methods when the correct distribution is used for modeling, with far better performance when an incorrect distribution is applied. Although the extensive simulations reported here focus on only two distributions for 3D rotations, initial analysis was done using the Cayley distribution of León et al. (2006) and the Wrapped Trivariate Normal distribution of Qiu et al. (2014), with similar findings.
Application to electron backscatter diffraction data
Three-dimensional rotation data commonly arise in materials science when exploring the texture of a specimen of some polycrystalline material, such as metal. Through electron backscatter diffraction (EBSD), a fixed beam of electrons is diffracted off the (crystallographic) lattice planes of the polycrystalline specimen. The resulting images reveal information about texture or crystallographic preferred orientation (Randle 2003). An area of interest with regard to EBSD measurements is precision, as methods used for quantifying EBSD precision in the materials science literature are not standard. Bingham et al. (2009a) study precision in nickel and aluminum specimens by using parametric LRT methods, after showing that the vM-UARS distribution provides an adequate fit to the data sets at hand. However, since the paper by Bingham et al. (2009a) focused more on development of the UARS class than on the EBSD application, the data sets considered make up only a small subset of all EBSD data actually collected. When considering the larger data set of over 4,000 observations from a single scan on the nickel specimen, there are instances in which it is hard to find an adequate distributional fit to some subsets of the data.
When using EBSD, orientations close in proximity are classified as composing a grain when the misorientation angle between them is small, so that a grain is thought of as a homogeneous piece of material that produces observations which generally share a common orientation. Five different grains were isolated from the nickel specimen, giving samples of various sizes. Though a crystallographic orientation is not generally a single rotation but a coset of SO(3) (i.e., a set of crystal-symmetrically equivalent rotations), the data sets considered here have been preprocessed to eliminate the 24-fold ambiguity. It is also worth noting that the rotations will be considered as independent, when in fact there may be spatially induced dependence.
For each of the five samples, attempts were made to find an adequate distributional fit by examining QQ plots. The misorientation angles were extracted from each data set and various circular distributions were fit to the data (i.e., various forms of C(r|κ) from (1) were tried). The theoretical quantiles were plotted against the sample misorientation angles, and these QQ plots did not show any good distributional matches. Figure 2 gives two of the QQ plots for a grain of size n=38. It can be seen that neither the matrix Fisher nor the vM-UARS distribution seems to fit these data well. Note that these were just two of several distributions that were explored.
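The quantile pairs behind such a QQ plot can be sketched as follows. This illustration assumes κ has already been estimated by some other means, uses plotting positions (i − 0.5)/n, and inverts a numerically accumulated distribution function of |r|; none of these choices is stated in the text, and the function names and synthetic angles are hypothetical.

```python
import numpy as np

def abs_angle_quantiles(probs, kappa, family, n_grid=20000):
    """Quantiles of |r| under the chosen circular density, by numerical inversion."""
    h = np.pi / n_grid
    r_mid = (np.arange(n_grid) + 0.5) * h
    if family == "von_mises":
        w = np.exp(kappa * (np.cos(r_mid) - 1.0))
    elif family == "matrix_fisher":
        w = np.exp(2.0 * kappa * (np.cos(r_mid) - 1.0)) * (1.0 - np.cos(r_mid))
    else:
        raise ValueError("unknown family")
    cdf = np.cumsum(w) / np.sum(w)        # distribution function at each cell's right edge
    right_edges = r_mid + h / 2.0
    return np.interp(probs, cdf, right_edges)

def qq_pairs(sample_angles, kappa, family):
    """Return (theoretical quantiles, sorted observed |angles|) for a QQ plot."""
    observed = np.sort(np.abs(np.asarray(sample_angles, dtype=float)))
    n = observed.size
    probs = (np.arange(1, n + 1) - 0.5) / n      # assumed plotting positions
    return abs_angle_quantiles(probs, kappa, family), observed

rng = np.random.default_rng(2)
synthetic_angles = rng.vonmises(0.0, 20.0, size=38)   # synthetic, not the EBSD data
theoretical, observed = qq_pairs(synthetic_angles, 20.0, "von_mises")
```

Plotting `theoretical` against `observed` and comparing with the 45-degree line gives the visual check; systematic curvature would suggest the candidate density does not fit.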
Given that appropriate distributional forms for the five grains in consideration could not be found, it is best to quantify spread in these data sets through nonparametric methods. The bootstrap procedure of Section 3 was used on each of these data sets, resulting in a confidence interval for the AMA. These intervals are reported in both radians and degrees in Table 5. The within-grain precision estimates found here are comparable to the 0.5°–1° range reported in the literature (Demirel et al. 2000; Wilson and Spanos 2001; Bingham et al. 2009a).
Conclusion
While there have been recent advances in distributional theory and parametric inference for 3D rotation data, instances still exist where it may be difficult to find an appropriate model to fit to a data set. As shown in Section 4, coverage rates for confidence intervals for estimating spread can fall below nominal levels when incorrect distributions are used. In such cases, nonparametric methods such as bootstrapping can provide a good alternative. However, for small samples (around n=10), applying the correct parametric procedure is better than using bootstrapping, provided that a good distributional fit can be found.
References
Bingham, MA, Nordman, DJ, Vardeman, SB: Modeling and inference for measured crystal orientations and a tractable class of symmetric distributions for rotations in three dimensions. J. Am. Stat. Assoc. 104, 1385–1397 (2009a).
Bingham, MA, Vardeman, SB, Nordman, DJ: Bayes one-sample and one-way random effects analyses for 3D orientations with application to materials science. Bayesian Anal. 4, 607–630 (2009b).
Bingham, MA, Nordman, DJ, Vardeman, SB: Finite-sample investigation of likelihood and Bayes inference for the symmetric von Mises-Fisher distribution. Comput. Stat. Data Anal. 54, 1317–1327 (2010).
Demirel, MC, El-Dasher, BS, Adams, BL, Rollett, AD: Studies on the accuracy of electron backscatter diffraction measurements. In: Schwartz, AJ, Mukul, K, Adams, BL (eds.) Electron Backscatter Diffraction in Materials Science. Kluwer Academic/Plenum Publishers, New York (2000).
Downs, TD: Orientation statistics. Biometrika. 59, 665–676 (1972).
Efron, B: Better bootstrap confidence intervals. J. Am. Stat. Assoc. 82, 171–185 (1987).
Fisher, NI, Hall, P: Bootstrap confidence regions for directional data. J. Am. Stat. Assoc. 84, 996–1002 (1989).
Fisher, NI: Statistical analysis of circular data. Cambridge University Press, New York (1996).
Will, LK, Bingham, M: Bootstrap techniques for measures of center for three-dimensional rotation data. To appear in Involve (2015).
Haddou, M, Rivest, LP, Pierrynowski, M: A nonlinear mixed effects directional model for the estimation of the rotation axes of the human ankle. Ann. Appl. Stat. 4, 1892–1912 (2010).
Jupp, PE, Mardia, KV: Maximum likelihood estimators for the matrix von Mises-Fisher and Bingham distributions. Ann. Stat. 7, 599–606 (1979).
Khatri, CG, Mardia, KV: The von Mises-Fisher matrix distribution in orientation statistics. J. R. Stat. Soc. Ser. B. 39, 95–106 (1977).
León, CA, Massé, JC, Rivest, LP: A statistical model for random rotations. J. Multivariate Anal. 97, 412–430 (2006).
Mardia, KV, Jupp, PE: Directional Statistics. Wiley, Chichester and New York (2000).
Prentice, MJ: Orientation statistics without parametric assumptions. J. R. Stat. Soc. Ser. B. 48, 214–222 (1986).
Qiu, Y, Nordman, DJ, Vardeman, SB: A wrapped trivariate normal distribution and Bayes inference for 3D rotations. Stat. Sinica. 24, 897–917 (2014).
Rancourt, D, Rivest, LP, Asselin, J: Using orientation statistics to investigate variations in human kinematics. J. R. Stat. Soc. Ser. C. 49, 81–94 (2000).
Randle, V: Microtexture determination and its applications. Maney for the Institute of Materials, Minerals and Mining, London (2003).
Rivest, LP, Baillargeon, S, Pierrynowski, M: A directional model for the estimation of the rotation axes of the ankle joint. J. Am. Stat. Assoc. 103, 1060–1069 (2008).
Stanfil, B, Genschel, U, Hofmann, H, Nordman, D: Nonparametric confidence regions for the central orientation of random rotations. J. Multivariate Anal. 135, 106–116 (2015).
Wilson, AW, Spanos, G: Application of orientation imaging microscopy to study phase transformations in steels. Mater. Charact. 46, 407–418 (2001).
Acknowledgments
This research was supported by NSF grant DMS1104409.
Competing interests
The author declares that she has no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Mathematics Subject Classification (MSC)
 62G09
 62G05
 62G15
Keywords
 3D rotations
 Bootstrapping
 Matrix Fisher distribution
 UARS distributions