# Quantifying spread in three-dimensional rotation data: comparison of nonparametric and parametric techniques

- Melissa A. Bingham

**2**:9

https://doi.org/10.1186/s40488-015-0032-x

© Bingham. 2015

**Received:** 27 July 2015

**Accepted:** 28 September 2015

**Published:** 6 October 2015

## Abstract

While there have been recent advances in distributional theory and parametric inference for 3-D rotation data, applications still exist where it is difficult to identify an appropriate distribution for modeling. In these instances, nonparametric inference may be preferred. In this paper, a measure of spread for 3-D rotation data, called the average misorientation angle, is introduced and bootstrapping is developed for this measure. Existing parametric inference methods for estimating spread in 3-D rotations are compared to the bootstrapping procedure through a simulation study. The bootstrapping technique is then used in a materials science application where existing distributions do not appear to provide an adequate fit.

## Keywords

## Mathematics Subject Classification (MSC)

## Introduction

Data in the form of three-dimensional rotations are common in the areas of materials science (e.g., crystal orientations in metals, Demirel et al. 2000; Wilson and Spanos 2001) and human kinematics (Rancourt et al. 2000; Rivest et al. 2008; Haddou et al. 2010). Despite the prevalence of 3-D rotations in such fields, distributional developments for such data have remained rather stagnant over the years. Most works regarding 3-D rotations rely on the matrix Fisher distribution (Khatri and Mardia 1977; Jupp and Mardia 1979; Prentice 1986; Mardia and Jupp 2000; Rancourt et al. 2000), which was introduced by Downs (1972). León et al. (2006) introduced the Cayley distribution for 3-D rotations several years later. Recognizing the limitations of existing distributions for 3-D rotations, Bingham et al. (2009a) developed the Uniform Axis-Random Spin (UARS) class of distributions. While the UARS class provides flexibility in modeling 3-D rotations that was not available before its development, applications may still exist where it is difficult to identify an appropriate member of the UARS class for modeling (see Section 5 for particular examples). In such cases, using nonparametric methods for estimation may be advantageous. This paper focuses specifically on estimating the spread in 3-D rotations and primarily studies two distributions: the matrix Fisher distribution and the von Mises version of the UARS class. The next section gives an overview of these two distributions.

## Overview of distributions for 3-D rotations

The matrix Fisher distribution is the most commonly used distribution for 3-D rotations (Khatri and Mardia 1977; Jupp and Mardia 1979) and the symmetric version of it is a member of the UARS class developed by Bingham et al. (2009a). Therefore, the following discussion of distributions for 3-D rotations is facilitated through the UARS class.

A rotation **O** ∈ SO(3), where SO(3) denotes the set of all 3×3 orthogonal rotation matrices, from a UARS distribution with center at matrix **S** can be written as **O** = **SP**, where **P** is obtained by rotating the identity matrix, \(\mathbf {I}_{3 \times 3}\), about an axis \(\mathbf {U}=(u_{1},u_{2},u_{3})^{T} \in \mathbb {R}^{3}\) by a random angle *r* ∈ (−*π*, *π*]. Here **U** is uniformly distributed on the unit sphere and *r* follows some circular distribution that is symmetric about 0 with spread depending on parameter *κ*. The parameter *κ* is called a concentration parameter, with larger values indicating less spread in the rotations.
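The construction **O** = **SP** can be sketched in a few lines of NumPy. This is only an illustrative sketch, not code from the paper: the function names (`rotation_about_axis`, `random_unit_axis`, `uars_draw`) are hypothetical, the spin matrix is built with Rodrigues' formula, and the angle *r* is passed in as a fixed number rather than drawn from a particular circular distribution.

```python
import numpy as np

def rotation_about_axis(u, r):
    """Rotation matrix for a spin of angle r (radians) about unit axis u (Rodrigues' formula)."""
    u = np.asarray(u, dtype=float)
    K = np.array([[0.0, -u[2], u[1]],
                  [u[2], 0.0, -u[0]],
                  [-u[1], u[0], 0.0]])  # cross-product matrix of u
    return np.eye(3) + np.sin(r) * K + (1.0 - np.cos(r)) * (K @ K)

def random_unit_axis(rng):
    """Draw U uniformly on the unit sphere by normalizing a standard normal vector."""
    v = rng.standard_normal(3)
    return v / np.linalg.norm(v)

def uars_draw(S, r, rng):
    """Return O = S P, where P spins the identity by angle r about a uniform random axis."""
    P = rotation_about_axis(random_unit_axis(rng), r)
    return S @ P

rng = np.random.default_rng(0)   # seed chosen arbitrarily for reproducibility
S = np.eye(3)                    # center at the identity (an illustrative choice)
O = uars_draw(S, r=0.3, rng=rng)

# The spin angle can be recovered from the trace: trace(S^T O) = 1 + 2 cos(r)
recovered = np.arccos((np.trace(S.T @ O) - 1.0) / 2.0)
```

To simulate from a specific UARS member, the fixed `r` would be replaced by a draw from the chosen circular distribution on (−*π*, *π*].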

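Suppose *r* ∈ (−*π*, *π*] has density *C*(*r*|*κ*). Then a matrix density for **O** ∼ UARS(**S**, *κ*) is given by

\[ f(\mathbf{O} \mid \mathbf{S}, \kappa) = \frac{4\pi}{3 - \text{trace}\left(\mathbf{S}^{T}\mathbf{O}\right)}\, C\left(\arccos\left[\tfrac{1}{2}\left(\text{trace}\left(\mathbf{S}^{T}\mathbf{O}\right) - 1\right)\right] \,\Big|\, \kappa\right) \tag{1} \]

with respect to the Haar measure (Bingham et al. 2009a).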
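A short worked identity may help in reading a density on SO(3) that is expressed through the spin angle. It relies only on standard facts about rotation matrices, not on anything specific to the source. For **O** = **SP** with **P** a spin of angle *r* about a unit axis,

\[ \text{trace}\left(\mathbf{S}^{T}\mathbf{O}\right) = \text{trace}(\mathbf{P}) = 1 + 2\cos r \quad\Longrightarrow\quad |r| = \arccos\left[\tfrac{1}{2}\left(\text{trace}\left(\mathbf{S}^{T}\mathbf{O}\right) - 1\right)\right], \qquad 3 - \text{trace}\left(\mathbf{S}^{T}\mathbf{O}\right) = 2(1-\cos r). \]

Under the uniform (Haar) distribution on SO(3), the spin angle has density \((1-\cos r)/(2\pi)\) on (−*π*, *π*], so dividing *C*(*r*|*κ*) by this quantity gives a density relative to the uniform distribution, namely \(2\pi\, C(r \mid \kappa)/(1-\cos r)\).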

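Different members of the UARS class correspond to different choices of *C*(*r*|*κ*). For the matrix Fisher distribution,

\[ C(r \mid \kappa) = \frac{[1-\cos r]\exp(2\kappa\cos r)}{2\pi\left[I_{0}(2\kappa) - I_{1}(2\kappa)\right]}, \quad r \in (-\pi,\pi], \tag{2} \]

where \(I_{i}\) denotes the modified Bessel function of order *i*. For large *κ*, this density is approximately the Maxwell-Boltzmann density with scale parameter \(1/\sqrt {2\kappa }\) (Bingham et al. 2010).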
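The Maxwell-Boltzmann approximation can be motivated with a brief sketch, assuming the matrix Fisher spin-angle density is proportional to \(\exp(2\kappa\cos r)(1-\cos r)\). For large *κ* the mass concentrates near *r* = 0, where \(\cos r \approx 1 - r^{2}/2\), so

\[ \exp(2\kappa\cos r)\,(1-\cos r) \;\approx\; e^{2\kappa}\,\frac{r^{2}}{2}\,\exp\left(-\kappa r^{2}\right) \;\propto\; r^{2}\exp\left(-\frac{r^{2}}{2a^{2}}\right), \qquad a = \frac{1}{\sqrt{2\kappa}}, \]

which is the kernel of the Maxwell-Boltzmann density for |*r*| with scale *a*. Under this approximation the mean of |*r*| is \(2a\sqrt{2/\pi} = 2/\sqrt{\pi\kappa}\), roughly 0.0505 when *κ* = 500.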

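Another choice for *C*(*r*|*κ*) is the von Mises circular distribution, which is approximately normal with variance 1/*κ* as *κ* → ∞ (Fisher 1996). The von Mises distribution has density

\[ C(r \mid \kappa) = \frac{\exp(\kappa\cos r)}{2\pi I_{0}(\kappa)}, \quad r \in (-\pi,\pi], \tag{3} \]

and corresponds to the von Mises version of the UARS class (vM-UARS) studied by Bingham et al. (2009a) and Bingham et al. (2009b).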

In the next section, a possible measure of spread for 3-D rotations is introduced and the steps for obtaining a bootstrap confidence interval are outlined. The matrix Fisher and vM-UARS distributions are then used in a simulation study in Section 4 to compare the bootstrapping technique to existing parametric approaches in the literature.

## Quantifying spread: the average misorientation angle and bootstrapping

Parametric approaches quantify spread through the concentration parameter *κ*, whose interpretation depends on the assumed distribution. As such, a broadly defined point estimate for spread in 3-D rotations that can be used regardless of distribution is needed. Suppose that \(\mathbf{O}_{1},\ldots,\mathbf{O}_{n} \in\) SO(3). The mean rotation, **M**, is a commonly used measure of center (Khatri and Mardia 1977; León et al. 2006; Bingham et al. 2009a), defined as the rotation that maximizes \(\text {trace} \left (\mathbf {M}^{T} \bar {\mathbf {O}}\right)\), where \(\bar {\mathbf {O}}=\frac {1}{n}\sum _{i=1}^{n} \mathbf {O}_{i}\). The mean rotation **M** can be found by using **M** = **VW**, where \(\bar {\mathbf {O}}=\mathbf {V}\boldsymbol {\Sigma }\mathbf {W}\) is the singular value decomposition of \(\bar {\mathbf {O}}\). Distance between each rotation \(\mathbf{O}_{i}\), *i* = 1, …, *n*, in a data set and the mean rotation **M** can be measured by the misorientation angle (in radians) between the two rotations, calculated as

\[ \operatorname{mis}(\mathbf{O}_{i},\mathbf{M}) = \arccos\left[\frac{\text{trace}\left(\mathbf{O}_{i}'\mathbf{M}\right) - 1}{2}\right], \tag{4} \]

where \(\mathbf{O}_{i}'\) is the transpose of \(\mathbf{O}_{i}\). This misorientation angle is the smallest angle of rotation needed to get from \(\mathbf{O}_{i}\) to **M** via a spin about some axis. The overall spread in the data set \(\mathbf{O}_{1},\ldots,\mathbf{O}_{n}\) can then be taken to be the average misorientation angle (AMA), \(\frac {1}{n} \sum _{i=1}^{n} \operatorname {mis}(\mathbf {O}_{i},\mathbf {M})\). For illustration purposes, Fig. 1 shows two different 3-D rotation data sets of size *n* = 100 with (a) AMA = 0.2013 and (b) AMA = 0.0595. Here the 3-D rotations are plotted as points on the sphere, with one observation represented by three points that correspond to three orthogonal axes.
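The mean rotation, misorientation angle, and AMA described above can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions rather than the author's implementation: the function names are hypothetical, the determinant check is an added safeguard that is not discussed in the text, and the toy data (spins about the z-axis) are made up purely to give a case that can be checked by hand. Note that `np.linalg.svd` returns the right factor already transposed, which matches the \(\bar{\mathbf{O}} = \mathbf{V}\boldsymbol{\Sigma}\mathbf{W}\) notation.

```python
import numpy as np

def mean_rotation(rotations):
    """Rotation M maximizing trace(M^T O_bar), via the SVD of the sample mean matrix."""
    O_bar = np.mean(rotations, axis=0)
    V, _, W = np.linalg.svd(O_bar)      # O_bar = V @ diag(s) @ W
    M = V @ W
    if np.linalg.det(M) < 0:            # added safeguard: force a proper rotation
        V[:, -1] *= -1.0
        M = V @ W
    return M

def misorientation(O, M):
    """Misorientation angle (radians) between rotations O and M."""
    c = (np.trace(O.T @ M) - 1.0) / 2.0
    return np.arccos(np.clip(c, -1.0, 1.0))   # clip guards against round-off

def average_misorientation_angle(rotations):
    """AMA: mean misorientation angle between each rotation and the mean rotation."""
    M = mean_rotation(rotations)
    return float(np.mean([misorientation(O, M) for O in rotations]))

def rot_z(t):
    """Spin of angle t about the z-axis (used only to build toy data)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Toy data: spins of -0.2, -0.1, 0.1, 0.2 radians about a common axis.
data = np.array([rot_z(t) for t in (-0.2, -0.1, 0.1, 0.2)])
ama = average_misorientation_angle(data)
```

For these symmetric toy data the mean rotation is the identity, so the AMA is the average of the absolute spin angles, 0.15 radians.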

For a UARS distribution, the population AMA is determined by *C*(*r*|*κ*). Since *C*(*r*|*κ*), *r* ∈ (−*π*, *π*], is symmetric about 0, *E*(*r*) = 0. However, we can think of all angles, *r*, as being positive since a spin of *r* about a vector **V** is equivalent to a spin of −*r* about vector −**V**. Therefore, we can find the population AMA for a particular UARS distribution with given *κ* and *C*(*r*|*κ*) as

\[ \text{AMA} = E|r| = 2\int_{0}^{\pi} r\, C(r \mid \kappa)\, dr. \tag{5} \]

In the instances of *C*(*r*|*κ*) in (2) and (3), this does not have a closed form.
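Because the population AMA has no closed form for these two distributions, it can be evaluated numerically. The sketch below is one way to do so with the standard library only. It assumes the spin-angle densities are proportional to \(\exp(2\kappa\cos r)(1-\cos r)\) for the matrix Fisher distribution and \(\exp(\kappa\cos r)\) for the von Mises distribution, and it normalizes numerically so that no Bessel functions are needed. The function names and the choice of Simpson's rule are illustrative, not taken from the paper.

```python
import math

def matrix_fisher_kernel(r, kappa):
    """Unnormalized matrix Fisher angle density, rescaled by exp(-2*kappa) to avoid overflow."""
    return math.exp(2.0 * kappa * (math.cos(r) - 1.0)) * (1.0 - math.cos(r))

def von_mises_kernel(r, kappa):
    """Unnormalized von Mises density, rescaled by exp(-kappa) to avoid overflow."""
    return math.exp(kappa * (math.cos(r) - 1.0))

def simpson(f, a, b, m=20000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def population_ama(kernel, kappa):
    """E|r| as a ratio of integrals over (0, pi), using symmetry of the density about 0."""
    num = simpson(lambda r: r * kernel(r, kappa), 0.0, math.pi)
    den = simpson(lambda r: kernel(r, kappa), 0.0, math.pi)
    return num / den

for kappa in (1.0, 5.0, 20.0, 500.0):
    print(kappa,
          population_ama(matrix_fisher_kernel, kappa),
          population_ama(von_mises_kernel, kappa))
```

Under these assumptions the printed values should be close to those reported in Table 1.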

Although bootstrapping has been developed for *p*-dimensional unit vectors (Fisher and Hall 1989), its application to 3-D rotation data is limited. Recent works have focused on estimation of the central rotation (Will and Bingham 2015; Stanfil et al. 2015), but no attention has been given to using bootstrapping for estimating spread. The steps for the bootstrapping technique follow.

1. Resample from the 3-D rotation data set \(\mathbf{O}_{1},\ldots,\mathbf{O}_{n}\) with replacement.
2. Calculate the AMA for the bootstrap sample obtained in step 1.
3. Repeat steps 1 and 2 a large number (say 1000) of times.
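The three steps, followed by a percentile interval, can be sketched as below. This is a hedged illustration rather than the code used for the paper: `ama` and `bootstrap_ama_interval` are hypothetical names, the mean rotation is computed from the SVD without a determinant check (reasonable only for data that are not extremely spread), and the toy data are spins about a single axis generated just to exercise the function.

```python
import numpy as np

def ama(rotations):
    """Average misorientation angle of an (n, 3, 3) array about its mean rotation."""
    V, _, W = np.linalg.svd(rotations.mean(axis=0))
    M = V @ W                                      # mean rotation (assumes det > 0)
    traces = np.einsum('nij,ij->n', rotations, M)  # trace(O_i^T M) for each i
    return float(np.mean(np.arccos(np.clip((traces - 1.0) / 2.0, -1.0, 1.0))))

def bootstrap_ama_interval(rotations, n_boot=1000, level=0.95, seed=None):
    """Percentile bootstrap confidence interval for the AMA."""
    rng = np.random.default_rng(seed)
    n = len(rotations)
    stats = np.empty(n_boot)
    for b in range(n_boot):                        # step 3: repeat many times
        idx = rng.integers(0, n, size=n)           # step 1: resample with replacement
        stats[b] = ama(rotations[idx])             # step 2: AMA of the bootstrap sample
    alpha = 1.0 - level
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

def rot_z(t):
    """Spin of angle t about the z-axis (toy data only)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(1)                     # arbitrary seed
data = np.array([rot_z(t) for t in rng.normal(0.0, 0.1, size=50)])
lo, hi = bootstrap_ama_interval(data, n_boot=1000, seed=2)
```

Resampling indices rather than matrices keeps each bootstrap sample as a simple array slice, and the `level` argument makes it easy to request intervals other than 95 %.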

After the AMA values are obtained for each bootstrap sample, this set can be used to construct a confidence interval. Under the bootstrap percentile method, a 95 % confidence interval is obtained by using the 2.5th and 97.5th percentiles as confidence bounds. Other methods for obtaining confidence intervals from bootstrap samples exist, such as the central percentile method and the bias-corrected method (Efron 1987), but these techniques showed no improvement over the simpler percentile method in the simulations that were conducted. In the next section, a simulation study is used to compare bootstrapping for the AMA to existing parametric methods for quantifying spread.

## Comparison of nonparametric and parametric methods via simulation

Data were simulated from the matrix Fisher and vM-UARS distributions. The values of *κ* used were 1, 5, 20, and 500, which give a broad range of spread from very concentrated (*κ* = 500) to very spread (*κ* = 1). Although existing works that use parametric methods to quantify spread focus on the parameter *κ* (Bingham et al. 2009b; Bingham et al. 2010), the bootstrapping considered here uses the AMA. Therefore, we will convert between *κ* and the AMA. The AMA values corresponding to each distribution and choice of *κ* were found using (5) and are displayed in Table 1.

Table 1. Population AMA for the matrix Fisher and vM-UARS distributions with the given *κ*

*κ* | Matrix Fisher | vM-UARS
---|---|---
1 | 1.407704 | 0.999947
5 | 0.521488 | 0.375360
20 | 0.254209 | 0.180353
500 | 0.050463 | 0.035697

The central rotation **S** from (1) was set at the identity matrix, since the center used does not influence the results when estimating spread. Further, sample sizes of *n* = 10, 30, and 100 were used. For each distribution and (*κ*, *n*) pair, 1000 different data sets were simulated. For each data set, 1000 bootstrap replications were used, resulting in a bootstrap percentile confidence interval for the AMA. The proportion of the 1000 intervals that contained the population AMA is given in Tables 2 and 3 as the nonparametric coverage rate. The median width of the 1000 intervals is also given in these tables. In addition to the width being reported in terms of the AMA, the endpoints of each interval were put in terms of *κ* by solving the function in (5) for *κ* given the AMA. The median width in terms of *κ* was then also found. This conversion was done to allow for direct comparison to parametric methods discussed in the literature. Parametric results from other works are also provided in Tables 2 and 3 for ease of comparison between the nonparametric and parametric methods. The parametric results provided in Table 2 come from Bingham et al. (2010, Tables 3 and 4, pages 1324–1325), and the parametric results provided in Table 3 come from Bingham et al. (2009b, Tables 3 and 4, pages 616–617). The parametric results in these works were obtained by inversion of the likelihood ratio test (LRT).
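Converting an AMA endpoint to the *κ* scale amounts to a one-dimensional root-finding problem, since the population AMA decreases as *κ* increases. The sketch below illustrates this for the von Mises case with bisection on a logarithmic scale. It is an assumption-laden illustration: the function names, the midpoint integration rule, the bracketing values, and the example interval are all hypothetical and not taken from the paper.

```python
import math

def von_mises_ama(kappa, m=4000):
    """Population AMA E|r| for a von Mises spin angle, by the midpoint rule on (0, pi)."""
    h = math.pi / m
    num = den = 0.0
    for i in range(m):
        r = (i + 0.5) * h
        w = math.exp(kappa * (math.cos(r) - 1.0))  # rescaled kernel, avoids overflow
        num += r * w
        den += w
    return num / den

def kappa_from_ama(target, lo=1e-3, hi=1e4, n_iter=80):
    """Solve von_mises_ama(kappa) = target by bisection; target must lie between
    von_mises_ama(hi) and von_mises_ama(lo)."""
    for _ in range(n_iter):
        mid = math.sqrt(lo * hi)                   # geometric midpoint
        if von_mises_ama(mid) > target:            # too spread: kappa must be larger
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical AMA interval; note that the endpoints swap on the kappa scale.
ama_lo, ama_hi = 0.30, 0.45
kappa_interval = (kappa_from_ama(ama_hi), kappa_from_ama(ama_lo))
```

Because a larger AMA corresponds to a smaller *κ*, the upper AMA endpoint maps to the lower *κ* endpoint, which is why the order is reversed in the last line.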

Table 2. Coverage rates and median interval widths for estimating spread of the matrix Fisher distribution via nonparametric and parametric methods using various (*κ*, *n*) pairs

(*κ*, *n*) | Nonparametric coverage rate (%) | Nonparametric median width (AMA) | Nonparametric median width (*κ*) | Parametric coverage rate (%) | Parametric median width (*κ*)
---|---|---|---|---|---
(1,10) | 78.3 | 0.7061 | 1.3685 | 92.1 | 1.1621
(1,30) | 89.5 | 0.4663 | 0.6709 | 95.0 | 0.6428
(1,100) | 93.7 | 0.2635 | 0.3528 | 94.9 | 0.3475
(5,10) | 82.2 | 0.2594 | 8.3839 | 91.1 | 5.6748
(5,30) | 91.5 | 0.1584 | 3.3620 | 94.4 | 2.9355
(5,100) | 93.2 | 0.0882 | 1.6469 | 94.9 | 1.5676
(20,10) | 84.2 | 0.1257 | 33.4181 | 90.8 | 23.4370
(20,30) | 89.7 | 0.0756 | 13.8188 | 95.1 | 12.0597
(20,100) | 94.2 | 0.0419 | 6.7897 | 94.2 | 6.4390
(500,10) | 83.7 | 0.0251 | 871.3614 | 92.2 | 591.3957
(500,30) | 92.6 | 0.0151 | 346.8934 | 95.8 | 306.2049
(500,100) | 94.2 | 0.0083 | 171.3630 | 95.0 | 162.3764

Table 3. Coverage rates and median interval widths for estimating spread of the vM-UARS distribution via nonparametric and parametric methods using various (*κ*, *n*) pairs

(*κ*, *n*) | Nonparametric coverage rate (%) | Nonparametric median width (AMA) | Nonparametric median width (*κ*) | Parametric coverage rate (%) | Parametric median width (*κ*)
---|---|---|---|---|---
(1,10) | 87.6 | 0.7838 | 2.421 | 94.8 | 2.274
(1,30) | 93.9 | 0.5299 | 1.259 | 94.8 | 1.231
(1,100) | 94.4 | 0.2993 | 0.669 | 95.5 | 0.665
(5,10) | 85.3 | 0.2885 | 12.686 | 94.4 | 9.940
(5,30) | 94.5 | 0.1987 | 5.452 | 94.9 | 4.979
(5,100) | 95.7 | 0.1139 | 2.7565 | 95.3 | 2.599
(20,10) | 87.4 | 0.1379 | 54.665 | 94.0 | 42.09
(20,30) | 93.7 | 0.0939 | 22.913 | 95.5 | 21.36
(20,100) | 94.1 | 0.0532 | 11.543 | 94.6 | 11.22
(500,10) | 88.7 | 0.0278 | 1365.507 | 92.6 | 1075.0
(500,30) | 94.0 | 0.0183 | 598.926 | 95.2 | 536.4
(500,100) | 94.9 | 0.0104 | 299.603 | 94.2 | 283.3

Table 4. Coverage rates and median interval widths for the AMA when fitting the matrix Fisher distribution to vM-UARS data

(*κ*, *n*) | Coverage rate (%) | Median width (AMA)
---|---|---
(1,10) | 83.8 | 0.6305
(1,30) | 82.8 | 0.3730
(1,100) | 71.9 | 0.2072
(5,10) | 67.3 | 0.2178
(5,30) | 55.6 | 0.1285
(5,100) | 25.5 | 0.0706
(20,10) | 69.1 | 0.0992
(20,30) | 54.7 | 0.0602
(20,100) | 24.3 | 0.0333
(500,10) | 66.2 | 0.0203
(500,30) | 53.1 | 0.0118
(500,100) | 21.4 | 0.0066

When considering the results for the matrix Fisher distribution provided in Table 2, we see that the coverage rates obtained through bootstrapping fail to reach nominal levels for the smaller sample sizes of *n*=10 and 30, but do well when *n*=100. The parametric approach produces coverage rates that are slightly too low (around 91–92 %) when *n*=10, with coverage rates around 95 % for larger samples. When considering the results for the vM-UARS distribution provided in Table 3, we see that the nonparametric rates are only too small when *n*=10, while the coverage rates for the parametric approach fluctuate around 95 % regardless of sample size. For both the matrix Fisher and vM-UARS distribution, the bootstrap intervals are slightly wider than the LRT intervals for small *n*, but comparable for larger sample sizes. Overall, the parametric methods outperform the nonparametric bootstrap for small sample sizes when the distributional assumptions are met (i.e. the correct distribution is fit to the data it was simulated from), but both perform as desired for larger samples.

Although the parametric methods perform well when the correct distribution is applied, this is not the case when data are modeled incorrectly. In addition to obtaining the bootstrap results, the LRT method of Bingham et al. (2010) was used to fit the matrix Fisher distribution to each of the vM-UARS samples. Coverage rates and median widths of the intervals for the AMA are given in Table 4. Coverage rates are not near the desired 95 % for any of the cases, and they fall to between 21 % and 26 % when *n*=100 and *κ* ≥ 5. Only when the data are extremely spread (*κ*=1) are the rates moderately large. Therefore, bootstrapping far outperforms the parametric methods when distributional assumptions are not met.

In summary, for situations with small sample sizes, parametric methods are preferred to the bootstrap, provided that goodness-of-fit is investigated ahead of time to ensure the correct model is being used. See Bingham et al. (2009a) for a way to investigate goodness-of-fit for 3-D rotations through Q-Q plots. For larger sample sizes, the nonparametric methods perform just as well as the parametric methods when the correct distribution is used for modeling, with far better performance when an incorrect distribution is applied. Although the extensive simulations reported here focus on only two distributions for 3-D rotations, initial analysis was done using the Cayley distribution of León et al. (2006) and the Wrapped Trivariate Normal distribution of Qiu et al. (2014), with similar findings.

## Application to electron backscatter diffraction data

Three-dimensional rotation data commonly arise in materials science when exploring the texture of a specimen of some polycrystalline material, such as metal. Through electron backscatter diffraction (EBSD), a fixed beam of electrons is diffracted off the (crystallographic) lattice planes of the polycrystalline specimen. The resulting diffraction images reveal information about texture or crystallographic preferred orientation (Randle 2003). An area of interest with regard to EBSD measurements is precision, as methods used for quantifying EBSD precision in the materials science literature are not standard. Bingham et al. (2009a) study precision in nickel and aluminum specimens by using parametric LRT methods, after showing that the vM-UARS distribution provides an adequate fit to the data sets in hand. However, since the focus of the paper by Bingham et al. (2009a) was more on development of the UARS class than on the EBSD application, the data sets considered there make up only a small subset of all EBSD data actually collected. When considering the larger data set of over 4,000 observations from a single scan on the nickel specimen, there are instances in which it is hard to find an adequate distributional fit to some subsets of the data.

When using EBSD, orientations close in proximity are classified as composing a grain when the misorientation angle between them is small, so that a grain is thought of as a homogeneous piece of material that produces observations which generally share a common orientation. Five different grains were isolated from the nickel specimen, giving samples of various sizes. Though crystallographic orientations are not generally a rotation but a coset of SO(3) (i.e., a set of crystal-symmetrically equivalent rotations), the data sets considered here have been preprocessed to eliminate the 24-fold ambiguity. It is also worth noting that the rotations will be considered as independent, when in fact there may be spatially induced dependence.

For each of the five grains, an attempt was made to find an adequate distributional fit (several choices of *C*(*r*|*κ*) from (1) were tried). The theoretical quantiles were plotted against the sample misorientation angles, and these Q-Q plots did not reveal any good distributional matches. Figure 2 gives two of the Q-Q plots for a grain of size *n* = 38. It can be seen that neither the matrix Fisher nor the vM-UARS distribution seems to fit these data well. Note that these were just two of several distributions that were explored.

Table 5. Confidence intervals for the AMA in both radians and degrees for five nickel grains of various sizes

Grain | Sample size | CI for AMA (radians) | CI for AMA (degrees)
---|---|---|---
1 | 38 | (0.00638, 0.00796) | (0.36555, 0.45607)
2 | 47 | (0.00640, 0.00845) | (0.36669, 0.48415)
3 | 42 | (0.00741, 0.01328) | (0.42456, 0.76089)
4 | 34 | (0.00547, 0.01298) | (0.31341, 0.74369)
5 | 27 | (0.00886, 0.01595) | (0.50764, 0.91387)
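As a quick check on the two scales, the degree endpoints follow from the radian endpoints by multiplying by 180/*π*. For grain 1, for example,

\[ 0.00638 \times \frac{180^{\circ}}{\pi} \approx 0.3655^{\circ}, \qquad 0.00796 \times \frac{180^{\circ}}{\pi} \approx 0.4561^{\circ}, \]

which agrees with the tabulated degree values up to rounding of the radian endpoints.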

## Conclusion

While there have been recent advances in distributional theory and parametric inference for 3-D rotation data, instances still exist where it may be difficult to find an appropriate model to fit to a data set. As shown in Section 4, coverage rates for confidence intervals for estimating spread can fall below nominal levels when incorrect distributions are used. In such cases, nonparametric methods such as bootstrapping can provide a good alternative. However, for small samples (around *n*=10), applying the correct parametric procedure is better than using bootstrapping, provided that a good distributional fit can be found.

## Declarations

### Acknowledgments

This research was supported by NSF grant DMS-1104409.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## Authors’ Affiliations

## References

- Bingham, MA, Nordman, DJ, Vardeman, SB: Modeling and inference for measured crystal orientations and a tractable class of symmetric distributions for rotations in three dimensions. J. Am. Stat. Assoc. 104, 1385–1397 (2009a).
- Bingham, MA, Vardeman, SB, Nordman, DJ: Bayes one-sample and one-way random effects analyses for 3-D orientations with application to materials science. Bayesian Anal. 4, 607–630 (2009b).
- Bingham, MA, Nordman, DJ, Vardeman, SB: Finite-sample investigation of likelihood and Bayes inference for the symmetric von Mises-Fisher distribution. Comput. Stat. Data Anal. 54, 1317–1327 (2010).
- Demirel, MC, El-Dasher, BS, Adams, BL, Rollett, AD: Studies on the accuracy of electron backscatter diffraction measurements. In: Schwartz, AJ, Mukul, K, Adams, BL (eds.) *Electron Backscatter Diffraction in Materials Science*. Kluwer Academic/Plenum Publishers, New York (2000).
- Downs, TD: Orientation statistics. Biometrika. 59, 665–676 (1972).
- Efron, B: Better bootstrap confidence intervals. J. Am. Stat. Assoc. 82, 171–185 (1987).
- Fisher, NI, Hall, P: Bootstrap confidence regions for directional data. J. Am. Stat. Assoc. 84, 996–1002 (1989).
- Fisher, NI: Statistical Analysis of Circular Data. Cambridge University Press, New York (1996).
- Will, LK, Bingham, M: Bootstrap techniques for measures of center for three-dimensional rotation data. To appear in Involve (2015).
- Haddou, M, Rivest, L-P, Pierrynowski, M: A nonlinear mixed effects directional model for the estimation of the rotation axes of the human ankle. Ann. Appl. Stat. 4, 1892–1912 (2010).
- Jupp, PE, Mardia, KV: Maximum likelihood estimators for the matrix von Mises-Fisher and Bingham distributions. Ann. Stat. 7, 599–606 (1979).
- Khatri, CG, Mardia, KV: The Von Mises-Fisher matrix distribution in orientation statistics. J. R. Stat. Soc. Ser. B. 39, 95–106 (1977).
- León, CA, Massé, J-C, Rivest, L-P: A statistical model for random rotations. J. Multivariate Anal. 97, 412–430 (2006).
- Mardia, KV, Jupp, PE: Directional Statistics. Wiley, Chichester and New York (2000).
- Prentice, MJ: Orientation statistics without parametric assumptions. J. R. Stat. Soc. Ser. B. 48, 214–222 (1986).
- Qiu, Y, Nordman, DJ, Vardeman, SB: A wrapped trivariate normal distribution and Bayes inference for 3-D rotations. Stat. Sinica. 24, 897–917 (2014).
- Rancourt, D, Rivest, L-P, Asselin, J: Using orientation statistics to investigate variations in human kinematics. J. R. Stat. Soc. Ser. C. 49, 81–94 (2000).
- Randle, V: Microtexture Determination and its Applications. Maney for the Institute of Materials, Minerals and Mining, London (2003).
- Rivest, L-P, Baillargeon, S, Pierrynowski, M: A directional model for the estimation of the rotation axes of the ankle joint. J. Am. Stat. Assoc. 103, 1060–1069 (2008).
- Stanfil, B, Genschel, U, Hofmann, H, Nordman, D: Nonparametric confidence regions for the central orientation of random rotations. J. Multivariate Anal. 135, 106–116 (2015).
- Wilson, AW, Spanos, G: Application of orientation imaging microscopy to study phase transformations in steels. Mater. Charact. 46, 407–418 (2001).