Normal approximation
The normal approximation is one of the most popular for approximating the distribution of the number of runs or patterns X_n(Λ) in statistics. In general, when Λ is simple or compound, the trials are i.i.d., and the counting is non-overlapping, by appealing to (1) and renewal arguments, it has been shown that X_n(Λ) is asymptotically normally distributed (cf. Fu and Lou 2007; Karlin and Taylor 1975). The form of the approximation is
(6)  P(X_n(Λ) ≤ x) ≈ Φ( (x − n/μ_W) / √(n σ²_W / μ³_W) ),
where Φ(·) denotes the standard normal distribution function and μ_W and σ²_W are the mean and variance of W(Λ), respectively, which are given by
(7)  μ_W = E[W(Λ)] = ξ_0 (I − N)^{−1} 1′,
(8)  σ²_W = Var[W(Λ)] = ξ_0 (I + N)(I − N)^{−2} 1′ − μ²_W.
Given a pattern Λ, it is well known that the mean μ_W and the variance σ²_W are difficult to obtain via combinatorial arguments, especially when Λ is a compound pattern or the trials are Markov dependent. For example, as pointed out in Karlin (2005) and Kleffe and Borodovski (1992), approximate values of μ_W and σ²_W must sometimes be used. Since W(Λ) is finite Markov chain imbeddable, (7) and (8) provide the exact values.
The limit in (6) is appropriate when the sequence of inter-arrival times {W_i(Λ)} is i.i.d., which is the case for simple and compound patterns when the {X_i} are i.i.d. and counting is non-overlapping. When occurrences of Λ correspond to a delayed renewal process, which can occur for Markov-dependent trials and/or overlapping counting, we could use the mean and variance of W_2(Λ) for the normalizing constants, which are easily obtained by modifying ξ_0 in (7) and (8). Even more general cases can be handled by making use of a functional central limit theorem for Markov chains (see, for example, Meyn and Tweedie 1993, §17.4, and Asmussen 2003, Theorem 7.2, p. 30, for the details).
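As a concrete illustration of (6)–(8), the sketch below (illustrative code written for this discussion, not code from the paper) computes μ_W and σ²_W for the simple pattern Λ = SS⋯S (k consecutive successes) from the essential matrix N of its waiting-time chain, and then evaluates the normal approximation. The state space (trailing-success counts) and all function names are assumptions made here for the example.

```python
import numpy as np
from math import erf, sqrt

def waiting_time_moments(p, k):
    """Mean and variance of the waiting time for k consecutive successes,
    via (7) and (8): mu = xi0 (I-N)^{-1} 1', var = xi0 (I+N)(I-N)^{-2} 1' - mu^2."""
    # Essential matrix over states 0..k-1 (current trailing run of successes);
    # reaching a full run of k corresponds to absorption.
    N = np.zeros((k, k))
    for i in range(k):
        N[i, 0] = 1 - p              # a failure resets the trailing run
        if i + 1 < k:
            N[i, i + 1] = p          # a success extends the trailing run
    xi0 = np.zeros(k)
    xi0[0] = 1.0                     # chain starts with an empty run
    one = np.ones(k)
    M = np.linalg.inv(np.eye(k) - N)
    mu = float(xi0 @ M @ one)
    var = float(xi0 @ (np.eye(k) + N) @ M @ M @ one - mu**2)
    return mu, var

def normal_approx_cdf(x, n, p, k):
    """Normal approximation (6) to P(X_n(Lambda) <= x) for Lambda = S...S."""
    mu, var = waiting_time_moments(p, k)
    z = (x - n / mu) / sqrt(n * var / mu**3)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))
```

For p = 0.5 and k = 2 this recovers the classical values μ_W = 6 and σ²_W = 22 for the waiting time of two consecutive heads.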
Poisson and compound Poisson approximations
It is well known that, in a sequence of Bernoulli(p) trials, if np → λ as n → ∞, then the probability of k successes in n trials can be approximated by a Poisson probability with parameter λ, denoted Po(λ). This idea has been extended to certain patterns Λ and, under certain conditions, the distribution of X_n(Λ) can be approximated by a Poisson distribution with parameter μ_n in the sense that
(9)  d_TV( L(X_n(Λ)), Po(μ_n) ) ≤ ε_n,
where L(·) denotes the distribution (law) of a random variable and d_TV(·,·) denotes the total variation distance.
The primary tool used to obtain μ_n and the bound ε_n is the Stein-Chen method (Chen 1975), and this method has been refined by various authors: Arratia et al. (1990), Barbour and Eagleson (1983, 1984, 1987), Barbour and Hall (1984), Godbole (1990a, 1990b, 1991), Godbole and Schaffner (1993), and Holst et al. (1988). The method has also been extended to compound Poisson approximations for the distributions of runs and patterns, and Barbour and Chryssaphinou (2001) provide an excellent theoretical review of these approximations.
In practice, μ_n = E[X_n(Λ)] or the expectation of a closely related run statistic is used (cf. Balakrishnan and Koutras 2002, §5.2.3) so that, in the former case,
(10)  d_TV( L(X_n(Λ)), Po(E[X_n(Λ)]) ) ≤ ε_n.
Finding E[X_n(Λ)] and the bound ε_n is usually done on a case-by-case basis. For the mathematical details, the books by Barbour et al. (1992a) and Balakrishnan and Koutras (2002) are recommended.
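To make the quantities in (9) concrete, the following sketch (assumed code written for this discussion, not from the paper) computes the exact law of the number of non-overlapping success runs of length k in n i.i.d. Bernoulli(p) trials via a Markov chain over (count, trailing run length), and then evaluates μ_n = E[X_n(Λ)] and the total variation distance to the matching Poisson law. The function names and the truncation convention are assumptions of this example.

```python
import numpy as np
from math import exp, factorial

def exact_pmf(n, p, k):
    """Exact pmf of the non-overlapping run count via a chain over
    (count, trailing run length)."""
    cmax = n // k                           # most runs that can fit in n trials
    P = np.zeros((cmax + 1, k))
    P[0, 0] = 1.0
    for _ in range(n):
        Q = np.zeros_like(P)
        for c in range(cmax + 1):
            for i in range(k):
                m = P[c, i]
                if m == 0.0:
                    continue
                Q[c, 0] += (1 - p) * m      # failure resets the run
                if i + 1 == k:
                    Q[c + 1, 0] += p * m    # run completed: count + 1, restart
                else:
                    Q[c, i + 1] += p * m    # run extended
        P = Q
    return P.sum(axis=1)                    # marginal law of the count

def poisson_tv_distance(n, p, k):
    """mu_n = E[count] and the TV distance between the exact law and Po(mu_n)."""
    pmf = exact_pmf(n, p, k)
    mu = sum(x * q for x, q in enumerate(pmf))
    po = [exp(-mu) * mu**x / factorial(x) for x in range(len(pmf))]
    # Half the l1 distance, truncated at cmax; the Poisson tail beyond cmax
    # is ignored, so this slightly underestimates the exact d_TV.
    return mu, 0.5 * sum(abs(a - b) for a, b in zip(pmf, po))
```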
Let CP(λ, ν) denote the compound Poisson distribution, that is, the distribution of the random variable Y = Y_1 + ⋯ + Y_M, where the random variable M has a Poisson distribution with parameter λ and the Y_j are i.i.d. having distribution ν. A compound Poisson distribution for approximating non-negative random variables was suggested in Barbour et al. (1992b) (see also Barbour et al. 1995, 1996). The approximation is formulated similarly to the Poisson approximation:
(11)  d_TV( L(X_n(Λ)), CP(λ_n, ν_n) ) ≤ ε_n.
The distribution of N_{n,k}, the number of non-overlapping occurrences of k consecutive successes in n i.i.d. Bernoulli trials, is one of the most important in this area and one of the most studied in the literature. Reversing the roles of S (success) and F (failure), the reliability of a consecutive-k-out-of-n system, denoted C(k,n : F), is given by P(N_{n,k} = 0). Even in this simple case (i.e. Λ = SS⋯S), there are several ways to apply the Poisson approximation techniques. For example, Godbole (1991, Theorem 2) shows that approximating N_{n,k} with a Poisson distribution works well if certain conditions hold. Godbole and Schaffner (1993, pg. 340) suggest an improved Poisson approximation for word patterns.
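For instance, the reliability P(N_{n,k} = 0) can be computed exactly from the substochastic part of the imbedded chain and set against a Poisson-type approximation. The sketch below is illustrative only; the declumped parameter μ_n = (n − k + 1)(1 − p)p^k is one common textbook choice and an assumption of this example, not necessarily the optimal choice discussed in the cited works.

```python
import numpy as np

def exact_reliability(n, p, k):
    """P(N_{n,k} = 0): no success run of length k in n Bernoulli(p) trials."""
    # Substochastic matrix over trailing-run states 0..k-1; probability
    # mass lost to the absorbing state corresponds to a completed run.
    N = np.zeros((k, k))
    for i in range(k):
        N[i, 0] = 1 - p              # failure resets the run
        if i + 1 < k:
            N[i, i + 1] = p          # success extends the run
    xi0 = np.zeros(k)
    xi0[0] = 1.0
    return float(xi0 @ np.linalg.matrix_power(N, n) @ np.ones(k))

def poisson_reliability(n, p, k):
    """Poisson-type approximation with a common declumped parameter (assumed)."""
    mu = (n - k + 1) * (1 - p) * p**k
    return float(np.exp(-mu))
```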
The primary difficulty in applying the Poisson approximation is the determination of the optimal parameter μ_n, which is highly dependent on the structure of the pattern Λ. In particular, if Λ is long and has several uneven overlapping sub-patterns, then finding μ_n by their method can be very tedious. In the sequel, we show that even the (asymptotically) best choice of μ_n for Poisson approximations does not perform well in the relative sense.
FMCI approximations
Approximations based on the FMCI approach depend on the spectral decomposition of the essential transition probability matrix N.
Let N be a w×w essential transition probability matrix associated with a finite Markov chain {Y_n : n ≥ 0} corresponding to the distribution of the waiting time W(Λ). Let 1 > λ_1 ≥ |λ_2| ≥ ⋯ ≥ |λ_w| denote the ordered eigenvalues of N, repeated according to their algebraic multiplicities, with associated (right) eigenvectors η_1, …, η_w. When the geometric multiplicity of λ_i is less than its algebraic multiplicity, we will use vectors of 0's for the unspecified eigenvectors. The fact that λ_1 can be taken to be a positive real number and that η_1 can be taken to be non-negative are consequences of the Perron-Frobenius theorem for non-negative matrices (cf. Seneta 1981).
Definition 1
We will say that {Y_n : n ≥ 0}, or equivalently N, satisfies the FMCI Approximation Conditions if
(i) there exist constants a_1, …, a_w such that
(12)  1′ = a_1 η_1 + a_2 η_2 + ⋯ + a_w η_w;
(ii) λ_1 has algebraic multiplicity g and λ_1 > |λ_j| for all j > g.
Verifying these conditions is usually straightforward. They certainly hold if N is irreducible and aperiodic, but they also hold in many other cases. For example, (12) requires only that 1′ lies in the linear space spanned by η_1, …, η_w, which can hold even when N is defective (not diagonalizable). Condition (ii) requires that the communication classes corresponding to λ_1 be aperiodic. That is, if Ψ is a communication class and N[Ψ] is the substochastic matrix N restricted to the states in Ψ, with largest eigenvalue λ_1[Ψ], then every Ψ with λ_1[Ψ] = λ_1 should be aperiodic. We also mention that the algebraic multiplicity of λ_1 is the number of communication classes Ψ such that λ_1[Ψ] = λ_1.
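For a concrete check of Definition 1, the following sketch (assumed code written for this discussion, not from the paper) verifies both conditions numerically for the essential matrix of the pattern Λ = SS with p = 0.5: condition (12) by solving for the coefficients a_1, …, a_w, and condition (ii) by inspecting the sorted eigenvalues.

```python
import numpy as np

# Essential matrix N for the waiting time of the pattern SS (p = 0.5);
# states are the trailing-success counts 0 and 1.
p = 0.5
N = np.array([[1 - p, p],
              [1 - p, 0.0]])

# Eigendecomposition: columns of V are the right eigenvectors eta_j.
lam, V = np.linalg.eig(N)
order = np.argsort(-np.abs(lam))        # sort by decreasing modulus
lam, V = lam[order], V[:, order]

# Condition (12): 1' lies in the span of the eigenvectors,
# i.e. the linear system V a = 1' has a solution.
a = np.linalg.solve(V, np.ones(2))

# Condition (ii): lambda_1 is real, positive, and strictly dominant (g = 1).
assert lam[0].imag == 0.0 and lam[0].real > abs(lam[1])
```

Here λ_1 = (1 + √5)/4 ≈ 0.809, so both conditions hold with g = 1.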
Fu and Johnson (2009) give the following theorem.
Theorem 1
Let {X_i} be a sequence of i.i.d. trials taking values in a finite alphabet, let Λ be a simple pattern of length ℓ with d×d essential transition probability matrix N, and let X_n(Λ) be the number of non-overlapping occurrences of Λ in {X_i}. If N satisfies the FMCI approximation conditions then, for any fixed x ≥ 0,
(13)
where the constant a is determined by the spectral decomposition of N. If g = 1, as is usually the case, then a = a_1(ξ_0 η_1′).
Given a pattern Λ, the approximation in (13) requires finding the Markov chain imbedding associated with the waiting time W(Λ) and the essential transition probability matrix N, as well as its eigenvalues and associated eigenvectors. Usually, these steps are rather simple and can be easily automated together with (13). Even for very large n and large ℓ, say n = 1,000,000 and ℓ = 50, the CPU time is negligible. Fu and Johnson (2009) also provide details on extending these results to compound patterns, overlapping counting, and Markov dependent trials.
For the purpose of comparing these approximations, we prefer to write (13) as
(14)
Note that the approximation has three parts: a constant part; a polynomial in n of degree x; and a third (dominant) part which converges to 0 exponentially fast as n → ∞.
More precisely, the FMCI approximation in (13) may be written as
(15)
Since |λ_{g+1}| < λ_1, the term |λ_{g+1}/λ_1|^{n/(x+1)−ℓ} tends to 0 exponentially as n → ∞ and hence is negligible if n/(x+1) − ℓ is moderate or large (say ≥ 50).
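The dominant-eigenvalue behavior is easy to see numerically in the simplest case x = 0, where P(X_n(Λ) ≤ 0) = P(W(Λ) > n) = ξ_0 N^n 1′. The sketch below (illustrative code written for this discussion, not from the paper) compares this exact tail with the one-term spectral approximation a_1(ξ_0 η_1′) λ_1^n for the pattern Λ = SS with p = 0.5.

```python
import numpy as np

def tail_exact_and_spectral(n, p=0.5):
    # Essential matrix N for the waiting time of the pattern SS;
    # states are the trailing-success counts 0 and 1.
    N = np.array([[1 - p, p],
                  [1 - p, 0.0]])
    xi0 = np.array([1.0, 0.0])
    one = np.ones(2)

    # Exact left-tail probability P(W > n) = xi0 N^n 1'.
    exact = float(xi0 @ np.linalg.matrix_power(N, n) @ one)

    # One-term spectral approximation a_1 (xi0 eta_1') lambda_1^n.
    lam, V = np.linalg.eig(N)          # columns of V are right eigenvectors
    i = int(np.argmax(lam.real))       # dominant eigenvalue lambda_1
    a = np.linalg.solve(V, one)        # coefficients in 1' = sum_j a_j eta_j
    approx = float(a[i] * (xi0 @ V[:, i]) * lam[i] ** n)
    return exact, approx
```

Already at n = 20 the two values agree to within a fraction of a percent, reflecting the exponentially vanishing remainder.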
Large deviation approximation
Fu et al. (2012) provide the following large deviation approximation for right-tail probabilities of the number of non-overlapping occurrences of simple patterns Λ. The reasons for providing only the right-tail large deviation approximation are (i) all of the above-mentioned approximations fail to approximate the extreme right-tail probabilities and (ii) the FMCI approximation provides an accurate approximation for left-tail probabilities.
Theorem 2
Let
and let
(16)  M_W(t) = E[e^{tW(Λ)}] = e^t ξ_0 (I − e^t N)^{−1}(I − N) 1′,  t < −log λ_1,
be the moment generating function of W(Λ). Then
(17)
where
(18)
, τ is the solution to h′(ε,τ)=0, and
(19)