Joint distribution of k-tuple statistics in zero-one sequences of Markov-dependent trials
- Anastasios N. Arapis^{1},
- Frosso S. Makri^{1} and
- Zaharias M. Psillakis^{2}
https://doi.org/10.1186/s40488-017-0080-5
© The Author(s) 2017
Received: 29 March 2017
Accepted: 18 October 2017
Published: 15 November 2017
Abstract
We consider a sequence of n, n≥3, zero (0) - one (1) Markov-dependent trials. We focus on k-tuples of 1s; i.e. runs of 1s of length at least equal to a fixed integer number k, 1≤k≤n. The statistics denoting the number of k-tuples of 1s, the number of 1s in them and the distance between the first and the last k-tuple of 1s in the sequence, are defined. The work provides, in a closed form, the exact conditional joint distribution of these statistics given that the number of k-tuples of 1s in the sequence is at least two. The case of independent and identical 0−1 trials is also covered in the study. A numerical example illustrates further the theoretical results.
Keywords
AMS Subject Classification
Introduction
Run counting statistics defined on a sequence of binary (zero (0) - one (1)) random variables (RVs), along with their exact and approximate distributions, have been extensively studied in the literature. Their popularity is due to the fact that such statistics appear as useful theoretical models in many research areas including statistics (e.g. hypothesis testing), engineering (e.g. system reliability and quality control), biology (e.g. population genetics and DNA sequence analysis), computer science (e.g. encoding/decoding/transmission of digital information) and financial engineering (e.g. insurance and risk analysis).
In such applications, a key point is understanding how 1s and 0s are distributed and combined as elements of a 0−1 sequence (finite or infinite, memoryless or not), eventually forming runs of 1s or 0s that are enumerated according to certain counting schemes. Each scheme defines how runs of the same symbol, or strings (patterns) of both symbols, are formed and consequently enumerated. A counting scheme may depend, among other considerations, on whether overlapping counting is allowed and on whether counting restarts from scratch once a run/string of a certain size has been enumerated.
The counting scheme as well as the intrinsic uncertainty of a 0−1 sequence are often suggested by the applications. Probabilistic models in common use for the internal structure of a 0−1 sequence include a model with mutually independent elements and models that assume some kind of dependence among the elements. The methods used to derive exact or approximate, marginal or joint probability distributions include combinatorial analysis, generating functions, the finite Markov chain imbedding technique, recursive schemes, as well as normal, Poisson and large deviation approximations.
For extensive reviews of the recent literature on the distribution theory of runs and patterns we refer to Balakrishnan and Koutras (2002) and Fu and Lou (2003). Current works on the subject include, among others, those of Antzoulakos and Chadjiconstantinidis (2001); Eryilmaz (2006, 2015, 2016, 2017); Eryilmaz and Yalcin (2011); Johnson and Fu (2014); Koutras (2003); Koutras et al. (2016); Makri and Psillakis (2015); Makri et al. (2013) and Mytalas and Zazanis (2013, 2014).
In this article we derive expressions for a conditional distribution of a trivariate statistic. Its components denote the number of runs of 1s of length exceeding a fixed threshold number, the number of 1s in such runs of 1s and the length of the minimum sequence’s segment in which these runs are concentrated. The study is developed on a sequence of two-state (0−1) Markov-dependent trials. The runs are enumerated according to Mood’s (1940) counting scheme.
More specifically, the manuscript is organized as follows. In Section 2 we present some preliminary material, including notation and definitions, necessary to develop our results, which are obtained in Section 4. In Section 3 we give the motivation along with a statement of the aim of the work. A numerical example, shown in Section 5, clarifies the theoretical results of Section 4. A discussion of the results as well as a note on future work are given in Section 6.
Throughout the article, for integers n, m, \({n\choose m}\) denotes the extended binomial coefficient (see Feller (1968), pp. 50, 63), ⌊x⌋ stands for the greatest integer less than or equal to x and δ _{ ij } denotes the Kronecker delta function of the integer arguments i and j. Further, for α>β, we apply the conventions \(\sum _{i=\alpha }^{\beta }y_{i}=0\), \(\prod _{i=\alpha }^{\beta }y_{i}=1\), \(\sum _{i=\alpha }^{\beta }\mathbf {Y}^{(i)}=\mathbf {O}\equiv {\scriptsize \left (\begin {array}{cc} 0 &0\\ 0 & 0 \end {array}\right)}\), \(\prod _{i=\alpha }^{\beta }\mathbf {Y}^{(i)}=\mathbf {I}\equiv {\scriptsize \left (\begin {array}{cc} 1 &0\\ 0 & 1 \end {array}\right)}\), where y _{ i } and Y ^{(i)} are scalars and 2×2 matrices, respectively.
Preliminaries
2.1 Run counting statistics
Let \(\{X_{t}\}_{t=1}^{n}\), n≥1, be the first n trials of a binary (0−1) sequence of RVs, X _{ t }=x _{ t }∈{0,1}. A run of 1s is a (sub)sequence of \(\{X_{t}\}_{t=1}^{n}\) consisting of consecutive 1s, the number of which is referred to as its length, preceded and succeeded by 0s or by nothing.
Given a fixed integer k, 1≤k≤n, a k-tuple of 1s is a run of 1s of length k or more. In the paper we will deal with the following statistics defined on a \(0-1 \{X_{t}\}_{t=1}^{n}\). For details see, e.g. Makri et al. (2015) and the references therein.
Readily, k G _{ n,k }≤S _{ n,k }.
Readily L _{ n }<k iff G _{ n,k }<1.
(V) For G _{ n,k }≥1, 1≤k≤n, set V _{ n,k }=(D _{ n,k },G _{ n,k },S _{ n,k }). This is the RV we focus on in the article.
Example: By way of illustration consider the trials 1110001100010001010011101111001001001001 numbered from 1 to 40. Then, L _{40}=4 and V _{40,1}=(40,11,19), V _{40,2}=(28,4,12), V _{40,3}=(28,3,10), V _{40,4}=(4,1,4).
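The example above can be checked mechanically. The following is a minimal sketch, assuming (as the abstract and the example suggest) that G counts the runs of 1s of length at least k, S counts the 1s in those runs, D is the number of positions from the first 1 of the first such run to the last 1 of the last one, and L is the length of the longest run of 1s. The function names are hypothetical and not taken from the paper.

```python
from itertools import groupby

def runs_of_ones(bits):
    """Return (start, length) for every maximal run of 1s; positions are 1-based."""
    runs, pos = [], 1
    for symbol, group in groupby(bits):
        length = len(list(group))
        if symbol == "1":
            runs.append((pos, length))
        pos += length
    return runs

def v_statistic(bits, k):
    """Return (D, G, S) for runs of 1s of length >= k, or None when there are none."""
    tuples = [(start, length) for start, length in runs_of_ones(bits) if length >= k]
    if not tuples:
        return None
    first_start = tuples[0][0]
    last_start, last_length = tuples[-1]
    # Positions from the first 1 of the first k-tuple to the last 1 of the last k-tuple.
    d = last_start + last_length - first_start
    g = len(tuples)
    s = sum(length for _, length in tuples)
    return d, g, s

def longest_run(bits):
    """Length of the longest run of 1s (0 if the string contains no 1)."""
    return max((length for _, length in runs_of_ones(bits)), default=0)

trials = "1110001100010001010011101111001001001001"
for k in (1, 2, 3, 4):
    print(k, v_statistic(trials, k))
```

For the 40 trials of the example this yields (40, 11, 19), (28, 4, 12), (28, 3, 10) and (4, 1, 4) for k=1,2,3,4, in agreement with the values quoted above, and `v_statistic(trials, 5)` is `None` because L _{40}=4<5.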
2.2 Internal structure’s models
A general enough model for the internal structure of a \(0-1 \{X_{t}\}_{t=1}^{n}\), n≥2, is that of the first n trials of a homogeneous 0−1 Markov chain of first order (HMC1). On such a model we will develop our results. Accordingly, we next state the necessary notation/definitions.
where \(\mathbf {e}_{i}^{'}\) is the transpose (i.e. the column vector) of the row vector e _{ i }, \(i\in {\mathcal {A}}\), with e _{0}=(1,0) and e _{1}=(0,1).
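The displayed definitions of this subsection are not reproduced in the text above. As a rough sketch, assuming the usual parametrization of an HMC1 by an initial row vector p ^{(1)}=(p _{0}^{(1)},p _{1}^{(1)}) and a 2×2 transition matrix P=(p _{ ij }), the probability of a particular 0−1 string is the initial probability of its first symbol times the one-step transition probabilities along the string. The function names below are hypothetical.

```python
def path_probability(bits, p_init, P):
    """P(X_1=x_1, ..., X_n=x_n) for a homogeneous first-order two-state Markov chain."""
    x = [int(b) for b in bits]
    prob = p_init[x[0]]
    for a, b in zip(x, x[1:]):
        prob *= P[a][b]  # one-step transition a -> b
    return prob

def marginal(t, p_init, P):
    """Row vector (P(X_t=0), P(X_t=1)), obtained by t-1 vector-matrix products."""
    p = list(p_init)
    for _ in range(t - 1):
        p = [p[0] * P[0][j] + p[1] * P[1][j] for j in (0, 1)]
    return p

P_markov = [[0.9, 0.1], [0.1, 0.9]]   # p00 = p11 = 0.9, as in the numerical example
P_iid = [[0.5, 0.5], [0.5, 0.5]]      # identical rows give IID trials with p1 = 0.5
print(path_probability("10100000", (0.5, 0.5), P_markov))
print(path_probability("10100000", (0.5, 0.5), P_iid))
```

An IID sequence is recovered when both rows of P coincide with the initial vector, which is how the IID case is treated as a particular case of HMC1 here.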
2.3 A combinatorial result
represents the number of allocations of α indistinguishable balls into r distinguishable cells where each of the m, 0≤m≤r, specified cells is occupied by at most k balls. Equivalently, it gives the number of nonnegative integer solutions of the linear equation x _{1}+x _{2}+…+x _{ r }=α with the restrictions, for m≥1, \(0\leq x_{i_{j}}\leq k\), 1≤j≤m, for some specific m-combination {i _{1},i _{2},…,i _{ m }} of {1,2,…,r}, and no restrictions on x _{ j }s, 1≤j≤r, for m=0.
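The closed-form expression for this count is not reproduced in the text above. As an illustration only, the count can be obtained by direct enumeration and compared with the textbook inclusion-exclusion sum over the capped cells that overflow; it is an assumption that this sum agrees with the expression displayed in the paper. The function names are hypothetical, and the m specified cells are taken, without loss of generality, to be the first m.

```python
from itertools import product
from math import comb

def count_brute_force(alpha, r, m, k):
    """Count solutions of x_1+...+x_r = alpha, x_i >= 0, with x_1..x_m <= k, by enumeration."""
    total = 0
    for x in product(range(alpha + 1), repeat=r):
        if sum(x) == alpha and all(xi <= k for xi in x[:m]):
            total += 1
    return total

def count_inclusion_exclusion(alpha, r, m, k):
    """Same count via inclusion-exclusion: j of the m capped cells receive k+1 balls up front."""
    if r == 0:
        return 1 if alpha == 0 else 0
    total = 0
    for j in range(min(m, alpha // (k + 1)) + 1):
        rest = alpha - j * (k + 1)          # balls left to allocate freely
        total += (-1) ** j * comb(m, j) * comb(rest + r - 1, r - 1)
    return total

print(count_brute_force(4, 3, 2, 1), count_inclusion_exclusion(4, 3, 2, 1))
```

For α=4, r=3, m=2, k=1 both functions return 4, corresponding to the solutions (0,0,4), (0,1,3), (1,0,3) and (1,1,2).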
Motivation and aim of the work
In a study of a 0−1 sequence \(\{X_{t}\}_{t=1}^{n}\), n≥3, it is reasonable for one to be interested in the probabilistic behavior of RV V _{ n,k }=(D _{ n,k },G _{ n,k },S _{ n,k }). This happens because jointly its components provide a more refined view of the internal clustering structure of the sequence than the information extracted by each one alone.
Interpreting a k-tuple of 1s as a cluster of consecutive 1s of size at least k, D _{ n,k } represents the size of the minimum segment of \(\{X_{t}\}_{t=1}^{n}\) in which G _{ n,k } clusters of size at least k and at most L _{ n } are concentrated. The overall density of G _{ n,k } clusters, with respect to the number of 1s in them, as well as of the minimum concentration segment is evaluated by S _{ n,k }. Large values of D _{ n,k } suggest that these G _{ n,k } clusters spread over the interval between the left and the right side of the sequence whereas small values of D _{ n,k } indicate rather that the clusters are concentrated in a segment of the sequence of small size leaving the remaining part(s) of the sequence empty of such clusters.
In addition to this information, a large value of S _{ n,k } paired with a small value of G _{ n,k } indicates the existence of clusters of 1s of a large size and therefore a trend whereas the same value of S _{ n,k } paired with a large value of G _{ n,k } indicates rather a distribution of clusters of small size in the (sub)sequence in which they are concentrated.
Therefore, based on the former interpretation, the motivation for the study as well as the usefulness of the statistic V _{ n,k }=(D _{ n,k },G _{ n,k },S _{ n,k }) is apparent. In the sequel, we assume that G _{ n,k }≥2 in order to have at least two k-tuples of 1s in the sequence and accordingly the distance D _{ n,k } is not a degenerate one. Moreover, this assumption is a common one in an application area of D _{ n,k }; e.g., in detecting pattern (tandem or non-tandem direct) repeats in DNA sequences (Benson 1999).
The paper provides exact closed form expressions for α _{ n,k }, h _{ n,k }(d,m,s) and eventually for v _{ n,k }(d,m,s) when V _{ n,k } is defined on a 0−1 HMC1/IID. The expressions are obtained via combinatorial analysis.
More specifically, closed formulae are established for the first time for h _{ n,k }(d,m,s), 1≤k≤⌊(n−1)/2⌋, when V _{ n,k } is defined on a 0−1 HMC1 with given P and p ^{(1)}. Since the general framework of HMC1 covers IID sequences as a particular case, the implied expressions for v _{ n,k }(d,m,s) are alternative to those obtained for v _{ n,k }(d,m,s), 1≤k≤⌊(n−1)/2⌋, by Makri et al. (2015) for IID sequences.
hence, the work provides closed form expressions for determining f _{ n,k }(d) for HMC1 and IID \(0-1 \{X_{t}\}_{t=1}^{n}\). These expressions are alternative to those derived, for IID sequences, by Makri et al. (2015) for 1≤k≤⌊(n−1)/2⌋ as well as to those obtained, for HMC1, by Arapis et al. (2016) for k=1 and by Arapis et al. (2017) for 1≤k≤⌊(n−1)/2⌋.
Results
for 2−(i+j)≤y≤n−(i+j), 1−δ _{ y,0}−δ _{ y,n }+δ _{ i+j,2}≤r≤ min{n−y,y−1+i+j} and \(\pi _{n}^{(i,j)}(y,r)=0\), otherwise.
Theorem 1
where ε _{ n }(d)=1, if n=d; \(p_{00}^{n-d-2}\left \{p_{10}p_{00}+p_{0}^{(1)}(p_{1}^{(1)})^{-1}p_{01}\left [(n-d-1)p_{10}+p_{00}\right ]\right \}\), if n≥d+1.
Proof
We use similar reasoning for the remaining cases. Then, summing with respect to i, we get the result. □
For a sequence \(\{X_{t}\}_{t=1}^{n}\) of 0−1 IID RVs, h _{ n,1}(d,m,s) reduces to the explicit formula given in the next Corollary.
Corollary 1
In order to derive for HMC1, in the forthcoming Theorem 2, h _{ n,k }(d,m,s), 5≤2k+1≤n, we next recall, in Lemma 1, a result from (Makri et al.: On the concentration of runs of ones of length exceeding a threshold in a Markov chain, submitted).
Lemma 1
For (i,j)∈{0,1}^{2}, n≥2, set \(\lambda _{n,k}^{(i,j)}(x)=P(G_{n,k}=x,X_{1}=i,X_{n}=j)\), x=0,1. Then, it holds that:
Theorem 2
Proof
- (a)
y _{1}+y _{2}+…+y _{ r−1}=y, y _{ j }≥1, 1≤j≤r−1.
- (b)
\(\phantom {\dot {i}\!}z_{1}+z_{i_{1}}+z_{i_{2}}+\ldots +z_{i_{m-2}}+z_{r}=s\), z _{ j }≥k, j∈{1,i _{1},i _{2},…,i _{ m−2},r}, for some specific combination {1,i _{1},i _{2},…,i _{ m−2},r} of {1,2,…,r−1,r} among the \({r-2\choose m-2}\) ones.
- (c)
\(z_{i_{m-1}}+z_{i_{m}}+\ldots +z_{i_{r-2}}=d-y-s\), \(1\leq z_{i_{j}}\leq k-1\), m−1≤j≤r−2, for {i _{ m−1},…,i _{ r−2}}∈{1,2,…,r}−{1,i _{1},i _{2},…,i _{ m−2},r}.
By similar reasoning we get the remaining cases of i, i.e. 1≤i≤k+1 and n−d+1−k≤i≤n−d+1. Then summing with respect to i, y and r we get the result. □
Having found h _{ n,k }(d,m,s), we next proceed to obtain v _{ n,k }(d,m,s). In accomplishing it, the required probabilities α _{ n,k } for HMC1 are recalled, in Lemma 2, from Arapis et al. (2016) for k=1, and they are computed via Lemma 1 for 2≤k≤⌊(n−1)/2⌋.
Lemma 2
Theorem 3
where α _{ n,k } and h _{ n,k }(d,m,s) are provided by Lemma 2 and Theorems 1 (for k=1) and 2 (for 2≤k≤⌊(n−1)/2⌋), respectively.
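The displayed statement of Theorem 3 is not reproduced in the text above. Read together with the surrounding definitions, it should amount to the definition of conditional probability. The following is a hedged reconstruction, under the assumptions that h _{ n,k }(d,m,s) denotes the joint probability P(D _{ n,k }=d,G _{ n,k }=m,S _{ n,k }=s) for m≥2, that α _{ n,k }=P(G _{ n,k }≥2), and that λ is as in Lemma 1; it is not a quotation of the paper's equations.

```latex
\[
v_{n,k}(d,m,s)=P\bigl(V_{n,k}=(d,m,s)\mid G_{n,k}\geq 2\bigr)
=\frac{h_{n,k}(d,m,s)}{\alpha_{n,k}},
\qquad
\alpha_{n,k}=1-\sum_{i=0}^{1}\sum_{j=0}^{1}\sum_{x=0}^{1}\lambda_{n,k}^{(i,j)}(x),
\]
\[
f_{n,k}(d)=P\bigl(D_{n,k}=d\mid G_{n,k}\geq 2\bigr)=\sum_{m,s}v_{n,k}(d,m,s).
\]
```

The second relation for α _{ n,k } simply removes the events G _{ n,k }=0 and G _{ n,k }=1, split according to the first and last trial, from the total probability.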
A numerical example
0−1 IID sequence with p _{1}=0.5
s | m | d=3 | d=4 | d=5 | d=6 | d=7 | d=8 |
---|---|---|---|---|---|---|---|
v _{8,1}(d,m,s) | | | | | | | |
2 | 2 | 0.02739726 | 0.02283105 | 0.01826484 | 0.01369863 | 0.00913242 | 0.00456621 |
3 | 2 | | 0.04566210 | 0.03652968 | 0.02739726 | 0.01826484 | 0.00913242 |
3 | 3 | | | 0.01826484 | 0.02739726 | 0.02739726 | 0.01826484 |
4 | 2 | | | 0.05479452 | 0.04109589 | 0.02739726 | 0.01369863 |
4 | 3 | | | | 0.04109589 | 0.05479452 | 0.04109589 |
4 | 4 | | | | | 0.00913242 | 0.01369863 |
5 | 2 | | | | 0.05479452 | 0.03652968 | 0.01826484 |
5 | 3 | | | | | 0.05479452 | 0.05479452 |
5 | 4 | | | | | | 0.01826484 |
6 | 2 | | | | | 0.04566210 | 0.02283105 |
6 | 3 | | | | | | 0.04566210 |
7 | 2 | | | | | | 0.02739726 |
f _{8,1}(d) | | 0.02739726 | 0.06849315 | 0.12785388 | 0.20547945 | 0.28310503 | 0.28767123 |
v _{8,2}(d,m,s) | | | | | | | |
4 | 2 | | | 0.18518519 | 0.09259259 | 0.07407407 | 0.05555556 |
5 | 2 | | | | 0.18518519 | 0.07407407 | 0.07407407 |
6 | 2 | | | | | 0.11111111 | 0.05555556 |
6 | 3 | | | | | | 0.01851852 |
7 | 2 | | | | | | 0.07407407 |
f _{8,2}(d) | | | | 0.18518519 | 0.27777778 | 0.25925925 | 0.27777778 |
v _{8,3}(d,m,s) | | | | | | | |
6 | 2 | | | | | 0.40000000 | 0.20000000 |
7 | 2 | | | | | | 0.40000000 |
f _{8,3}(d) | | | | | | 0.40000000 | 0.60000000 |
0−1 HMC1 sequence with p _{00}=p _{11}=0.9, \(p_{1}^{(1)}=0.5\)
s | m | d=3 | d=4 | d=5 | d=6 | d=7 | d=8 |
---|---|---|---|---|---|---|---|
v _{8,1}(d,m,s) | | | | | | | |
2 | 2 | 0.00914441 | 0.00872875 | 0.00831310 | 0.00789744 | 0.00748179 | 0.03366804 |
3 | 2 | | 0.01745750 | 0.01662619 | 0.01579488 | 0.01496357 | 0.06733609 |
3 | 3 | | | 0.00010263 | 0.00019500 | 0.00027710 | 0.00166262 |
4 | 2 | | | 0.02493929 | 0.02369233 | 0.02244536 | 0.10100413 |
4 | 3 | | | | 0.00029250 | 0.00055421 | 0.00374089 |
4 | 4 | | | | | 0.00000114 | 0.00001539 |
5 | 2 | | | | 0.03158977 | 0.02992715 | 0.13467217 |
5 | 3 | | | | | 0.00055421 | 0.00498786 |
5 | 4 | | | | | | 0.00002053 |
6 | 2 | | | | | 0.03740894 | 0.16834021 |
6 | 3 | | | | | | 0.00415655 |
7 | 2 | | | | | | 0.20200826 |
f _{8,1}(d) | | 0.00914441 | 0.02618626 | 0.04998121 | 0.07946192 | 0.11361346 | 0.72161274 |
v _{8,2}(d,m,s) | | | | | | | |
4 | 2 | | | 0.02225160 | 0.02081956 | 0.01806565 | 0.08228685 |
5 | 2 | | | | 0.04163913 | 0.03569068 | 0.16259088 |
6 | 2 | | | | | 0.05353602 | 0.24091210 |
6 | 3 | | | | | | 0.00099141 |
7 | 2 | | | | | | 0.32121613 |
f _{8,2}(d) | | | | 0.02225160 | 0.06245869 | 0.10729236 | 0.80799735 |
v _{8,3}(d,m,s) | | | | | | | |
6 | 2 | | | | | 0.06896552 | 0.31034483 |
7 | 2 | | | | | | 0.62068966 |
f _{8,3}(d) | | | | | | 0.06896552 | 0.93103448 |
Both tables give, for k=1,2,3, the numerical values of v _{8,k }(d,m,s), (d,m,s)∈Ω _{8,k }, and of f _{8,k }(d), 2k+1≤d≤8. The values of v _{8,k }(d,m,s) and f _{8,k }(d) were computed via Eqs. (29) and (17), respectively.
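For a sequence as short as n=8, the tabulated values can be cross-checked by exhaustive enumeration of all 2^{8} strings. The sketch below is such a check, not the closed-form method of the paper; it assumes that v _{ n,k }(d,m,s) is the probability of V _{ n,k }=(d,m,s) conditional on G _{ n,k }≥2 and that f _{ n,k }(d) is its marginal in d. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import groupby, product

def dgs(x, k):
    """(D, G, S) of a 0-1 tuple for runs of 1s of length >= k; None if there are none."""
    pos, hits = 1, []
    for symbol, group in groupby(x):
        length = len(list(group))
        if symbol == 1 and length >= k:
            hits.append((pos, length))
        pos += length
    if not hits:
        return None
    d = hits[-1][0] + hits[-1][1] - hits[0][0]
    return d, len(hits), sum(length for _, length in hits)

def conditional_pmf(n, k, p_init, P):
    """Brute-force P(V=(d,m,s) | G >= 2) over all 2^n strings of a two-state Markov chain."""
    joint = defaultdict(float)
    for x in product((0, 1), repeat=n):
        prob = p_init[x[0]]
        for a, b in zip(x, x[1:]):
            prob *= P[a][b]
        stat = dgs(x, k)
        if stat is not None and stat[1] >= 2:
            joint[stat] += prob
    alpha = sum(joint.values())  # P(G >= 2)
    return {key: value / alpha for key, value in joint.items()}

def marginal_d(pmf, d):
    """Sum the conditional PMF over (m, s) for a fixed d."""
    return sum(value for (dd, _, _), value in pmf.items() if dd == d)

iid = conditional_pmf(8, 1, (0.5, 0.5), [[0.5, 0.5], [0.5, 0.5]])
hmc = conditional_pmf(8, 1, (0.5, 0.5), [[0.9, 0.1], [0.1, 0.9]])
print(iid[(3, 2, 2)], hmc[(3, 2, 2)], marginal_d(iid, 8))
```

Under these assumptions the enumeration should agree with the tables to the displayed precision; for instance, the IID entry for (d,m,s)=(3,2,2) with k=1 is 6/219≈0.02739726, since six strings contain the pattern 101 with 0s elsewhere and 219 of the 256 strings have at least two runs of 1s.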
Discussion and further study
In this article we have derived exact closed form expressions for the PMF v _{ n,k }(d,m,s), n≥3, 1≤k≤⌊(n−1)/2⌋, (d,m,s)∈Ω _{ n,k }, of the RV V _{ n,k } conditional on G _{ n,k }≥2, defined on a 0−1 sequence of homogeneous Markov-dependent trials. The method used is a combinatorial one, relying on results that exploit the internal structure of such a sequence.
As noted in the Introduction, the application domain of runs covers a diverse range of fields. Some indicative ones are discussed next.
Encoding, compression and transmission of digital information call for an understanding of the distributions of runs of 1s or 0s. Such knowledge helps in analyzing, and also in comparing, several techniques used in communication networks. In such networks 0−1 data ranging from a few kilobytes (e.g. e-mails) to many gigabytes of greedy multimedia applications (e.g. video on demand) are highly encoded, decoded and eventually processed under security constraints. For details, see e.g., Sinha and Sinha (2009), Makri and Psillakis (2011a) and Tabatabaei and Zivic (2015).
An area where the study of runs of 1s and 0s has become increasingly useful is the field of bioinformatics or computational biology. For instance, molecular biologists design similarity tests between two DNA sequences where a 1 is interpreted as a match of the sequences at a given position and everything else as a 0. Moreover, the probabilistic analysis of such sequences according to the form, the length and the number of detected patterns, as well as the positions and the lengths of the segments of the sequence in which they are concentrated, may suggest a functional reason for the internal structure of the examined sequence. These facts might be useful in suggesting a further investigation of the underlying sequence(s) by biologists. See, e.g. Avery and Henderson (1999), Benson (1999) and Nuel et al. (2010).
Another active area where run statistics, in particular G _{ n,k } and S _{ n,k }, have interesting statistical applications is that connected to hypothesis testing; e.g., in tests of randomness. For a systematic study of such a topic, we refer, among others, to the works of Koutras and Alexandrou (1997) and Antzoulakos et al. (2003).
Accordingly, it is reasonable for one to use the exact expressions obtained for v _{ n,k }(d,m,s) in applications like the ones mentioned above. This is so, because this distribution, as a joint one, is more flexible than each one of its marginals which have been used in such applications. See, e.g. Lou (2003), Makri and Psillakis (2011b) and Arapis et al. (2016).
Moreover, in handling 0−1 sequences of a large length, with dependent or independent elements, a Monte Carlo simulation based on Eqs. (1)-(4) would be a useful tool in obtaining approximate values for v _{ n,k }(d,m,s). In addition, the general approximating methods suggested by Johnson and Fu (2014) might be helpful in deriving approximate values for f _{ n,k }(d).
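A Monte Carlo approximation of this kind could be sketched as follows. This is only an illustration under stated assumptions: the chain is simulated step by step from a hypothetical initial vector `p_init` and transition matrix `P`, the statistics are computed directly from each simulated string (the paper's Eqs. (1)-(4) are not reproduced here), and the conditioning on G _{ n,k }≥2 is handled by discarding draws with fewer than two k-tuples. The function names and the seeded generator are choices made for this example.

```python
import random
from collections import Counter
from itertools import groupby

def simulate_chain(n, p_init, P, rng):
    """Draw X_1..X_n from a two-state homogeneous first-order Markov chain."""
    x = [1 if rng.random() < p_init[1] else 0]
    for _ in range(n - 1):
        x.append(1 if rng.random() < P[x[-1]][1] else 0)
    return x

def dgs(x, k):
    """(D, G, S) for runs of 1s of length >= k; None if there are none."""
    pos, hits = 1, []
    for symbol, group in groupby(x):
        length = len(list(group))
        if symbol == 1 and length >= k:
            hits.append((pos, length))
        pos += length
    if not hits:
        return None
    d = hits[-1][0] + hits[-1][1] - hits[0][0]
    return d, len(hits), sum(length for _, length in hits)

def monte_carlo_pmf(n, k, p_init, P, draws, seed=0):
    """Relative frequencies of (d, m, s) among simulated strings with G >= 2."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(draws):
        stat = dgs(simulate_chain(n, p_init, P, rng), k)
        if stat is not None and stat[1] >= 2:
            counts[stat] += 1
    accepted = sum(counts.values())
    return {key: c / accepted for key, c in counts.items()} if accepted else {}

estimate = monte_carlo_pmf(8, 1, (0.5, 0.5), [[0.5, 0.5], [0.5, 0.5]], draws=40000, seed=1)
f8 = sum(v for (d, _, _), v in estimate.items() if d == 8)
print(f8)
```

With n=8, k=1 and IID trials the estimate of f _{8,1}(8) can be compared with the exact value 0.28767123 from the first table; for long sequences, where P(G _{ n,k }≥2) may be small or the support large, the number of draws would have to be chosen accordingly.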
Declarations
Acknowledgements
The authors wish to thank the Editor for the thorough reading, and the anonymous reviewers for useful comments and suggestions which improved the article.
Authors’ contributions
The authors, ANA, FSM and ZMP with the consultation of each other carried out this work and drafted the manuscript together. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
- Antzoulakos, DL, Bersimis, S, Koutras, MV: On the distribution of the total number of run lengths. Ann. Inst. Statist. Math. 55, 865–884 (2003).
- Antzoulakos, DL, Chadjiconstantinidis, S: Distributions of numbers of success runs of fixed length in Markov dependent trials. Ann. Inst. Statist. Math. 53, 559–619 (2001).
- Arapis, AN, Makri, FS, Psillakis, ZM: On the length and the position of the minimum sequence containing all runs of ones in a Markovian binary sequence. Statist. Probab. Lett. 116, 45–54 (2016).
- Arapis, AN, Makri, FS, Psillakis, ZM: Distribution of statistics describing concentration of runs in non homogeneous Markov-dependent trials. Commun. Statist. Theor. Meth. (2017). doi:10.1080/03610926.2017.1337144.
- Avery, PJ, Henderson, D: Fitting Markov chain models to discrete state series such as DNA sequences. Appl. Statist. 48(Part 1), 53–61 (1999).
- Balakrishnan, N, Koutras, MV: Runs and Scans with Applications. Wiley, New York (2002).
- Benson, G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).
- Eryilmaz, S: Some results associated with the longest run statistic in a sequence of Markov dependent trials. Appl. Math. Comput. 175, 119–130 (2006).
- Eryilmaz, S: Discrete time shock models involving runs. Statist. Probab. Lett. 107, 93–100 (2015).
- Eryilmaz, S: Generalized waiting time distributions associated with runs. Metrika. 79, 357–368 (2016).
- Eryilmaz, S: The concept of weak exchangeability and its applications. Metrika. 80, 259–271 (2017).
- Eryilmaz, S, Yalcin, F: Distribution of run statistics in partially exchangeable processes. Metrika. 73, 293–304 (2011).
- Feller, W: An Introduction to Probability Theory and Its Applications. 3rd Ed., Vol. I. Wiley, New York (1968).
- Fu, JC, Lou, WYW: Distribution Theory of Runs and Patterns and Its Applications: A finite Markov chain imbedding approach. World Scientific, River Edge (2003).
- Johnson, BC, Fu, JC: Approximating the distributions of runs and patterns. J. Stat. Distrib. Appl. 1:5, 1–15 (2014).
- Koutras, MV: Applications of Markov chains to the distribution of runs and patterns. In: Shanbhag, DN, Rao, CR (eds.) Handbook of Statistics, pp. 431–472. Elsevier, North-Holland (2003).
- Koutras, MV, Alexandrou, V: Non-parametric randomness tests based on success runs of fixed length. Statist. Probab. Lett. 32, 393–404 (1997).
- Koutras, VM, Koutras, MV, Yalcin, F: A simple compound scan statistic useful for modeling insurance and risk management problems. Insur. Math. Econ. 69, 202–209 (2016).
- Lou, WYW: The exact distribution of the k-tuple statistic for sequence homology. Statist. Probab. Lett. 61, 51–59 (2003).
- Makri, FS, Philippou, AN, Psillakis, ZM: Success run statistics defined on an urn model. Adv. Appl. Prob. 39, 991–1019 (2007).
- Makri, FS, Psillakis, ZM: On success runs of a fixed length in Bernoulli sequences: Exact and asymptotic results. Comput. Math. Appl. 61, 761–772 (2011a).
- Makri, FS, Psillakis, ZM: On runs of length exceeding a threshold: normal approximation. Stat. Papers. 52, 531–551 (2011b).
- Makri, FS, Psillakis, ZM: On ℓ-overlapping runs of ones of length k in sequences of independent binary random variables. Commun. Statist. Theor. Meth. 44, 3865–3884 (2015).
- Makri, FS, Psillakis, ZM, Arapis, AN: Counting runs of ones with overlapping parts in binary strings ordered linearly and circularly. Intern. J. Statist. Probab. 2, 50–60 (2013).
- Makri, FS, Psillakis, ZM, Arapis, AN: Length of the minimum sequence containing repeats of success runs. Statist. Probab. Lett. 96, 28–37 (2015).
- Mood, AM: The distribution theory of runs. Ann. Math. Statist. 11, 367–392 (1940).
- Mytalas, GC, Zazanis, MA: Central limit theorem approximations for the number of runs in Markov-dependent binary sequences. J. Statist. Plann. Infer. 143, 321–333 (2013).
- Mytalas, GC, Zazanis, MA: Central limit theorem approximations for the number of runs in Markov-dependent multi-type sequences. Commun. Statist. Theor. Meth. 43, 1340–1350 (2014).
- Nuel, G, Regad, L, Martin, J, Camproux, A-C: Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data. Algorithm Mol. Biol. 5, 1–18 (2010).
- Riordan, AM: An Introduction to Combinatorial Analysis. Second Ed. John Wiley, New York (1964).
- Sinha, K, Sinha, BP: On the distribution of runs of ones in binary trials. Comput. Math. Appl. 58, 1816–1829 (2009).
- Tabatabaei, SAH, Zivic, N: A review of approximate message authentication codes. In: Zivic, N (ed.) Robust Image Authentication in the Presence of Noise, pp. 106–127. Springer International Publishing AG, Cham (ZG), Switzerland (2015).