Extinction in a branching process: why some of the fittest strategies cannot guarantee survival
© Sawaya and Klaere; licensee Springer. 2014
Received: 22 December 2013
Accepted: 6 May 2014
Published: 16 June 2014
Biological fitness is typically measured by the expected rate of reproduction, but strategies with high fitness can also have high probabilities of extinction. Likewise, gambling strategies with a high expected payoff can also have a high risk of ruin. We take inspiration from the gambler’s ruin problem to examine how extinction is related to population growth. Using moment theory we demonstrate how higher moments can impact the probability of extinction and how the first few moments can be used to find bounds on the extinction probability, focusing on s-convex ordering of random variables. This approach generates “best case” and “worst case” scenarios to provide upper and lower bounds on the probability of extinction.
92D15, 60J80, 60E15
Reproduction is necessary for the survival of populations. However, a population can have a high expected reproductive rate but nevertheless go extinct with near certainty (Lewontin and Cohen 1969). For example, populations with large variation in reproductive success can sometimes have a high probability of extinction, even if they have a high expected growth (Tuljapurkar and Orzack 1980).
Similarly, investors and gamblers can avoid Gambler’s Ruin through growth of capital. However, a gambler should not simply apply the strategy with the highest expected growth rate as it may also have a high risk of ruin. For example, investors can use the Kelly ratio (Kelly 1956) to maximize expected geometric growth of their capital but strict adherence to this ratio can be risky, and playing a more conservative strategy is often recommended (MacLean et al. 2010).
To estimate the probability of Gambler’s Ruin, one can use approximations based on moments (Ethier and Khoshnevisan 2002; Canjar 2007; Hürlimann 2005). Here we apply these approaches to estimate the probability of extinction in a branching process. The mathematics of Gambler’s Ruin is very similar to that of extinction in a branching process (Courtois et al. 2006). Both statistical models involve a random variable (payoff/offspring number), resulting in a random walk (change in capital/change in population size), and an absorbing state (ruin/extinction). Moreover, both processes are assumed to be Markovian, and finding the probability of ruin/extinction involves solving for the root of a convex function.
Here we examine the random variable representing the number of offspring, and investigate how the moments of this random variable are related to the probability of extinction. We demonstrate an important relationship between these moments and extinction: odd moments favor survival and even moments favor extinction. The first moment of the offspring distribution, its mean, has the biggest influence on extinction. However, the first moment alone is not usually informative about extinction probabilities. In fact, strategies with arbitrarily large first moments can nevertheless go extinct with near certainty. Some of the “fittest” strategies can be highly unlikely to survive.
Using the first few moments of the offspring distribution, one can obtain bounds on the ultimate probability of extinction (Courtois et al. 2006; Daley and Narayan 1980). These bounds provide “best case” and “worst case” distributions. We present these bounds, termed s-convex extremal random variables, adapted from actuarial science and research on the gambler’s ruin problem (Denuit and Lefevre 1997; Hürlimann 2005; Courtois et al. 2006). The extremal distributions for discrete processes have been developed previously, using up to four moments (Hürlimann 2005). Here we find the conditions under which these extremal distributions provide non-trivial bounds. Using some simple examples, we demonstrate how these methods can be used to rank distributions using only their moments. We then discuss how these bounds can be used to better understand the evolutionary process.
2.1 Extinction in the Galton-Watson branching process
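To investigate biological extinction, we use a Galton-Watson branching process in which, at each discrete time interval, every individual generates $i$ discrete offspring with probability $p_i$, and zero offspring with probability $p_0$. Without loss of generality we assume that an individual produces its offspring and then dies, so that each individual in a population is restricted to a single generation. The offspring number is a random variable, which we denote by $X$. Let $n$ be the maximum value of $X$, so that $X$ takes values in the state space
$$D_n = \{0, 1, \ldots, n\}.$$
Let $Z_t$ denote the population size in generation $t$, and let $q$ denote the probability of eventual extinction when $Z_0=1$. Because the lineages of different individuals evolve independently, a population founded by $Z_0$ individuals goes extinct with probability $q^{Z_0}$. So we can solve for extinction in the case of $Z_0=1$ and extend the results to larger starting populations if necessary.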
The recursive formula for finding $q$ can be found through a first step analysis (Kimmel and Axelrod 2002). The probability that the lineage of a single individual eventually goes extinct is the probability that it dies without offspring ($p_0$) plus the probability that it produces a single offspring whose lineage dies out ($p_1 q$) plus the probability that it produces two offspring whose joint lineages die out ($p_2 q^2$), and so on. Summing these terms gives
$$q = f(q) = \sum_{i=0}^{n} p_i q^i, \qquad (1)$$
where $f$ is the probability generating function of $X$.
The probability of extinction of a branching process starting with a single individual is the smallest root of the equation $f(q)=q$ for $q\in[0,1]$. The solution $q=1$ is always a root of (1) but is not necessarily the smallest positive root. In some cases, the probability of extinction is trivially obvious. For instance, if $p_0=0$, individuals always produce at least one offspring, and therefore $q=0$. Furthermore, cases where the mean offspring number satisfies $m_1\le 1$ (with $p_0>0$) always yield $q=1$ (Kimmel and Axelrod 2002).
Inferring the probability of extinction analytically for branching processes with $p_0>0$ and $m_1>1$ can be difficult because (1) has $n$ complex-valued roots according to the fundamental theorem of algebra. In the following we illustrate how (1) can be seen in terms of moments of the offspring distribution, and discuss how this approach can be used to estimate $q$.
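Although an analytic solution can be difficult, the smallest root of $f(q)=q$ is easy to approximate numerically. The following is a minimal sketch, not the authors' procedure: it iterates $q \leftarrow f(q)$ starting from $q=0$, which increases monotonically to the smallest fixed point in $[0,1]$. The function names `pgf` and `extinction_probability` and the toy distribution are illustrative assumptions.

```python
from math import comb

def pgf(probs, q):
    """Evaluate f(q) = sum_i p_i * q**i with Horner's rule."""
    total = 0.0
    for p in reversed(probs):
        total = total * q + p
    return total

def extinction_probability(probs, tol=1e-13, max_iter=100_000):
    """Iterate q <- f(q) from q = 0; the limit is the smallest root in [0, 1]."""
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(probs, q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q  # not converged within max_iter (slow near the critical case m1 = 1)

# Toy offspring distribution (hypothetical): p0 = 1/4, p1 = 1/4, p2 = 1/2.
# Here f(q) = q has the roots 1/2 and 1, so the extinction probability is 1/2.
toy = [0.25, 0.25, 0.5]

# Binomial offspring distribution B(20, 0.1), with mean m1 = 2.
n, p = 20, 0.1
binomial = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]

q_toy = extinction_probability(toy)
q_binomial = extinction_probability(binomial)
print(f"toy distribution: q = {q_toy:.6f}")
print(f"B(20, 0.1):       q = {q_binomial:.6f}")
```

Starting the iteration at zero matters: starting at $q=1$ would stay at the trivial root.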
2.2 Moments of the branching process
Let $m_k = E[X^k] = \sum_{i=0}^{n} i^k p_i$ denote the $k$th moment of the branching process generator $X$. The first moment, $m_1$, is equivalent to the average offspring number. Higher moments can be used to obtain other summary statistics of the distribution, such as the variance $\sigma^2 = m_2 - m_1^2$.
for s≥3 are only accurate when q is large and the moments are small. As q ↓ 0, the series requires more and more terms to provide an accurate approximation. Therefore, when q is small, the first few moments are not necessarily informative about the probability of extinction.
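The behaviour described here can be checked numerically. The sketch below assumes the series in question is the standard expansion $f(q) = E[e^{X \ln q}] = \sum_{k\ge 0} (\ln q)^k m_k / k!$, which holds for any bounded $X$; since $\ln q < 0$ on $(0,1)$, odd-moment terms lower $f(q)$ and even-moment terms raise it. The helper names are illustrative, and the binomial $B_{20,0.1}$ is used only as an example.

```python
from math import comb, factorial, log

def raw_moment(probs, k):
    """k-th raw moment m_k = sum_i i**k * p_i."""
    return sum(p * i**k for i, p in enumerate(probs))

def pgf(probs, q):
    """Probability generating function f(q) = sum_i p_i * q**i."""
    return sum(p * q**i for i, p in enumerate(probs))

def truncated_moment_series(probs, q, order):
    """Partial sum of (ln q)**k * m_k / k! for k = 0..order."""
    t = log(q)
    return sum(t**k * raw_moment(probs, k) / factorial(k)
               for k in range(order + 1))

n, p = 20, 0.1
binomial = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]

m1 = raw_moment(binomial, 1)
variance = raw_moment(binomial, 2) - m1**2

# Compare the exact pgf with the series truncated after the third moment.
near_exact = pgf(binomial, 0.9)
near_approx = truncated_moment_series(binomial, 0.9, 3)
far_exact = pgf(binomial, 0.1)
far_approx = truncated_moment_series(binomial, 0.1, 3)

print(f"m1 = {m1:.3f}, variance = {variance:.3f}")
print(f"q = 0.9: exact {near_exact:.5f}, three-moment series {near_approx:.5f}")
print(f"q = 0.1: exact {far_exact:.5f}, three-moment series {far_approx:.5f}")
```

Near $q=1$ the three-moment truncation tracks the generating function closely, while at small $q$ it is far off, in line with the statement above.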
2.3 s-Convex orderings of random variables
This provides an upper limit on extinction because this generating function will be greater than or equal to the generating function for all other random variables with the same m1 and n, on q∈ [ 0,1].
This extremal random variable represents a best case scenario. However, since m1>1, α must be larger than zero, so this branching process has no chance of death (i.e., p0=0) and consequently no chance of extinction (q=0). Therefore, this best-case extremal distribution does not provide a useful bound on the probability of extinction, as the bound q≥0 is obvious.
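The one-moment bounds can be illustrated with a short sketch. It assumes, from the convexity of $q^x$ in $x$, that with only $m_1$ and $n$ known the worst case places all mass on $\{0, n\}$ and the best case places all mass on the two integers adjacent to $m_1$, namely $\alpha = \lfloor m_1 \rfloor$ and $\alpha+1$. The function names are hypothetical, and the binomial $B_{20,0.1}$ serves only as a comparison distribution.

```python
from math import comb, floor

def pgf(probs, q):
    """Probability generating function f(q) = sum_i p_i * q**i."""
    return sum(p * q**i for i, p in enumerate(probs))

def extinction_probability(probs, tol=1e-13, max_iter=100_000):
    """Smallest root of f(q) = q, found by iterating q <- f(q) from 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(probs, q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

def worst_case_one_moment(m1, n):
    """All mass on {0, n}: p_n = m1 / n and p_0 = 1 - m1 / n."""
    probs = [0.0] * (n + 1)
    probs[0] = 1.0 - m1 / n
    probs[n] = m1 / n
    return probs

def best_case_one_moment(m1, n):
    """All mass on {alpha, alpha + 1} with alpha = floor(m1)."""
    alpha = floor(m1)
    probs = [0.0] * (n + 1)
    probs[alpha] = alpha + 1 - m1
    if m1 > alpha:
        probs[alpha + 1] = m1 - alpha
    return probs

n, p = 20, 0.1
binomial = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
m1 = n * p

worst = worst_case_one_moment(m1, n)
best = best_case_one_moment(m1, n)
q_exact = extinction_probability(binomial)
q_worst = extinction_probability(worst)
q_best = extinction_probability(best)
print(f"best case q = {q_best:.4f}, exact q = {q_exact:.4f}, worst case q = {q_worst:.4f}")
```

With $m_1 = 2$ the best case has $p_0 = 0$, so its lower bound is the trivial $q \ge 0$, while the worst case gives an upper bound well above the exact value.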
This bound and all other bounds examined here can be found using discrete Chebyshev systems (Denuit and Lefevre 1997). However, extremal bounds are perhaps more intuitive for continuous random variables, to which the discrete cases can be seen as similar (Shaked and Shanthikumar 2007; Hürlimann 2005; Denuit et al. 1999). For example, in the continuous case the minimal extremal random variable has only one possible value, $m_1$, taken with probability one. By (2) this is clearly an extremum because $(m_i)^{(i+1)/i} = m_{i+1} = (m_1)^{i+1}$. In comparison, the discrete case (3) has similar properties.
The reader should recognize this notation as it is simply the iterative forward difference operator Δ k for moments.
This extremal distribution provides an upper bound on the probability of extinction, so that (4) has greater values at any q∈[0,1) than the probability generating function of any other random variable in the same moment space.
In this case, successive moments simply grow by $m_2/m_1$, so that $m_{i+1} = m_i (m_2/m_1)$, providing a minimum on the two-moment space. And, as was the case for the minimum on the one-moment space, the discrete minimum extremum on the two-moment space has similar properties to the continuous minimum extremum.
For both the one-moment and two-moment spaces, the discrete cases are simply discretizations of the continuous case. However, this is not necessarily the case for higher moment spaces (Courtois et al. 2006). While the continuous cases provide more intuitive extrema, derivation of the discrete case for higher moments is not as simple as deriving the continuous case and discretizing.
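As an illustration of the two-moment worst case, the sketch below assumes that the discrete extremal distribution has support $\{0, \alpha, \alpha+1\}$ with $\alpha = \lfloor m_2/m_1 \rfloor$, the discretized counterpart of the continuous support $\{0, m_2/m_1\}$, with probabilities fixed by matching $m_1$ and $m_2$. This form is taken as an assumption here, and the function names are hypothetical. The moments are those of the binomial $B_{20,0.1}$ ($m_1 = 2$, $m_2 = 5.8$).

```python
from fractions import Fraction
from math import floor

def worst_case_two_moments(m1, m2):
    """Distribution on {0, a, a + 1}, a = floor(m2 / m1), matching m1 and m2."""
    a = floor(m2 / m1)  # a >= 1 because m2 >= m1 for integer-valued offspring
    p_a = ((a + 1) * m1 - m2) / a
    p_a1 = (m2 - a * m1) / (a + 1)
    return {0: 1 - p_a - p_a1, a: p_a, a + 1: p_a1}

def pgf_from_support(dist, q):
    """Generating function of a distribution stored as {value: probability}."""
    return sum(float(prob) * q**x for x, prob in dist.items())

def smallest_fixed_point(f, tol=1e-13, max_iter=100_000):
    """Iterate q <- f(q) from 0 to reach the smallest root of f(q) = q."""
    q = 0.0
    for _ in range(max_iter):
        q_next = f(q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

m1, m2 = Fraction(2), Fraction(29, 5)  # moments of B(20, 0.1)
dist = worst_case_two_moments(m1, m2)
q_worst = smallest_fixed_point(lambda q: pgf_from_support(dist, q))

def binomial_pgf(q, n=20, p=0.1):
    """Closed-form generating function of B(n, p)."""
    return (1 - p + p * q) ** n

q_exact = smallest_fixed_point(binomial_pgf)
print(dist)
print(f"two-moment worst case q = {q_worst:.4f}, exact q = {q_exact:.4f}")
```

For these moments the worst case sits on $\{0, 2, 3\}$ and its generating function factors as $(q-1)(0.6q^2 + 0.7q - 0.3)$, so the upper bound is $q = 1/3$, far tighter than the bound obtained from the mean alone.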
Since this extremal distribution can only provide non-trivial information about $q$ if $p_0>0$, it is only informative about extinction when $\alpha=0$ and $p_\alpha>0$, which is the case whenever $n m_1 - m_2 < n - m_1$. Although this requirement may appear restrictive, some classes of distributions have simple rules under which the bound is informative. For example, for binomial distributions $B_{n,p}$ we have $n m_1 - m_2 = n(n-1)p(1-p)$ and $n - m_1 = n(1-p)$, so the extremal distribution will provide a non-zero lower bound if $1/n < p < 1/(n-1)$.
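The condition $n m_1 - m_2 < n - m_1$ and the binomial rule can be checked with exact arithmetic. The sketch below is only an illustration with hypothetical function names; it uses the standard binomial moments $m_1 = np$ and $m_2 = np(1-p) + (np)^2$.

```python
from fractions import Fraction

def binomial_moments(n, p):
    """First two raw moments of B(n, p)."""
    m1 = n * p
    m2 = n * p * (1 - p) + (n * p) ** 2
    return m1, m2

def lower_bound_is_informative(n, m1, m2):
    """True when n*m1 - m2 < n - m1, the condition stated in the text."""
    return n * m1 - m2 < n - m1

n = 20
for p in (Fraction(51, 1000), Fraction(1, 10)):
    m1, m2 = binomial_moments(n, p)
    print(f"p = {float(p):.3f}: informative = {lower_bound_is_informative(n, m1, m2)}")
```

For $n = 20$ the window is narrow, $0.05 < p < 1/19 \approx 0.0526$, so the binomial example with $p = 0.1$ falls outside it.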
and furthermore, if $n \to \infty$ then $p_n \to 0$. Therefore, the resulting generating function for this extremal distribution is identical to (4) if the maximal value is unknown. So, like the first moment, the third moment is uninformative about extinction when n is unknown, unless assumptions are made about the distribution (see, e.g., Daley and Narayan 1980; Ethier and Khoshnevisan 2002).
And if , then the bound is useful because the resulting has p0>0. Alternatively, if the supports for have p0=0 and consequently q=0.
Courtois et al. (2006) explained that there is no analytic form to directly obtain α and β for this extremal distribution. They showed this by disproving the intuitive idea that the discrete support encloses the continuous support. To find α and β, we iteratively search all possible supports on $D_n$ until both inequalities are satisfied. This exhaustive method for finding the supports of this extremum is not ideal, especially if $D_n$ is dense. Linear programming can be used to easily find the extremal supports and their probabilities (Prékopa 1990), but such approaches are not necessary when $D_n$ is sparse (e.g., when n is relatively small).
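For small $n$, an exhaustive search can be sketched in a few lines. The version below is a swapped-in technique, not the authors' search over inequalities: it relies on the linear programming fact that, with $k$ equality constraints (total probability plus $k-1$ moments), an optimum is attained by a distribution with at most $k$ support points, so it enumerates every $k$-point support on $D_n$, solves for the probabilities exactly, and keeps the feasible candidate with the largest or smallest value of the generating function at a fixed evaluation point `q0`. The names `extremal_by_enumeration`, `solve_linear` and `q0` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve_linear(A, b):
    """Gauss-Jordan elimination with exact Fractions; None if singular."""
    size = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(size):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][size] for r in range(size)]

def extremal_by_enumeration(n, moments, q0, worst=True):
    """Search all supports of size len(moments) + 1 on {0, ..., n}.

    Returns (f(q0), support, probabilities) for the feasible distribution
    that maximises (worst=True) or minimises (worst=False) f(q0).
    """
    k = len(moments) + 1
    targets = [Fraction(1)] + [Fraction(m) for m in moments]
    best = None
    for support in combinations(range(n + 1), k):
        # Row j enforces sum_x x**j * p_x = m_j (with m_0 = 1).
        A = [[Fraction(x) ** j for x in support] for j in range(k)]
        probs = solve_linear(A, targets)
        if probs is None or any(pr < 0 for pr in probs):
            continue
        value = sum(pr * q0**x for pr, x in zip(probs, support))
        if best is None or (value > best[0] if worst else value < best[0]):
            best = (value, support, probs)
    return best

# Two known moments of B(20, 0.1): m1 = 2, m2 = 29/5.
moments = [Fraction(2), Fraction(29, 5)]
_, worst_support, worst_probs = extremal_by_enumeration(20, moments, Fraction(1, 2))
_, best_support, best_probs = extremal_by_enumeration(20, moments, Fraction(1, 2), worst=False)
print("worst case:", worst_support, worst_probs)
print("best case: ", best_support, best_probs)
```

The number of candidate supports grows as $\binom{n+1}{k}$, which is why a linear programming solver is preferable when $D_n$ is dense. Passing three moments to the same function searches four-point supports.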
If the resulting in the inequality holds, the bound for is informative.
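All extrema rely on the maximum offspring number, n. As in the previous case, when n is unknown or infinite this maximal extremal distribution goes to the minimum on the lower moment space. Thus, if n is unknown, the maximum on a moment space goes to the minimum on the next lower moment space, at least for the cases examined here.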
The Chebyshev approach can be extended to higher moments (Hürlimann 2005). However, moments above the fourth are rarely used, and higher moments can be difficult to estimate from small samples. Further, the equations for the supports and probabilities for moments above the fourth become increasingly complex.
3 Results and discussion
Here we discuss some example distributions, graph their generating functions, and also graph generating functions for the extremal distributions. The plot of the probability generating function, f(q), on q∈(0,1) is a useful way to visualize how the moments are related to extinction. The probability generating function takes the value p0 at q=0. At small q, f(q) has a slope of approximately p1. In this part of the function, when q is small, there can be a weak relationship between f(q) and moments. In comparison, when q is close to 1, the moments are closely related to f(q). For example f′(1)=m1. Higher moments begin to influence the function as q moves away from 1.
The probability of extinction of a process is found when f(q)=q, i.e., at the intersection between its probability generating function f(q) and the diagonal q. Thus, processes with a high probability of extinction will cross the diagonal near q=1, in the domain of q in which the probability generating function is often closely related to its first few moments.
Importantly, these examples demonstrate why higher moments are often necessary to compare strategies. These two distributions have identical first moments (m1=2) so classically their fitness value would be equal. However, the binomial example is more likely to survive. If entire distributions are known, then extinction probabilities can be calculated explicitly using (1). This requires solving a polynomial of degree 20 for the examples shown here, which were solved in R (R Development Core Team 2011) with the package “rootSolve” (Soetaert and Herman 2009).
Table: Extinction probabilities and supports for the extremal distributions of the binomial example $B_{20,0.1}$ (rows: lower bound (best), upper bound (worst)).
Table: Extinction probabilities and supports for the extremal distributions of the truncated geometric example (rows: lower bound (best), upper bound (worst)).
And finally, these examples can be used to better understand how ranking distributions using their s-convex extrema can be useful in investing and gambling (Canjar 2007; Courtois et al. 2006; Denuit and Lefevre 1997; Ethier and Khoshnevisan 2002; Hürlimann 2005). If these distributions were returns on an investment or gamble, then by comparing their moments an investor could determine that the binomial distribution is a superior investment model. Both distributions would provide the same expected growth on capital, but the geometric distribution would have a higher probability of gambler’s ruin. Being wary of gambler’s ruin is especially important for an investor with limited initial funds for their investment.
The work here is intended to highlight the relationship between the moments of the offspring distribution and the probability of extinction. Extinction can be defined in terms of moments, but the first few moments are only informative about extinction under certain conditions. Nevertheless, for all offspring distributions there exists an interesting relationship with even and odd moments: high even moments favor extinction, high odd moments favor survival. This relationship between even and odd moments is also seen in the stochastic Price equation, where relative growth rates increase with increasing odd moments, and decrease with increasing even moments (Rice 2008).
The relationship between moments and extinction can provide insight into the evolutionary process. A high first moment can favor survival, but worst case extrema (“long shots”) represent the strategies that are least likely to survive. Strategies with a relatively low second moment (low variance) will always have a lower probability of extinction than their corresponding “long shot” extrema. When two moments are known, the worst case distributions have the lowest third moment (strongest left skew). Therefore, strategies with identical first and second moments and relatively high third moments (strong right skew) will always have a better chance at survival than the extrema with the lowest third moment. Worst case extrema using three moments have the highest possible fourth moment (excessive kurtosis). The relative importance of higher moments depends on the distribution, and in some cases higher moments can have a big influence on extinction.
Strategies with a high probability of extinction are unlikely to be found in natural populations, even if their expected reproductive rate is high (Tuljapurkar and Orzack 1980). New alleles will often arrive in a population as a singlet, and extinction is permanent unless the same mutation occurs more than once. In such cases, survival is more important than the average rate of reproduction. Using moments of the offspring distribution one can find bounds on extinction using their s-convex extrema. If the best case extremum for a set of moments has a high probability of extinction, then strategies with these moments will be evolutionarily unlikely, regardless of how fit these strategies would be if they avoided extinction.
Gamblers can avoid strategies with a high risk of ruin by calculating their odds. In natural populations, such calculations are not required to prevent the occurrence of high risk strategies. Instead, risky strategies will be naturally unlikely, especially considering that many arrive as a single allele with one chance at survival. Similarly, gamblers and investors who begin with limited funds and choose risky strategies are likely to “go extinct” through gambler’s ruin. Risk is not solely determined by mean growth, and strategies with a high mean can sometimes have high risk. Unfortunately, these high risk and high reward strategies are unlikely to return anything without sufficient investment, so natural avoidance of risk can result in missed opportunity for growth.
The authors would like to thank Ninh Anh for a fruitful discussion on calculating the supports and probabilities for the extremal distributions, and the editor and two anonymous referees for their many constructive comments.
- Canjar RM: Gambler’s Ruin revisited: The effects of skew and large jackpots. In Optimal Play: Mathematical Studies of Games and Gambling. Institute for the Study of Gambling and Commercial Gaming. Edited by: Ethier S, Eadington W. Reno: University of Nevada; 2007.
- Courtois C, Denuit M, van Bellegem S: Discrete s-convex extremal distributions: theory and applications. Appl. Math. Lett 2006, 19: 1367–1377.
- Daley DJ, Narayan P: Series expansions of probability generating functions and bounds for the extinction probability of a branching process. J. Appl. Probab 1980, 17: 939–947.
- Denuit M, Lefevre C: Some new classes of stochastic order relations among arithmetic random variables, with applications in actuarial sciences. Insur. Math. Econ 1997, 20: 197–213.
- Denuit M, de Vylder E, Lefevre C: Extremal generators and extremal distributions for the continuous s-convex stochastic orderings. Insur. Math. Econ 1999, 24: 201–217.
- Denuit M, Lefevre C, Mesfioui M: On s-convex stochastic extrema for arithmetic risks. Insur. Math. Econ 1999, 25: 143–155.
- Ethier SN, Khoshnevisan D: Bounds on gambler’s ruin probabilities in terms of moments. Methodol. Comput. Appl. Probab 2002, 4: 55–68. 10.1023/A:1015705430513
- Hürlimann W: Improved analytical bounds for gambler’s ruin probabilities. Methodol. Comput. Appl. Probab 2005, 7: 79–95. 10.1007/s11009-005-6656-4
- Karlin S, McGregor JL: The differential equations of birth-and-death processes, and the Stieltjes moment problem. T. Am. Math. Soc 1957, 85: 489–546.
- Kelly J: A new interpretation of information rate. Bell Sys. Tech. J 1956, 35: 917–926.
- Kimmel M, Axelrod DE: Branching Processes in Biology. Springer, New York; 2002.
- Lewontin RC, Cohen D: On population growth in a randomly varying environment. P. Natl. Acad. Sci. USA 1969, 62: 1056–1060.
- MacLean LC, Thorp EO, Ziemba WT: Good and bad properties of the Kelly criterion. In The Kelly Capital Growth Investment Criterion: Theory and Practice. Singapore: World Scientific Publishing; 2010.
- Prékopa A: The discrete moment problem and linear programming. Discrete Appl. Math 1990, 27: 235–254.
- R Development Core Team: R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2011. http://www.R-project.org
- Rice SH: A stochastic version of the Price equation reveals the interplay of deterministic and stochastic processes in evolution. BMC Evol. Biol 2008, 8: 262.
- Shaked M, Shanthikumar JG: Stochastic Orders. Springer, New York; 2007.
- Soetaert K, Herman PMJ: A practical guide to ecological modelling. Using R as a simulation platform. Springer, New York; 2009.
- Tuljapurkar S, Orzack SH: Population dynamics in variable environments I. Long-run growth rates and extinction. Theor. Popul. Biol 1980, 18: 314–342.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.