High quantile regression for extreme events

For extreme events, estimation of high conditional quantiles for heavy tailed distributions is an important problem. Quantile regression is a useful method in this field with many applications. Quantile regression uses an L 1-loss function, and an optimal solution by means of linear programming. In this paper, we propose a weighted quantile regression method. Monte Carlo simulations are performed to compare the proposed method with existing methods for estimating high conditional quantiles. We also investigate two real-world examples by using the proposed weighted method. The Monte Carlo simulation and two real-world examples show the proposed method is an improvement of the existing method.


Introduction
Extreme value events are highly unusual events that can cause severe harm to people and costly damage to the environment. Examples of such harmful events are stock market crashes, equity risks, pipeline failures, large flooding, wildfires, pollution and hurricanes. The response variable, y, of an extreme event is usually distributed according to a heavytailed distribution. It is important to estimate high conditional quantiles of a random variable y given a variable vector x = (1, x 1 , x 2 , . . . , x k ) T ∈ R p and p = k + 1.
The traditional mean linear regression is concerned with the estimation of the conditional expectation E(y|x) (Yu et al. 2003). The mean linear regression model assumes We estimate β = (β 0 , β 1 , . . . , β k ) T ∈ R p from a random sample (y i , x i ), i = 1, . . . , n , where x i is the p-dimensional design vector and y i is the univariate response variable from a continuous distribution. The least squares (LS) estimator β LS is a solution to the following equation The mean linear regression provides the mean relationship between a response variable and explanatory variables (Yu et al. 2003). However, there are limitations present in the conditional mean models. Outliers significantly affect the conditional mean models and as a result, it affects the measurement of the central location, which may be misleading. Also, when analyzing extreme value events, where the response variable y has a heavy-tailed distribution, the mean linear regression cannot be extended to non-central locations (Hao and Naiman 2007). Therefore, it cannot provide insightful information for extreme events. Quantile regression offers a more complete statistical model by specifying the changes in the high conditional quantiles and it will be used to estimate values of extreme events (Yu et al. 2003;Hao and Naiman 2007). We will study two real world examples in the following sections.

Snowfall in Buffalo (1994-2015)
Large snowstorms can be very hazardous to people's safety, communities and their properties. They can significantly reduce visibility in an area, which makes it very dangerous for densely populated areas where major car accidents can happen on the road or accidents while flying can occur. A significant amount of snow, such as 12 inches (30 cm) or more, can cave in roofs of homes and buildings, standing trees can fall down on homes and cause the loss of electricity. There have been cases of deaths due to hypothermia, infections brought on by frostbite, car accidents caused due to slippery roads, heart attacks by overexertion while shoveling heavy snow and carbon monoxide poisoning from a power outage.
In 2006, Lake Storm "Aphid" was a lake-effect snowstorm that hit Buffalo, New York with a maximum snowfall of 24 inches (61 cm) and caused 19 fatalities. The snowstorm cost an estimated $530 million in damages. Recently, in November 2014, severe lake-effect snowstorm heavily impacted areas in and around Buffalo with snowfall ranging from 5-7 feet (1.5-2.1 m) and the maximum snowfall of the storms was 88 inches (220 cm). There were records of 14 heart attacks due to overexertion and roofs collapsing due to the sheer weight of the snow. The data set was obtained from National Weather Service Forecast Office (2017) (Full data is available at http://www.weather.gov/buf ) and the daily snowfall was recorded in inches for 4478 days from January 1994 -January 2015. The snowfall was converted into centimeters and a threshold of 5 cm was considered since snowfall under 5 cm is very unlikely to cause extreme damage. As a result, there are n = 316 recorded data remaining. During January 1994 -January 2015, the top 10 largest daily snowfalls and maximum temperature in Buffalo, New York are shown in Table 1.
In Fig. 1(a), the y-axis represents the total daily snowfall (cm) and the x-axis represents the snowfall in the order of occurrence. The maximum daily snowfall occurred on December 10, 1995 with over 86.1 cm while the average daily snowfall was 11.96 cm. Figure 1( b) shows the scatter diagram of daily snowfalls greater or equal to 5 cm. It appears that it can be modeled with a quadratic polynomial relating snowfall to maximum temperature. The least squares method was performed on a polynomial mean regression model where y represents the total snowfall (cm) and x represents the maximum temperature ( • C). The red curve represents the least squares (LS) curve μ LS = μ y|x = 8.9879 − 0.2144x + 0.0040x 2 which was obtained by using (1) and the model (2) to estimate the mean of daily snowfall y for a given maximum temperature x. But, the least squares curve does not provide information about extreme heavy snowfalls that may cause damage. The quantile regression method will be able to estimate the high conditional quantiles. We will discuss this example further in Section 5.

CO 2 Emission
Climate change is considered to be one of the most important environmental issues as it is transforming life on Earth. It affects all aspects of our natural environment including the air and water quality, health and conservation of species at risk. It has been observed that temperatures and sea levels are rising, there are stronger storms and increased damage, and increased risk of drought, fire and floods. Climate change will rapidly alter the lands and waters that we depend upon for survival and we will no longer be able to preserve our environment for our social and economic well-being. Natural processes and human activities can cause climate change. The recent global warming can be largely attributed to the carbon dioxide (CO 2 ) and other greenhouse gas emissions. It was found that in 2009, CO 2 accounted for 82% of all European greenhouse gas emissions and about 94% of the CO 2 released to the atmosphere were from combusting fossil fuels (European Environment Agency (2017) at http://www.eea.europa.eu).  Figure 2(a) shows CO 2 emissions increases between 1950 and 2010; these increases are related with the increased energy use by an expanding economy, population and overall growth in emissions from electricity generation. It is important to estimate high conditional quantiles of the distribution of CO 2 emission in order to prevent acceleration of climate change.
In this paper, we use the 2010 data from the Carbon Dioxide Information Analysis Center (2017) at http://cdiac.ornl.gov for 181 countries. The CO 2 emission per capita was recorded in metric tonnes. There are n = 123 countries remaining after the threshold of 1 tonne was applied. The threshold of 1 tonne of CO 2 emission was considered since values higher than 1 tonne of CO 2 emission would exceed the maximum allowance to emit without harming the climate. Table 2 lists the top 10 countries with the largest CO 2 emissions and their gross domestic product (GDP) and electricity consumption (E.C.) per capita. In Fig. 2(b), the y-axis represents the CO 2 emission per capita (tonnes) and x-axis represents the CO 2 emission ordered by country. It can be observed in Fig. 2(b) that Qatar produced the highest CO 2 emission per capita of 40 tonnes and Trinidad and Tobago and Kuwait produced the second and third highest CO 2 emissions of 38 tonnes and 31 tonnes respectively. As well, several countries emitted less than 10 tonnes of CO 2 in 2010.
We set the CO 2 emission per capita (y) (tonnes), ln(GDP) per capita (x 1 ) (US $) and ln(E.C.) per capita (x 2 ) (kilowatts) by using log-transformation. The 3D scatter diagram  in Fig. 3 appears that it can be modeled using mean linear regression model (3) for the CO 2 emission per capita related with the ln(GDP) and ln(E.C.). For simplification, we do not put "per capita" for these three variables in the following text.
Figure 4(a) and (b) are 2D scatter plots. Figure 4(a) shows the relationship between ln(GDP) and CO 2 emission per capita μ y|x 1 = −21.0984 + 3.0255x 1 when the E.C. per capita is 2980.96 kilowatts. Figure 4(b) demonstrates the relationship between ln(E.C.) and CO 2 emission per capita μ y|x 2 = −10.2255 + 2.1830x 2 when the GDP per capita is $13,359.73. The least squares mean regression curves in Fig. 4 were obtained by using (1) and the model (3).
Since the mean regression provides only the mean relationship between CO 2 emission per capita and GDP or E.C., it cannot provide estimation for high conditional quantiles of CO 2 emission. But the quantile regression method can estimate high CO 2 emission quantile curves. We will discuss this example further in Section 5. Fig. 4 a Scatter plot and LS mean regression μ y|x1 (red) of the CO 2 emission per capita related to ln(GDP) x 1 when the E.C. is 2980 kilowatts. b Scatter plot and LS mean regression μ y|x2 (red) of the CO 2 emission per capita related to ln(E.C.) x 2 when GDP is $13,359.73

Main methods and results
Quantile regression is an important model with applications in many fields. At first, quantile regression provides the estimates of the conditional quantiles, which are difficult to capture by a mean regression. Second, it is also more robust against outliers in the response measurements. The objective of this paper is to study and explore a new weighted quantile regression in order to improve the existing methods. In this paper, we will do three studies: 1. The theoretical approach will be investigated. 2. Monte Carlo simulations will be performed to show the efficiency of the new weighted method relative to the existing methods. 3. The new proposed method will be applied to real-world examples on extreme events and compared to mean regression and classical quantile regression.
In Section 2, we review some notation. In Section 3, we propose an optimal weighted quantile regression method, and give its good asymptotic properties for any uniformly bounded positive weight independent of response variable y, with conditional density as the weight. In Section 4, the results of Monte Carlo simulations generated from the bivariate Pareto distribution show that the proposed weighted method produces high efficiencies relative to existing methods. In Section 5, the three regression methods: mean regression, classic quantile regression and proposed weighted quantile regression, are applied to the real-life examples: the Buffalo snowfall (Subsection 1.1) and CO 2 emission (Subsection 1.2). Three goodness-of-fit tests are used to assess the distributions of the data. Studies of the examples illustrate that the proposed weighted quantile regression model fits better to the datasets than the existing quantile regression method. Pickands (1975) first introduced the Generalized Pareto Distribution (GPD). (Also see de Haan and Ferreira 2006).

Definition 2.1
The cumulative distribution function (c.d.f.) F(x) and its corresponding probability density function (p.d.f.) f (x) of the two-parameter GPD(γ , σ ) with shape parameter γ > 0 and scale parameter σ of a random variable X are given by Koenker and Bassett (1978) proposed a L 1 −loss function to obtain estimator β(τ ) by solving

Definition 2.3 The τ th conditional linear quantile regression of y for given
where ρ τ is a loss function Quantile regression problem can be formulated as a linear program where 1 T n is an n-vector of 1s, X denotes the n×p design matrix, and u, v are n × 1 vectors with elements of u i , v i respectively (Koenker 2005). Huang et al. (2015) proposed a weighted quantile regression method

Proposed weighted quantile regression
where w i (x i , τ ) is any uniformly bounded positive weight function independent of y i , i = 1, . . . , n, for In this paper, we propose a weight as the local conditional density f (y|x) of y for given x at the τ th quantile point ξ i (τ , x i ), which is where f i (ξ i (τ , x i )) is uniformly bounded at the quantile points ξ i (τ , , x i ).
The following are the four reasons for the proposed weight in (7): (1) Koenker (2005, Chapter 5, Subsection 5.3) suggested that when the conditional densities of the response are heterogeneous, it is natural to consider whether weighted quantile regression might lead to efficiency improvements. (6) is an absolute error measure between y i and the τ th conditional quantile ξ i (τ , x i ) at the i th sample point can be interpreted as providing the local relative likelihood that the response varaibale y takes values in a neighborhood of The weighted estimator β w (τ ) in (6) using weight (7) has good properties, which we will discuss in Subsection 3.2 below. (4) Is weight (7) an optimal weight? It is a difficult problem in the field as described in Chapter 5, Subsection 5.3 in Koenker (2005). Also, f i (ξ i (τ , x i )) is difficult to estimate. In this paper, we explore these two difficulties, we estimate the f i (ξ i (τ , x i )), then obtain positive results by using weight (7).
In Section 4, in simulations, we compare using weight (7) with the weight given in Huang et al. (2014) as where In this paper, we are looking for improvement of efficiency by using weights (7) in Section 4 simulations, and applications of the Buffalo snowfall and CO 2 emission examples in Section 5.

Properties of weighted quantile regression
The following regularity conditions are necessary in deriving the asymptotic distribution of β n(w) (τ ) in (6)

Theorem 3.1 Under Conditions C1 and C2, we have
The proof of Theorem 3.1 is similar as the proof has been provided in Huang et al. (2015).

Comparison of quantile regression models
In order to compare the regular and weighted quantile regression models in (5) and (6), we extend the idea of measure of goodness of fit by Koenker and Machado (1999), and suggest the use of a Relative R(τ ), which is defined as where V regular (τ ) = where β(τ ) is obtained by (5).
where w i and β w (τ ) are given by (6).

Simulations
In this Section, Monte Carlo simulations are performed. We generate m random samples with size n each from the bivariate Pareto distribution (Mardia 1962) with c.d.f.
It reveals that the proposed Q W (f ) (τ |x) is unbiased and produces more accurate β W (f )0 (τ ) and β W (f )1 (τ ) estimators to the true β 0 and β 1 for τ = 0.95 and 0.97. As well, the variances of Q W (f ) (τ |x) are smaller relative to Q R (τ |x) for τ = 0.95 and 0.97.

Real examples of applications
In this section, we applied the following three regression models to the Buffalo snowfall and CO 2 emission examples in Section 1: (1) ; 2. The regular quantile regression Q R estimator β(τ ) in (5) ; 3. The proposed weight quantile regression Q W estimator β W (τ ) in (6) with weight (7) .  (7), we use kernel density estimation (Silverman 1986;Scott 1992).

The traditional mean linear regression (LS) estimator β LS in
where f (y, x) is an estimator of the joint density of y and x, and μ(x) is an estimator of marginal density of x. We estimate the conditional quantile function ξ(τ |x) by inverting an estimated conditional c.d.f. F(y|x) (Li and Racine 2007) Note that for a one-dimensional random sample X 1 , X 2 , . . . , X n from the distribution μ(x), a kernel density estimator for μ(x) is given by  Table 5 The goodness of fit tests for Buffalo snowfall example where h is the window width, and K(x) is the kernel function, which is a symmetric probability density function with the conditions The optimal window width can be found by The d-dimensional multivariate kernel density estimator is defined by where h is the window width and the kernel function where S is the sample covariance matrix of the data, K is the normal kernel and the function k is given by An estimator for the optimal window width h will be given by where A(K) = {4/(d + 2)} 1/(d+4) is the constant for a multivariate normal kernel.

Buffalo snowfall example
Now, recall the Buffalo snowfall example in Subsection 1.1. We use a polynomial mean regression model (2) E y|x, where y is the daily snowfall (cm) and x is the maximum temperature ( • C). But the least squares curve only estimates average daily snowfall for a given maximum temperature; it cannot estimate extreme heavy snowfalls. The quantile regression method can estimate high conditional quantile curves and will be shown this Section. In order to fit the Buffalo snowfall data to the GPD model (4), the data was transformed to y = x−μ σ , where μ = 5 cm, and then, we used the maximum likelihood estimates (MLEs) of the parameters, σ MLE = 5.1552, γ MLE = 0.2636, for the 2-parameter GPD model from the Buffalo snowfall data.  Furthermore, Fig. 8(a) and (b) shows the log-log plot and histogram of Buffalo snowfall with GPD model with the MLEs of the parameters. It illustrates that most daily snowfalls in Buffalo are between 0 and 10 centimeters and there are some occurrences of heavy snowfall, such as 50-90 centimeters. The GPD curve follows the shape of the Buffalo snowfall data very well.
Three goodness-of-fit tests are performed: the Kolmogorov-Smirnov test (K − S) (Kolmogorov 1933  where F(y) is the true but unknown distribution function of the sample and F * (y) is the theoretical distribution function, GPD in (4).
In Table 5, the K − S, A − D and C − v − M tests show that the GPD model fits the data with a probability of 60.55%, 59.05% and 78.56% respectively. Instead of using model (2), we use the following quantile regression model: where we use the estimated weight w i (x i , τ ) = f i ( ξ(τ )) in (18). Figure 9 shows the scatter plot of the daily snowfall with the fitted μ LS , Q R and Q W curves at two high 0.95th and 0.97th quantiles. It is interesting to note that at the 0.95th and 0.97th quantiles, the Q R and Q W curves appear to fit the data. Table 6 lists the estimated Buffalo snowfall quantile values at a given maximum temperature for τ =0.95 and 0.97. Both Fig. 9 and Table 6 demonstrate that when quantiles are high, Q W have heavier snowfall than Q R . Figure 10(a) and Table 7 show the values of the Relative R(τ ) in (9) for given τ = 0.95, . . . , 0.99. We note that R(τ ) > 0, which means that V weighted (τ ) < V regular (τ ), and the Q W is a better fit to the data than the Q R . Figure 10(b), (c) and Table 8 show the values of β 1 (τ ) and β 2 (τ ). The values of β 1 (τ ) and β 2 (τ ) are consistent with Fig. 9(a), (b) and Table 6.  The proposed weighted quantile regression Q W predicts that for moderate temperatures, such as 5 • C to 10 • C, it is likely to have small snowfalls in Buffalo, and for every low temperatures, such as −15 • C to 0 • C, it is more likely to have heavy snowfalls that may cause damage. Predicting heavy snowfalls is related to cold weather forecasts. Quantile regression is useful for predicting extreme heavy snowfalls.

CO 2 emission example
In Subsection 1.2, there is a relationship between ln(GDP) x 1 and ln(E.C.) x 2 and CO 2 emissions per capita y. The least squares estimate is: The quantile regression method can estimate high conditional quantile curves and will be shown in detail in this Section.
Similar to the Buffalo snowfall example, we fit the GPD model in (4) with MLEs of the parameters, σ MLE = 5.3011, γ MLE = 0.1234, to the CO 2 emission data, which is demonstrated in Fig. 11(a), (b) by the log-log plot and histogram. The GPD model follows the shape of the CO 2 emission data very well. Table 9 shows the results of the three goodness-of-fit tests.
We use the proposed weight (18) on the quantile regression model: Figure 12(a) shows the scatter plot of CO 2 emission vs ln(GDP) when the country's E.C. is 2980.96 kilowatts with the fitted μ LS , Q R and Q W curves at the 0.97th quantile. Figure 12(b) shows the scatter plot of CO 2 emission vs ln(E.C.) when the country's GDP Table 9 The goodness of fit tests for CO 2 emission example  Figure 12(c) shows the 3D scatter plot with Q R (red) and Q W (green) of CO 2 emission given the ln(GDP) and ln(E.C.) at τ = 0.97. It is important to note that the μ LS is the red solid line and the Q R and Q W quantile regression lines appear to fit the data. In general, the Q W line produces a different estimated CO 2 emissions than Q R curve at high quantiles. Tables 10  and 11 provide details about countries' CO 2 emission at high quantile (τ = 0.97) when the countries consume 2980.96 kilowatts of electricity and have a GDP of $13,359.73 respectively. Figure 13(a) shows the Relative R(τ ), which is defined in (9) and Table 12 shows the values for Relative R(τ ) for τ ≥ 0.95. All values of Relative R(τ ) are larger than 0, which signifies that V weighted (τ ) < V regular (τ ) and as well, it suggests that the weighted quantile regression model Q W is a better fit to the CO 2 emission data than the regular quantile regression model Q R . Figure 13(b) and (c) shows the values of β 1 (τ ) and β 2 (τ ) for a given τ  respectively, which is also shown in Table 13. The values of β 1 (τ ) and β 2 (τ ) are consistent with Tables 10 and 11. It can be concluded that countries with higher gross domestic product and higher amounts of electricity produce higher CO 2 emissions. Since CO 2 is not destroyed over time, it can remain in the atmosphere for thousands of years due to the very slow process by which carbon is transferred to ocean sediments. As a result, countries should monitor their CO 2 emissions in order to prevent further damages to the environment. Countries can consider producing more energy from renewable sources, such as wind, solar, hydro and geothermal heat and using fuels with lower carbon content to reduce carbon emissions.

Conclusions
After the studies above, we can conclude: 1. Traditional mean regression are concerned with estimating the conditional mean by using the L 2 -loss function. Quantile regression with a L 1 -loss function overcomes the limitations of traditional mean regression. It gives estimates of τ th conditional quantiles besides the measures of central tendency. Estimation of high conditional quantiles is very useful for the analysis of extreme events. 2. The proposed weighted quantile regression method with the local conditional density as the weight has good mathematical asymptotic properties. 3. The Monte Carlo computational simulation results show that the proposed weighted quantile regression with the local conditional density as the weight is more efficient relative to the classical quantile regression and some existing weighted quantile regression estimators. 4. The proposed weighted quantile regression can be used to predict extreme values of snowfall and CO 2 emission real world examples. In the Buffalo snowfall example, communities can use the information that quantile regression provides to prevent car accidents on roads, overexertion, and collapsing of homes. In the CO 2 emission example, the countries' increase in gross domestic product and electricity consumption will likely cause an increase in the CO 2 emissions. CO 2 emission levels should be monitored to reduce the amount of carbon dioxide in the atmosphere and its long term effects. 5. It is difficult to estimate the proposed conditional density weight. The nonparametric kernel density estimation method is successful in this paper. Further studies on optimal weights are suggested.