# High quantile regression for extreme events

## Abstract

For extreme events, estimating high conditional quantiles of heavy-tailed distributions is an important problem. Quantile regression is a useful method in this field, with many applications. It uses an L1-loss function, and the optimal solution is obtained by linear programming. In this paper, we propose a weighted quantile regression method. Monte Carlo simulations are performed to compare the proposed method with existing methods for estimating high conditional quantiles. We also investigate two real-world examples using the proposed weighted method. The Monte Carlo simulations and the two real-world examples show that the proposed method improves on the existing methods.

## Introduction

Extreme value events are highly unusual events that can cause severe harm to people and costly damage to the environment. Examples of such harmful events are stock market crashes, equity risks, pipeline failures, large flooding, wildfires, pollution and hurricanes. The response variable, y, of an extreme event is usually distributed according to a heavy-tailed distribution. It is important to estimate high conditional quantiles of a random variable y given a variable vector $$\mathbf{x}=(1,x_{1},x_{2},\ldots,x_{k})^{T}\in R^{p}$$, where p=k+1.

The traditional mean linear regression is concerned with the estimation of the conditional expectation E(y|x) (Yu et al. 2003). The mean linear regression model assumes

$$\mu_{y|\mathbf{x}}=E\left(y|x_{1},x_{2},\ldots,x_{k}\right) =\mathbf{x}^{T} \boldsymbol{\beta }=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots +\beta_{k}x_{k}.$$

We estimate $$\boldsymbol{\beta}=(\beta_{0},\beta_{1},\ldots,\beta_{k})^{T}\in R^{p}$$ from a random sample $$\{(y_{i},\mathbf{x}_{i}),\;i=1,\ldots,n\}$$, where $$\mathbf{x}_{i}$$ is the p-dimensional design vector and $$y_{i}$$ is the univariate response variable from a continuous distribution. The least squares (LS) estimator $$\widehat {\boldsymbol {\beta }}_{LS}$$ is a solution to the following minimization problem

$$\widehat{\boldsymbol{\beta }}_{LS}=\text{arg} \underset{\boldsymbol{\beta }\in R^{p}}{\min}\sum_{i=1}^{n}\left(y_{i}-\mathbf{x}_{i}^{T}\boldsymbol{\beta}\right)^{2},$$
(1)

that is, $$\widehat {\boldsymbol {\beta }}_{LS}$$ is obtained by minimizing the L2-distance.

The mean linear regression provides the mean relationship between a response variable and explanatory variables (Yu et al. 2003). However, conditional mean models have limitations. Outliers significantly affect the conditional mean models and, as a result, the measurement of the central location, which may be misleading. Also, when analyzing extreme value events, where the response variable y has a heavy-tailed distribution, the mean linear regression cannot be extended to non-central locations (Hao and Naiman 2007). Therefore, it cannot provide insightful information for extreme events. Quantile regression offers a more complete statistical model by specifying the changes in the high conditional quantiles, and it will be used to estimate values of extreme events (Yu et al. 2003; Hao and Naiman 2007). We will study two real-world examples in the following sections.

### 1.1 Snowfall in Buffalo (1994-2015)

Large snowstorms can be very hazardous to people’s safety, communities and property. They can significantly reduce visibility, which is especially dangerous in densely populated areas, where major road accidents or aviation accidents can occur. A significant amount of snow, such as 12 inches (30 cm) or more, can cave in the roofs of homes and buildings, and standing trees can fall on homes and cause power outages. There have been cases of deaths due to hypothermia, infections brought on by frostbite, car accidents caused by slippery roads, heart attacks from overexertion while shoveling heavy snow and carbon monoxide poisoning following a power outage.

In 2006, Lake Storm “Aphid” was a lake-effect snowstorm that hit Buffalo, New York with a maximum snowfall of 24 inches (61 cm) and caused 19 fatalities. The snowstorm cost an estimated $530 million in damages. In November 2014, a severe lake-effect snowstorm heavily impacted areas in and around Buffalo, with snowfall ranging from 5-7 feet (1.5-2.1 m) and a maximum snowfall of 88 inches (220 cm). There were records of 14 heart attacks due to overexertion and of roofs collapsing under the sheer weight of the snow.

The data set was obtained from the National Weather Service Forecast Office (2017) (full data is available at http://www.weather.gov/buf ), and the daily snowfall was recorded in inches for 4478 days from January 1994 to January 2015. The snowfall was converted into centimeters and a threshold of 5 cm was applied, since snowfall under 5 cm is very unlikely to cause extreme damage. As a result, n=316 observations remain. The top 10 largest daily snowfalls in Buffalo, New York during January 1994 - January 2015, with the corresponding maximum temperatures, are shown in Table 1. In Fig. 1(a), the y-axis represents the total daily snowfall (cm) and the x-axis represents the snowfall in the order of occurrence. The maximum daily snowfall occurred on December 10, 1995 with over 86.1 cm, while the average daily snowfall was 11.96 cm. Figure 1(b) shows the scatter diagram of daily snowfalls greater than or equal to 5 cm. It appears that snowfall can be modeled with a quadratic polynomial in maximum temperature. The least squares method was performed on a polynomial mean regression model

$$E\left(y|x,x^{2}\right)=\beta_{0}+\beta_{1}x+\beta_{2}x^{2},$$
(2)

where y represents the total snowfall (cm) and x represents the maximum temperature (°C).

Fig. 1

The red curve represents the least squares (LS) curve $$\mu _{LS}= \widehat {\mu }_{y|x}=8.9879-0.2144x+0.0040x^{2}$$, which was obtained by using (1) and the model (2) to estimate the mean of daily snowfall y for a given maximum temperature x. However, the least squares curve does not provide information about extreme heavy snowfalls that may cause damage. The quantile regression method is able to estimate the high conditional quantiles. We will discuss this example further in Section 5.

### 1.2 CO2 Emission

Climate change is considered to be one of the most important environmental issues, as it is transforming life on Earth. It affects all aspects of our natural environment, including air and water quality, health and the conservation of species at risk. It has been observed that temperatures and sea levels are rising, that storms are stronger and cause more damage, and that the risk of drought, fire and floods is increasing. Climate change will rapidly alter the lands and waters that we depend upon for survival, and we will no longer be able to preserve our environment for our social and economic well-being. Natural processes and human activities can cause climate change. The recent global warming can be largely attributed to carbon dioxide (CO2) and other greenhouse gas emissions. It was found that in 2009, CO2 accounted for 82% of all European greenhouse gas emissions and about 94% of the CO2 released to the atmosphere came from combusting fossil fuels (European Environment Agency (2017) at http://www.eea.europa.eu). Figure 2(a) shows that CO2 emissions increased between 1950 and 2010; these increases are related to increased energy use by an expanding economy and population, and to overall growth in emissions from electricity generation. It is important to estimate high conditional quantiles of the distribution of CO2 emission in order to prevent acceleration of climate change.

Fig. 2

In this paper, we use the 2010 data from the Carbon Dioxide Information Analysis Center (2017) at http://cdiac.ornl.gov for 181 countries. The CO2 emission per capita was recorded in metric tonnes. There are n=123 countries remaining after the threshold of 1 tonne was applied. The threshold of 1 tonne of CO2 emission was considered since values higher than 1 tonne of CO2 emission would exceed the maximum allowance to emit without harming the climate. Table 2 lists the top 10 countries with the largest CO2 emissions and their gross domestic product (GDP) and electricity consumption (E.C.) per capita. In Fig. 2(b), the y-axis represents the CO2 emission per capita (tonnes) and the x-axis represents the CO2 emission ordered by country. It can be observed in Fig. 2(b) that Qatar produced the highest CO2 emission per capita of 40 tonnes, and Trinidad and Tobago and Kuwait produced the second and third highest CO2 emissions of 38 tonnes and 31 tonnes respectively. In addition, several countries emitted less than 10 tonnes of CO2 in 2010. We take the CO2 emission per capita as y (tonnes), and apply a log-transformation to obtain ln(GDP) per capita as $$x_{1}$$ (US$) and ln(E.C.) per capita as $$x_{2}$$ (kilowatts). The 3D scatter diagram in Fig. 3 suggests that the CO2 emission per capita can be modeled by the mean linear regression model (3) in ln(GDP) and ln(E.C.). For simplicity, we omit “per capita” for these three variables in the following text.

$$E\left(y|x_{1},x_{2}\right)=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2},$$
(3)

Figure 4(a) and (b) are 2D scatter plots. Figure 4(a) shows the relationship between ln(GDP) and CO2 emission per capita, $$\widehat {\mu }_{y|x_{1}}=-21.0984+3.0255x_{1}$$, when the E.C. per capita is 2980.96 kilowatts. Figure 4(b) shows the relationship between ln(E.C.) and CO2 emission per capita, $$\widehat {\mu }_{y|x_{2}}=-10.2255+2.1830x_{2}$$, when the GDP per capita is 13,359.73. The least squares mean regression curves in Fig. 4 were obtained by using (1) and the model (3).

Fig. 4

Since the mean regression provides only the mean relationship between CO2 emission per capita and GDP or E.C., it cannot provide estimates of the high conditional quantiles of CO2 emission. The quantile regression method, however, can estimate high CO2 emission quantile curves. We will discuss this example further in Section 5.

### 1.3 Main methods and results

Quantile regression is an important model with applications in many fields. First, quantile regression provides estimates of the conditional quantiles, which are difficult to capture by a mean regression. Second, it is also more robust against outliers in the response measurements. The objective of this paper is to study and explore a new weighted quantile regression in order to improve the existing methods. In this paper, we carry out three studies:

1. The theoretical approach will be investigated.
2. Monte Carlo simulations will be performed to show the efficiency of the new weighted method relative to the existing methods.
3. The new proposed method will be applied to real-world examples on extreme events and compared to mean regression and classical quantile regression.

In Section 2, we review some notation. In Section 3, we propose an optimal weighted quantile regression method with the conditional density as the weight, and give its asymptotic properties for any uniformly bounded positive weight independent of the response variable y.
In Section 4, the results of Monte Carlo simulations generated from the bivariate Pareto distribution show that the proposed weighted method produces high efficiencies relative to existing methods. In Section 5, the three regression methods (mean regression, classical quantile regression and the proposed weighted quantile regression) are applied to the real-life examples: the Buffalo snowfall (Subsection 1.1) and CO2 emission (Subsection 1.2). Three goodness-of-fit tests are used to assess the distributions of the data. Studies of the examples illustrate that the proposed weighted quantile regression model fits the datasets better than the existing quantile regression method.

## Notation

Pickands (1975) first introduced the Generalized Pareto Distribution (GPD) (see also de Haan and Ferreira 2006).

### Definition 2.1

The cumulative distribution function (c.d.f.) F(x) and its corresponding probability density function (p.d.f.) f(x) of the two-parameter GPD(γ,σ) with shape parameter γ>0 and scale parameter σ of a random variable X are given by

$$F(x)=1-\left(1+\gamma \frac{x}{\sigma }\right)^{-1/\gamma },\quad \gamma,\sigma >0,\quad x>0;$$
(4)

$$f(x)=\sigma^{-1}\left(1+\gamma \frac{x}{\sigma }\right)^{-\frac{1}{\gamma } -1},\quad \gamma,\sigma >0,\quad x>0.$$

### Definition 2.2

The τth quantile of a continuous random variable Y with c.d.f. F(y) is defined as

$$Q(\tau)=F^{-1}(\tau)=\inf \{y:F(y)\geq \tau \},\quad 0<\tau <1.$$

### Definition 2.3

The τth conditional linear quantile regression of y for given $$\mathbf{x}=(1,x_{1},x_{2},\ldots,x_{k})^{T}$$ is defined as

$$\begin{aligned} Q_{y}\left(\tau |\mathbf{x}\right)&=Q_{\tau }\left(y|x_{1},x_{2},\ldots,x_{k}\right)=F^{-1}\left(\tau | \mathbf{x}\right)\\&=\mathbf{x}^{T}\boldsymbol{\beta }(\tau)=\beta_{0}(\tau)+\beta_{1}(\tau)x_{1}+\cdots +\beta_{k}(\tau)x_{k},\quad 0<\tau <1, \end{aligned}$$

where $$\boldsymbol{\beta}(\tau)=(\beta_{0}(\tau),\beta_{1}(\tau),\beta_{2}(\tau),\ldots,\beta_{k}(\tau))^{T}$$.
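As an illustration of Definitions 2.1 and 2.2, the GPD quantile function follows in closed form by solving F(x)=τ. The sketch below assumes the standard GPD form with exponent −1/γ in the c.d.f.; the function names and the parameter values are illustrative and are not taken from the paper.

```python
def gpd_cdf(x, gamma, sigma):
    """c.d.f. of GPD(gamma, sigma) for x > 0, with gamma > 0 and sigma > 0."""
    return 1.0 - (1.0 + gamma * x / sigma) ** (-1.0 / gamma)


def gpd_pdf(x, gamma, sigma):
    """p.d.f. of GPD(gamma, sigma), the derivative of gpd_cdf in x."""
    return (1.0 + gamma * x / sigma) ** (-1.0 / gamma - 1.0) / sigma


def gpd_quantile(tau, gamma, sigma):
    """Q(tau) = F^{-1}(tau), from solving 1 - (1 + gamma*x/sigma)^(-1/gamma) = tau."""
    return (sigma / gamma) * ((1.0 - tau) ** (-gamma) - 1.0)


gamma, sigma = 0.25, 5.0  # illustrative values only (assumption)
for tau in (0.95, 0.97, 0.99):
    q = gpd_quantile(tau, gamma, sigma)
    # Applying the c.d.f. to the quantile should return tau.
    print(f"tau={tau}: Q(tau)={q:.3f}, F(Q(tau))={gpd_cdf(q, gamma, sigma):.6f}")
```

The high quantiles grow quickly as τ approaches 1, which is the heavy-tail behaviour that motivates modelling high conditional quantiles rather than the mean.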
Koenker and Bassett (1978) proposed an L1-loss function to obtain the estimator $$\widehat {\boldsymbol {\beta }}(\tau)$$ by solving

$$\widehat{\boldsymbol{\beta}}(\tau)=\arg\min_{\boldsymbol{\beta}(\tau)\in R^{p}}\sum_{i=1}^{n}\rho_{\tau}\left(y_{i}-\mathbf{x}_{i}^{T}\boldsymbol{\beta } (\tau)\right),\quad 0<\tau <1,$$
(5)

where $$\rho_{\tau}$$ is the loss function

$$\rho_{\tau }(u)=u(\tau -I(u<0))=\left\{ \begin{array}{ll} u(\tau -1), & u<0; \\ u\tau, & u\geq 0. \end{array} \right.$$

The quantile regression problem can be formulated as a linear program

$$\min_{(\boldsymbol{\beta }(\tau),\mathbf{u},\mathbf{v})\in R^{p}\times R_{+}^{2n}}\left\{\tau \mathbf{1}_{n}^{T}\mathbf{u}+(1-\tau)\mathbf{1}_{n}^{T} \mathbf{v}\;|\;\mathbf{X}\boldsymbol{\beta }(\tau)+\mathbf{u}-\mathbf{v}=\mathbf{y}\right\},$$

where $$\mathbf {1}_{n}$$ is an n-vector of 1s, X denotes the n×p design matrix, and u, v are n×1 vectors with elements $$u_{i}$$, $$v_{i}$$ respectively (Koenker 2005).

## Proposed weighted quantile regression

### 3.1 Proposed weighted quantile regression

Huang et al. (2015) proposed a weighted quantile regression method

$$\widehat{\boldsymbol{\beta }}_{w}(\tau)=\arg \min_{\boldsymbol{\beta} (\tau)\in R^{p}}\sum_{i=1}^{n}w_{i}\left(\mathbf{x}_{i},\tau \right)\rho_{\tau}\left(y_{i}-\mathbf{x}_{i}^{T}\boldsymbol{\beta }(\tau)\right),\;0<\tau <1,$$
(6)

where $$w_{i}(\mathbf{x}_{i},\tau)$$ is any uniformly bounded positive weight function independent of $$y_{i}$$, i=1,…,n, for $$\mathbf{x}_{i}=(1,x_{i1},x_{i2},\ldots,x_{ik})^{T}$$. In this paper, we propose as the weight the local conditional density f(y|x) of y for given x at the τth quantile point $$\xi_{i}(\tau,\mathbf{x}_{i})$$, which is

$$w_{i}\left(\mathbf{x}_{i},\tau \right)=f_{i}\left(\xi_{i}\left(\tau,\mathbf{x}_{i}\right)\right),\;i=1,2,\ldots,n,\;0<\tau <1,$$
(7)

where $$f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$ is uniformly bounded at the quantile points $$\xi_{i}(\tau,\mathbf{x}_{i})$$. The following are the four reasons for the proposed weight in (7):

1. Koenker (2005, Chapter 5, Subsection 5.3) suggested that when the conditional densities of the response are heterogeneous, it is natural to consider whether weighted quantile regression might lead to efficiency improvements.
2. The error function $$\rho _{\tau }(y_{i}-\mathbf {x}_{i}^{T}\boldsymbol {\beta } (\tau))$$ in (6) is an absolute error measure between $$y_{i}$$ and the τth conditional quantile $$\xi_{i}(\tau,\mathbf{x}_{i})$$ at the ith sample point $$(y_{i},\mathbf{x}_{i})$$, i=1,2,…,n. $$f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$ can be interpreted as providing the local relative likelihood that the response variable y takes values in a neighborhood $$(\xi_{i}(\tau,\mathbf{x}_{i})-\varepsilon,\;\xi_{i}(\tau,\mathbf{x}_{i})+\varepsilon)$$ of $$y=\xi_{i}(\tau,\mathbf{x}_{i})$$, for small ε>0. Giving the weight $$f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$ to $$\rho _{\tau }(y_{i}-\mathbf {x}_{i}^{T} \boldsymbol {\beta }(\tau))$$ makes the total error $$\sum _{i=1}^{n}w_{i}(\mathbf {x}_{i},\tau)\rho _{\tau }(y_{i}-\mathbf {x}_{i}^{T}\boldsymbol {\beta } (\tau))$$ more reasonable.
3. The weighted estimator $$\widehat {\boldsymbol {\beta }}_{w}(\tau)$$ in (6) using weight (7) has good properties, which we will discuss in Subsection 3.2 below.
4. Is weight (7) an optimal weight? It is a difficult problem in the field, as described in Chapter 5, Subsection 5.3 in Koenker (2005). Also, $$f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$ is difficult to estimate. In this paper, we explore these two difficulties: we estimate $$f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$ and then obtain positive results by using weight (7).

In Section 4, in simulations, we compare weight (7) with the weight given in Huang et al. (2014),

$$w_{i}\left(\mathbf{x}_{i},\tau \right) =\frac{\|\mathbf{x}_{i}\|^{-1}}{\sum\limits_{j=1}^{n}\|\mathbf{x}_{j}\|^{-1}},\quad 0<\tau <1,$$
(8)

where $$w_{i}(\mathbf{x}_{i},\tau)\in [0,1]$$, $$\sum \limits _{i=1}^{n}w_{i}(\mathbf {x}_{i},\tau)=1$$, $$\left \Vert \mathbf {x}_{i}\right \Vert =\sqrt { x_{i1}^{2}+x_{i2}^{2}+\cdots +x_{ik}^{2}}$$, i=1,…,n, and k is the number of regressors.
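The weighted problem (6) can be checked numerically on a small data set. The sketch below is a brute-force illustration for a single regressor, not the linear-programming solver that would be used in practice: because a linear program attains its optimum at a vertex, and a vertex here corresponds to a line through two observations, scanning all pairs of points finds an exact minimiser. The function names, the synthetic data and the choice τ=0.9 are assumptions made for the example; the second fit uses weights of the form (8).

```python
import itertools
import random


def check_loss(u, tau):
    """rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_objective(b0, b1, xs, ys, ws, tau):
    """Objective of (6) for the line b0 + b1 * x."""
    return sum(w * check_loss(y - b0 - b1 * x, tau)
               for x, y, w in zip(xs, ys, ws))


def fit_weighted_qr(xs, ys, ws, tau):
    """Minimise the weighted objective over lines through pairs of points.

    Cost grows like n^3, so this is only meant for small illustrative samples.
    """
    best = None
    for i, j in itertools.combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a regression line
        b1 = (ys[j] - ys[i]) / (xs[j] - xs[i])
        b0 = ys[i] - b1 * xs[i]
        value = weighted_objective(b0, b1, xs, ys, ws, tau)
        if best is None or value < best[0]:
            best = (value, b0, b1)
    return best[1], best[2]


rng = random.Random(0)
n, tau = 30, 0.9
xs = [1.0 + 4.0 * rng.random() for _ in range(n)]
ys = [2.0 + 0.5 * x + x * rng.expovariate(1.0) for x in xs]  # synthetic data

equal_w = [1.0 / n] * n                          # reduces (6) to (5) up to scale
inv_norm = [1.0 / abs(x) for x in xs]            # ||x_i||^{-1} with one regressor
norm_w = [v / sum(inv_norm) for v in inv_norm]   # weights of the form (8)

print("equal weights:", fit_weighted_qr(xs, ys, equal_w, tau))
print("weights (8):  ", fit_weighted_qr(xs, ys, norm_w, tau))
```

With equal weights the fit coincides with the regular estimator (5), so comparing the two printed lines shows how much the weighting alone moves the estimated high quantile line.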
In this paper, we look for improvements in efficiency by using weight (7), both in the simulations of Section 4 and in the applications to the Buffalo snowfall and CO2 emission examples in Section 5.

### 3.2 Properties of weighted quantile regression

The following regularity conditions are necessary in deriving the asymptotic distribution of $$\widehat {\boldsymbol {\beta }}_{n(w)}(\tau)$$ in (6) with weight $$w_{i}(\mathbf{x}_{i},\tau)=f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$, i=1,2,…, 0<τ<1, in (7). Let $$Y_{1},Y_{2},\ldots$$ be independent random variables with distribution functions $$F_{1},F_{2},\ldots$$.

### Condition 1 (C1).

The $$F_{i}$$’s are absolutely continuous, with continuous densities $$f_{i}(\xi)$$ uniformly bounded away from 0 and ∞ at the quantile points $$\xi_{i}(\tau,\mathbf{x}_{i})$$, i=1,2,….

### Condition 2 (C2).

There exist positive definite matrices $$\mathbf{D}_{0}(\tau)$$ such that the following subconditions are satisfied:

• $$\lim_{n\rightarrow \infty }n^{-1}\sum_{i=1}^{n} f_{i}^{2}(\xi _{i}(\tau, \mathbf {x}_{i}))\mathbf {x}_{i}\mathbf {x}_{i}^{T}=\mathbf {D}_{0}(\tau),$$ and

• $$\max \limits _{i=1,\ldots,n}\left \Vert f_{i}(\xi _{i}(\tau,\mathbf {x} _{i}))\right \Vert /\sqrt {n}\rightarrow 0.$$

We have the main asymptotic results for $$\widehat {\boldsymbol {\beta }} _{n(w)}(\tau)$$ in (6) using weight (7). In this case, we let $$\widehat {\boldsymbol {\beta }}_{n(w)}(\tau)=\widehat {\boldsymbol {\beta }}_{n(f)}(\tau)$$ in the following theorem.

### Theorem 3.1

Under Conditions C1 and C2, we have

$$\sqrt{n}\left(\widehat{\boldsymbol{\beta }}_{n(f)}(\tau)-\boldsymbol{\beta }(\tau)\right) \overset{\mathcal{D}}{\rightarrow } N\left(0,\;\tau (1-\tau)\mathbf{D}_{0}^{-1}(\tau)\right), \text{ as } n\rightarrow \infty.$$

The proof of Theorem 3.1 is similar to the proof provided in Huang et al. (2015).
### 3.3 Comparison of quantile regression models

In order to compare the regular and weighted quantile regression models in (5) and (6), we extend the idea of the goodness-of-fit measure by Koenker and Machado (1999), and suggest the use of a Relative R(τ), which is defined as

$$Relative~R(\tau)=1-\frac{V_{weighted}(\tau)}{V_{regular}(\tau)},\quad -1\leq R(\tau)\leq 1,$$
(9)

where

$$V_{regular}(\tau)=\sum_{y_{i}\geq \mathbf{x}_{i}^{T}\widehat{\boldsymbol{\beta}}(\tau)}\frac{\tau}{n}\left\vert y_{i}-\mathbf{x}_{i}^{T}\widehat{\boldsymbol{\beta }}(\tau)\right\vert+\sum_{y_{i}<\mathbf{x}_{i}^{T}\widehat{\boldsymbol{\beta }}(\tau)}\frac{(1-\tau)}{n}\left\vert y_{i}-\mathbf{x}_{i}^{T}\widehat{\boldsymbol{\beta }}(\tau)\right\vert,$$

with $$\widehat {\boldsymbol {\beta }}(\tau)$$ obtained by (5), and

$$V_{weighted}(\tau)=\sum_{y_{i}\geq \mathbf{x}_{i}^{T}\widehat{\boldsymbol{\beta }}_{w}(\tau)}w_{i}\tau \left\vert y_{i}-\mathbf{x}_{i}^{T}\widehat{\boldsymbol{ \beta }}_{w}(\tau)\right\vert +\sum_{y_{i}<\mathbf{x}_{i}^{T}\widehat{ \boldsymbol{\beta }}_{w}(\tau)}w_{i}(1-\tau)\left\vert y_{i}-\mathbf{x}_{i}^{T} \widehat{\boldsymbol{\beta }}_{w}(\tau)\right\vert,$$

where $$w_{i}$$ and $$\widehat {\boldsymbol {\beta }}_{w}(\tau)$$ are given by (6).

## Simulations

In this section, Monte Carlo simulations are performed. We generate m random samples, each of size n, from the bivariate Pareto distribution (Mardia 1962) with c.d.f.
$$F(x,y)=1-x^{-\alpha }-y^{-\alpha }+(x+y-1)^{-\alpha },\;x>1,\;y>1,\;\alpha >0,$$
(10)

and the conditional quantile function of (10) is

$$\xi (\tau |x)=Q_{y}(\tau |x)=1-x\left(1-\frac{1}{(1-\tau)^{1/\left(\alpha +1\right)}}\right),\;x>1,\;\alpha >0,\quad 0<\tau <1.$$
(11)

The conditional density of y for given x is

$$f(y|x)=\frac{(\alpha +1)x^{(\alpha +1)}}{(x+y-1)^{(\alpha +2)}},\;x>1,\;y>1,\;\alpha >0,$$

and the conditional density of y for given x at the τth quantile is

$$f(\xi (\tau |x))=\frac{(\alpha +1)(1-\tau)^{(\alpha +2)/(\alpha +1)}}{x},\;x>1,\;\alpha >0,\quad 0<\tau <1.$$
(12)

Assume that the true conditional quantile is $$Q_{y}(\tau|x)=\beta_{0}(\tau)+\beta_{1}(\tau)x$$. We use two quantile regression methods:

• The regular quantile regression $$Q_{R}(\tau|x)$$ estimation based on (5),

$$Q_{R}(\tau |x)=\widehat{\beta }_{0}(\tau)+\widehat{\beta }_{1}(\tau)x;$$
(13)

• The weighted quantile regression $$Q_{W}(\tau|x)$$ estimation based on (6),

$$Q_{W}(\tau |x)=\widehat{\beta }_{w0}(\tau)+\widehat{\beta }_{w1}(\tau)x.$$
(14)

For each method, we generate m=1,000 samples of size n=300. $$Q_{R,i}(\tau|x)$$ or $$Q_{W,i}(\tau|x)$$, i=1,…,m, are estimated in the ith sample. Let α=3 in (12); then the weights in (7) are

$$w_{i}(\mathbf{x}_{i},\tau)=\frac{4(1-\tau)^{(5/4)}}{x_{i}},\;x_{i}>1,\;i=1,2,\ldots,n.$$
(15)

The simulation mean squared errors (SMSE) of the estimators (13) and (14) are:

$$SMSE(Q_{R}(\tau))=\frac{1}{m}\sum_{i=1}^{m}\int_{1}^{N}(Q_{R,i}(\tau |x)-Q_{y}(\tau |x))^{2}dx;$$
(16)

$$SMSE(Q_{W}(\tau))=\frac{1}{m}\sum_{i=1}^{m}\int_{1}^{N}(Q_{W,i}(\tau |x)-Q_{y}(\tau |x))^{2}dx,$$
(17)

where the true τth conditional quantile $$Q_{y}(\tau|x)$$ is defined in (11). N is a finite x value such that the c.d.f. in (10) satisfies F(N,N)≈1. We take N=1000, and the simulation efficiencies (SEFF) are given by

$$SEFF(Q_{W}(\tau))=\frac{SMSE(Q_{R}(\tau))}{SMSE(Q_{W}(\tau))},$$

where $$SMSE(Q_{R}(\tau))$$ and $$SMSE(Q_{W}(\tau))$$ are defined in (16) and (17) respectively.
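The closed-form quantities of the bivariate Pareto model can be checked numerically. The sketch below assumes the conditional c.d.f. F(y|x)=1−(x/(x+y−1))^(α+1), obtained by integrating f(y|x); inverting it gives a conditional quantile that is linear in x, with β0(τ)=1 and β1(τ)=(1−τ)^(−1/(α+1))−1. The helper names, the inverse-transform sampler and the closed-form integral of the squared error are illustrative choices, not the authors' simulation code.

```python
import random

ALPHA = 3.0  # the value of alpha used in the simulations


def true_coefficients(tau, alpha=ALPHA):
    """(beta_0(tau), beta_1(tau)) of the linear conditional quantile."""
    return 1.0, (1.0 - tau) ** (-1.0 / (alpha + 1.0)) - 1.0


def true_quantile(tau, x, alpha=ALPHA):
    """Conditional quantile: the solution y of F(y|x) = tau."""
    b0, b1 = true_coefficients(tau, alpha)
    return b0 + b1 * x


def cond_cdf(y, x, alpha=ALPHA):
    """F(y|x) = 1 - (x / (x + y - 1))^(alpha + 1)."""
    return 1.0 - (x / (x + y - 1.0)) ** (alpha + 1.0)


def cond_pdf(y, x, alpha=ALPHA):
    """Conditional density f(y|x)."""
    return (alpha + 1.0) * x ** (alpha + 1.0) / (x + y - 1.0) ** (alpha + 2.0)


def density_weight(tau, x, alpha=ALPHA):
    """Weight (12): f(y|x) evaluated at the tau-th conditional quantile."""
    return (alpha + 1.0) * (1.0 - tau) ** ((alpha + 2.0) / (alpha + 1.0)) / x


def sample_pair(rng, alpha=ALPHA):
    """Draw (x, y): x from its Pareto marginal, then y by inverting F(y|x)."""
    x = (1.0 - rng.random()) ** (-1.0 / alpha)
    y = true_quantile(rng.random(), x, alpha)
    return x, y


def integrated_sq_error(b0_hat, b1_hat, tau, upper=1000.0, alpha=ALPHA):
    """Integral over [1, upper] of (fitted line - true line)^2, in closed form."""
    b0, b1 = true_coefficients(tau, alpha)
    a, b = b0_hat - b0, b1_hat - b1

    def antiderivative(x):
        # antiderivative of (a + b x)^2
        return a * a * x + a * b * x * x + b * b * x ** 3 / 3.0

    return antiderivative(upper) - antiderivative(1.0)


rng = random.Random(1)
tau = 0.95
pairs = [sample_pair(rng) for _ in range(5000)]
share = sum(y <= true_quantile(tau, x) for x, y in pairs) / len(pairs)
print("share of responses below the true 0.95 quantile:", share)
print("weight (15) at x = 2:", density_weight(tau, 2.0))
```

Averaging `integrated_sq_error` over the m fitted lines of each method would give the SMSE in (16) and (17), and the ratio of the two averages gives the SEFF.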
Table 3 displays the $$SEFF(Q_{W(f)}(\tau))$$ for varying τ values by using the weight in (15). It shows that the $$SEFF(Q_{W(f)}(\tau))$$ are larger than 1 when τ=0.95,…,0.99. Figure 5 compares the $$SMSE(Q_{R}(\tau))$$ with the $$SMSE(Q_{W(f)}(\tau))$$ for τ=0.95,…,0.99. It demonstrates that all $$SMSE(Q_{W(f)}(\tau))$$ for our proposed weight in (15) are smaller than the $$SMSE(Q_{R}(\tau))$$. Furthermore, Fig. 6 shows the box plots for estimating the true $$\beta_{0}$$ and $$\beta_{1}$$ when α=3 by using $$Q_{R}(\tau|x)$$ and $$Q_{W(f)}(\tau|x)$$ for τ=0.95 and 0.97 respectively. It reveals that the proposed $$Q_{W(f)}(\tau|x)$$ is unbiased and produces more accurate estimators $$\widehat {\beta }_{W(f)0}(\tau)$$ and $$\widehat { \beta }_{W(f)1}(\tau)$$ of the true $$\beta_{0}$$ and $$\beta_{1}$$ for τ=0.95 and 0.97. In addition, the variances of $$Q_{W(f)}(\tau|x)$$ are smaller relative to $$Q_{R}(\tau|x)$$ for τ=0.95 and 0.97.

Fig. 5

Fig. 6

Next, we compare our simulation results with those obtained using the weight (8) presented in Huang et al. (2014). Table 4 compares the simulation $$SEFF(Q_{W(1)}(\tau))$$ by using weight $$w_{i}(\mathbf {x}_{i},\tau)=\left \Vert \mathbf {x}_{i}\right \Vert ^{-1}/\sum _{j=1}^{n}\left \Vert \mathbf {x}_{j}\right \Vert ^{-1}$$ in (8) and $$SEFF(Q_{W(f)}(\tau))$$ by using the weight in (15) for different τ values. Also, Fig. 7 compares the $$SEFF(Q_{W(1)}(\tau))$$ with the $$SEFF(Q_{W(f)}(\tau))$$ with the proposed weight in (15). It reveals that the $$SEFF(Q_{W(f)}(\tau))$$ are larger than 1 and larger than the $$SEFF(Q_{W(1)}(\tau))$$. Thus, $$Q_{W(f)}(\tau|x)$$ is more efficient than $$Q_{W(1)}(\tau|x)$$ for τ=0.95 up to 0.99.

Fig. 7

From the overall results of the simulations, we can conclude that:

• Table 3 and Figs. 5 and 6 show that for τ=0.95,…,0.99, the proposed weighted regression $$Q_{W(f)}(\tau|x)$$ with the weight (15) is more efficient relative to the regular regression $$Q_{R}(\tau|x)$$.

• Table 4 and Fig. 7 show that for τ=0.95,…,0.99, $$Q_{W(f)}(\tau|x)$$ with the proposed weight (15) is more efficient relative to $$Q_{W(1)}(\tau|x)$$ with $$w_{i}(\mathbf {x}_{i},\tau)=\left \Vert \mathbf {x} _{i}\right \Vert ^{-1}/\sum _{j=1}^{n}\left \Vert \mathbf {x}_{j}\right \Vert ^{-1}$$ in (8).

## Real examples of applications

In this section, we apply the following three regression models to the Buffalo snowfall and CO2 emission examples in Section 1:

• The traditional mean linear regression (LS) estimator $$\widehat {\boldsymbol {\beta } } _{LS}$$ in (1);

• The regular quantile regression $$Q_{R}$$ estimator $$\widehat {\boldsymbol \beta }(\tau)$$ in (5);

• The proposed weighted quantile regression $$Q_{W}$$ estimator $$\widehat {\boldsymbol \beta }_{W}(\tau)$$ in (6) with weight $$w_{i}(\mathbf{x}_{i},\tau)=f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$ in (7).

To estimate the proposed local conditional density weight $$w_{i}(\mathbf{x}_{i},\tau)=f_{i}(\xi_{i}(\tau,\mathbf{x}_{i}))$$ in (7), we use kernel density estimation (Silverman 1986; Scott 1992):

$$\widehat{w}_{i}\left(\mathbf{x}_{i},\tau \right)=\widehat{f}_{i}\left(\widehat{\xi } _{i}\left(\tau,\mathbf{x}_{i}\right)\right),\text{ where } \widehat{f}\left(y|\mathbf{x} \right)=\frac{\widehat{f}(y,\mathbf{x})}{\widehat{\mu }(\mathbf{x})},$$
(18)

where $$\widehat {f}(y,\mathbf {x})$$ is an estimator of the joint density of y and x, and $$\widehat {\mu }(\mathbf {x})$$ is an estimator of the marginal density of x. We estimate the conditional quantile function ξ(τ|x) by inverting an estimated conditional c.d.f. $$\widehat {F}(y|\mathbf {x})$$ (Li and Racine 2007),

$$\widehat{\xi }(\tau |\mathbf{x})=\widehat{Q}_{y}(\tau |\mathbf{x})=\inf \{y: \widehat{F}(y|\mathbf{x})\geq \tau \}=\widehat{F}^{-1}(\tau |\mathbf{x}),$$

where $$\widehat {F}(y|\mathbf {x})$$ is an estimator of the conditional c.d.f. F(y|x).
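A plug-in version of (18) can be sketched for a single covariate. The sketch below assumes a product Gaussian kernel with separate, hand-picked window widths `hy` and `hx`, and a kernel-weighted empirical c.d.f. for the inversion step; these choices, the function names and the toy data are illustrative assumptions and may differ from the estimators used in the paper.

```python
import math


def gauss(u):
    """Standard normal kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def cond_density(y, x, ys, xs, hy, hx):
    """f_hat(y|x) = f_hat(y, x) / mu_hat(x) with a product Gaussian kernel."""
    kx = [gauss((x - xi) / hx) for xi in xs]
    joint = sum(k * gauss((y - yi) / hy) for k, yi in zip(kx, ys)) / hy
    # The common factor 1 / (n * hx) cancels between numerator and denominator.
    return joint / sum(kx)


def cond_cdf(y, x, ys, xs, hx):
    """Kernel-weighted empirical c.d.f. F_hat(y|x)."""
    kx = [gauss((x - xi) / hx) for xi in xs]
    return sum(k for k, yi in zip(kx, ys) if yi <= y) / sum(kx)


def cond_quantile(tau, x, ys, xs, hx):
    """inf{y : F_hat(y|x) >= tau}, searched over the observed responses."""
    for y in sorted(ys):
        if cond_cdf(y, x, ys, xs, hx) >= tau:
            return y
    return max(ys)


def density_weights(ys, xs, tau, hy, hx):
    """w_hat_i = f_hat(xi_hat(tau|x_i) | x_i), the plug-in form of weight (7)."""
    return [cond_density(cond_quantile(tau, x, ys, xs, hx), x, ys, xs, hy, hx)
            for x in xs]


# Toy data and window widths, chosen only to make the example run (assumption).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.5, 5.0, 8.0]
print(density_weights(ys, xs, tau=0.95, hy=1.0, hx=1.0))
```

The resulting weights would then be passed to the weighted problem (6) in place of the unknown densities.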
Note that for a one-dimensional random sample $$X_{1},X_{2},\ldots,X_{n}$$ from a distribution with density μ(x), a kernel density estimator for μ(x) is given by

$$\widehat{\mu}(x)=\frac{1}{nh}\sum\limits_{i=1}^{n}K\left(\frac{x-X_{i}}{h}\right), \quad-\infty <x<\infty,$$

where h is the window width, and K(x) is the kernel function, which is a symmetric probability density function satisfying

$$\int\limits_{-\infty }^{\infty} K(x)dx=1,\quad \int tK(t)dt=0,\quad \int t^{2}K(t)dt=k_{2}\neq 0.$$

The optimal window width can be found by

$$h_{opt}=k_{2}^{-2/5}\left\{ \int K(t)^{2}dt\right\}^{1/5}\left\{ \int \mu^{\prime\prime}(x)^{2}dx\right\}^{-1/5}n^{-1/5}.$$

The d-dimensional multivariate kernel density estimator is defined by

$$\widehat{\mu}(\mathbf{x})=\frac{1}{nh^{d}}\sum\limits_{i=1}^{n}K\left\{\frac{\mathbf{x}-\mathbf{X}_{i}}{h}\right\},$$

where h is the window width and the kernel function K(x) is a function, defined for d-dimensional x, satisfying $$\int \limits _{R^{d}}K(\mathbf {x})d\mathbf {x}=1.$$ Fukunaga (1972) suggested

$$\widehat{f}(\mathbf{x})=\frac{(\det \mathbf{S})^{-1/2}}{nh^{d}} \sum\limits_{i=1}^{n}k\left\{ \frac{(\mathbf{x}-\mathbf{X}_{i})^{T}\mathbf{S }^{-1}(\mathbf{x}-\mathbf{X}_{i})}{h^{2}}\right\},$$

where S is the sample covariance matrix of the data, K is the normal kernel and the function k is given by

$$k(u)=\left(\frac{1}{2\pi}\right)^{d/2}\exp \left(-\frac{u}{2}\right),\quad k(\mathbf{x}^{T}\mathbf{x})=K(\mathbf{x})=(2\pi)^{-d/2}\exp \left(-\frac{1}{2}\mathbf{x}^{T}\mathbf{x}\right).$$

An estimator for the optimal window width h is given by

$$\widehat{h}_{opt}=A(K)n^{-1/(d+4)},$$

where $$A(K)=\{4/(d+2)\}^{1/(d+4)}$$ is the constant for a multivariate normal kernel.

### 5.1 Buffalo snowfall example

Now, recall the Buffalo snowfall example in Subsection 1.1. We use a polynomial mean regression model (2)

$$E\left(y|x,x^{2}\right)=\beta_{0}+\beta_{1}x+\beta_{2}x^{2},$$

where y is the daily snowfall (cm) and x is the maximum temperature (°C).
But the least squares curve only estimates the average daily snowfall for a given maximum temperature; it cannot estimate extreme heavy snowfalls. The quantile regression method can estimate high conditional quantile curves, as will be shown in this section.

In order to fit the Buffalo snowfall data to the GPD model (4), the data was transformed to $$y=\frac {x-\mu }{\sigma },$$ where μ=5 cm, and then we used the maximum likelihood estimates (MLEs) of the parameters, $$\widehat {\sigma }_{MLE}=5.1552, \widehat {\gamma }_{MLE}=0.2636,$$ for the 2-parameter GPD model from the Buffalo snowfall data. Furthermore, Fig. 8(a) and (b) show the log-log plot and histogram of the Buffalo snowfall with the GPD model fitted using the MLEs of the parameters. They illustrate that most daily snowfalls in Buffalo are between 0 and 10 centimeters and that there are some occurrences of heavy snowfall, such as 50-90 centimeters. The GPD curve follows the shape of the Buffalo snowfall data very well.

Fig. 8

Three goodness-of-fit tests are performed: the Kolmogorov-Smirnov test (KS) (Kolmogorov 1933), the Anderson-Darling test (AD) and the Cramér-von Mises test (CvM) (Anderson and Darling 1952). The hypotheses are

$$\begin{array}{rcl} H_{0} &:&F(y)=F^{\ast }(y),\text{ for all values of }y;\\ H_{1} &:&F(y)\neq F^{\ast }(y),\text{ for at least one value of }y, \end{array}$$

where F(y) is the true but unknown distribution function of the sample and $$F^{\ast}(y)$$ is the theoretical distribution function, the GPD in (4). In Table 5, the KS, AD and CvM tests give p-values of 60.55%, 59.05% and 78.56% respectively, so the GPD model is not rejected for these data.

Instead of using model (2), we use the following quantile regression model:

$$Q_{y}(\tau |x)=\beta_{0}(\tau)+\beta_{1}(\tau)x+\beta_{2}(\tau)x^{2},$$

where we use the estimated weight $$\widehat {w}_{i}(\mathbf {x}_{i},\tau)= \widehat {f}_{i}(\widehat {\xi }(\tau))$$ in (18).
Figure 9 shows the scatter plot of the daily snowfall with the fitted $$\mu_{LS}$$, $$Q_{R}$$ and $$Q_{W}$$ curves at the two high quantiles τ=0.95 and 0.97. It is interesting to note that at the 0.95th and 0.97th quantiles, the $$Q_{R}$$ and $$Q_{W}$$ curves appear to fit the data. Table 6 lists the estimated Buffalo snowfall quantile values at a given maximum temperature for τ=0.95 and 0.97. Both Fig. 9 and Table 6 demonstrate that at high quantiles, $$Q_{W}$$ predicts heavier snowfall than $$Q_{R}$$.

Fig. 9

Figure 10(a) and Table 7 show the values of the Relative R(τ) in (9) for τ=0.95,…,0.99. We note that R(τ)>0, which means that $$V_{weighted}(\tau)<V_{regular}(\tau)$$, and $$Q_{W}$$ is a better fit to the data than $$Q_{R}$$. Figure 10(b), (c) and Table 8 show the values of $$\widehat {\beta }_{1}(\tau)$$ and $$\widehat {\beta }_{2}(\tau)$$. The values of $$\widehat {\beta }_{1}(\tau)$$ and $$\widehat {\beta }_{2}(\tau)$$ are consistent with Fig. 9(a), (b) and Table 6.

Fig. 10

The proposed weighted quantile regression $$Q_{W}$$ predicts that for moderate temperatures, such as 5 °C to 10 °C, small snowfalls are likely in Buffalo, and that for very low temperatures, such as −15 °C to 0 °C, heavy snowfalls that may cause damage are more likely. Predicting heavy snowfalls is related to cold weather forecasts. Quantile regression is useful for predicting extreme heavy snowfalls.

### 5.2 CO2 emission example

Subsection 1.2 described a relationship between ln(GDP) $$x_{1}$$, ln(E.C.) $$x_{2}$$ and CO2 emissions per capita y. The least squares estimate is:

$$\mu_{LS}=-22.5009+2.0708x_{1}+1.2998x_{2}.$$

The quantile regression method can estimate high conditional quantile curves, as will be shown in detail in this section. Similar to the Buffalo snowfall example, we fit the GPD model in (4) with MLEs of the parameters, $$\widehat {\sigma }_{MLE}=5.3011$$, $$\widehat {\gamma }_{MLE}=0.1234,$$ to the CO2 emission data, which is demonstrated in Fig. 11(a), (b) by the log-log plot and histogram.
The GPD model follows the shape of the CO2 emission data very well. Table 9 shows the results of the three goodness-of-fit tests.

Fig. 11

We use the proposed weight $$\widehat {w}_{i}(x_{i},\tau)=\widehat {f}_{i}(\widehat {\xi }(\tau))$$ in (18) in the quantile regression model:

$$Q_{y}\left(\tau |x_{1},x_{2}\right)=\beta_{0}(\tau)+\beta_{1}(\tau)x_{1}+\beta_{2}(\tau)x_{2}.$$

Figure 12(a) shows the scatter plot of CO2 emission vs ln(GDP) when the country’s E.C. is 2980.96 kilowatts, with the fitted $$\mu_{LS}$$, $$Q_{R}$$ and $$Q_{W}$$ curves at the 0.97th quantile. Figure 12(b) shows the scatter plot of CO2 emission vs ln(E.C.) when the country’s GDP is 13,359.73, with the fitted $$\mu_{LS}$$, $$Q_{R}$$ and $$Q_{W}$$ curves at the 0.97th quantile. Figure 12(c) shows the 3D scatter plot with $$Q_{R}$$ (red) and $$Q_{W}$$ (green) of CO2 emission given the ln(GDP) and ln(E.C.) at τ=0.97. It is important to note that $$\mu_{LS}$$ is the red solid line and that the $$Q_{R}$$ and $$Q_{W}$$ quantile regression lines appear to fit the data. In general, the $$Q_{W}$$ line produces different estimated CO2 emissions than the $$Q_{R}$$ line at high quantiles. Tables 10 and 11 provide details about countries’ CO2 emission at the high quantile (τ=0.97) when the countries consume 2980.96 kilowatts of electricity and have a GDP of \$13,359.73 respectively.

Figure 13(a) shows the Relative R(τ) defined in (9), and Table 12 lists its values for τ≥0.95. All values of Relative R(τ) are larger than 0, which means that $$V_{weighted}(\tau)<V_{regular}(\tau)$$ and suggests that the weighted quantile regression model $$Q_{W}$$ is a better fit to the CO2 emission data than the regular quantile regression model $$Q_{R}$$. Figure 13(b) and (c) show the values of $$\widehat {\beta }_{1}(\tau)$$ and $$\widehat {\beta }_{2}(\tau)$$, respectively, for a given τ; these values are also listed in Table 13. The values of $$\widehat {\beta }_{1}(\tau)$$ and $$\widehat {\beta }_{2}(\tau)$$ are consistent with Tables 10 and 11.
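To make this comparison concrete, the sketch below computes the check-loss objective V(τ) for a linear quantile model and a relative measure of the form R(τ)=1−V_weighted(τ)/V_regular(τ). Definition (9) is not reproduced in this section, so that form is an assumption, chosen only because it agrees with the statement that R(τ)>0 exactly when V_weighted(τ)<V_regular(τ). The function names, the toy data and the two coefficient vectors are hypothetical stand-ins, not fitted values from the paper.

```python
def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def objective(y, X, beta, tau):
    """V(tau): sum of check losses of the residuals y_i - x_i^T beta."""
    total = 0.0
    for yi, xi in zip(y, X):
        fitted = sum(b * x for b, x in zip(beta, xi))
        total += check_loss(yi - fitted, tau)
    return total

def relative_r(v_weighted, v_regular):
    """Assumed form of the relative measure: 1 - V_weighted / V_regular."""
    return 1.0 - v_weighted / v_regular

# Toy data (hypothetical): an intercept column plus one covariate.
X = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
y = [1.0, 2.5, 2.0, 6.0]
tau = 0.95

v_regular = objective(y, X, (1.0, 1.0), tau)   # stand-in for a Q_R fit
v_weighted = objective(y, X, (1.0, 1.5), tau)  # stand-in for a Q_W fit
r = relative_r(v_weighted, v_regular)          # positive when V_weighted is smaller
```

At τ=0.95 a positive residual costs 0.95 per unit and a negative residual only 0.05 per unit, so a curve that leaves large observations far above it is penalised heavily. This asymmetry is why a smaller V(τ) is read as a better fit in the upper tail.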

It can be concluded that countries with higher gross domestic product and higher electricity consumption produce higher CO2 emissions. Since CO2 is not destroyed over time, it can remain in the atmosphere for thousands of years due to the very slow process by which carbon is transferred to ocean sediments. As a result, countries should monitor their CO2 emissions in order to prevent further damage to the environment. To reduce carbon emissions, countries can consider producing more energy from renewable sources, such as wind, solar, hydro and geothermal heat, and using fuels with lower carbon content.

## Conclusions

After the studies above, we can conclude:

• Traditional mean regression is concerned with estimating the conditional mean by using the L2-loss function. Quantile regression with an L1-loss function overcomes the limitations of traditional mean regression. It gives estimates of the τth conditional quantiles in addition to measures of central tendency. Estimation of high conditional quantiles is very useful for the analysis of extreme events.

• The proposed weighted quantile regression method with the local conditional density as the weight has good mathematical asymptotic properties.

• The Monte Carlo simulation results show that the proposed weighted quantile regression with the local conditional density as the weight is more efficient than the classical quantile regression and some existing weighted quantile regression estimators.

• The proposed weighted quantile regression can be used to predict extreme values, as in the real-world snowfall and CO2 emission examples. In the Buffalo snowfall example, communities can use the information that quantile regression provides to help prevent car accidents, overexertion, and the collapse of homes. In the CO2 emission example, an increase in a country’s gross domestic product and electricity consumption will likely cause an increase in its CO2 emissions. CO2 emission levels should be monitored to reduce the amount of carbon dioxide in the atmosphere and its long-term effects.

• It is difficult to estimate the proposed conditional density weight. The nonparametric kernel density estimation method used in this paper works well for this purpose. Further studies on optimal weights are suggested.
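The kernel density step mentioned in the last point can be illustrated with a minimal one-dimensional Gaussian kernel estimator. This is a generic sketch and not the paper’s estimator in (18): the normal-reference bandwidth 1.06·s·n^(−1/5), the function names and the sample values are all assumptions made for illustration.

```python
import math
import statistics

def rule_of_thumb_bandwidth(data):
    """Normal-reference bandwidth h = 1.06 * s * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-0.2)

def gaussian_kde(point, data, h):
    """Kernel density estimate (1 / (n h)) * sum K((point - d) / h), Gaussian K."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((point - d) / h) ** 2) for d in data)

# Hypothetical right-skewed sample; the estimated density at a chosen point
# plays the role of a weight in a weighted quantile regression.
sample = [0.2, 0.5, 0.9, 1.4, 1.5, 2.1, 3.8, 7.5]
h = rule_of_thumb_bandwidth(sample)
weight_near_bulk = gaussian_kde(1.0, sample, h)
weight_in_tail = gaussian_kde(7.5, sample, h)
```

In this toy sample the estimated density is larger near the bulk of the data than in the sparse upper tail, which shows how few observations support a density-based weight at a high quantile and why the choice of weight and bandwidth merits the further study suggested above.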

## References

1. Anderson, TW, Darling, DA: Asymptotic theory of certain “Goodness of fit” criteria based on stochastic processes. Ann. Math. Stat. 23(2), 193–212 (1952).

2. Carbon Dioxide Information Analysis Center (2017). http://cdiac.ornl.gov. Accessed 20 Oct 2014.

3. de Haan, L, Ferreira, A: Extreme value theory: An introduction. Springer, New York (2006).

4. European Environment Agency (2017). http://www.eea.europa.eu. Accessed 20 Oct 2014.

5. Fukunaga, K: Introduction to Statistical Pattern Recognition. Academic Press, New York (1972).

6. Hao, L, Naiman, DQ: Quantile regression, Quantitative Applications in the Social Sciences Series, Vol. 149. Sage Publications, Inc, USA (2007).

7. Huang, ML, Xu, Y, Yuen, WK: On quantile regression for extremes. In: JSM Proceedings, Statistical Computation Section, pp. 561–601. American Statistical Association, Alexandria, VA (2014).

8. Huang, ML, Xu, X, Tashnev, D: A weighted linear quantile regression. J. Stat. Comput. Simul. 85(13), 2596–2618 (2015).

9. Koenker, R: Quantile regression. Cambridge University Press, New York (2005).

10. Koenker, R, Bassett, GW: Regression quantiles. Econometrica. 46, 33–50 (1978).

11. Koenker, R, Machado, JAF: Goodness of fit and related inference processes for quantile regression. J. Am. Stat. Assoc. 94(448), 1296–1310 (1999).

12. Kolmogorov, AN: Sulla determinazione empirica di una legge di distribuzione. Giornale dell’Istituto Italiano degli Attuari. 4, 83–91 (1933).

13. Li, Q, Racine, JS: Nonparametric econometrics, theory and practice. Princeton University Press, Princeton and Oxford (2007).

14. Mardia, KV: Multivariate Pareto Distribution. Ann. Math. Stat. 33, 1008–1015 (1962).

15. National Weather Service Forecast Office (2017). http://www.weather.gov/buf. Accessed 22 Sept 2014.

16. Pickands, J: Statistical inference using extreme order statistics. Ann. Stat. 3, 119–131 (1975).

17. Scott, DW: Multivariate Density Estimation, Theory, Practice and Visualization. John Wiley & Sons, New York (1992).

18. Silverman, BW: Density estimation for statistics and data analysis. Chapman & Hall, London, UK (1986).

19. Yu, K, Lu, Z, Stander, J: Quantile regression: applications and current research areas. Statistician. 52(3), 331–350 (2003).

## Acknowledgements

We appreciate the reviewers’ and Editors’ very helpful comments, which helped to improve the paper.

This research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) grant MLH, RGPIN-2014-04621.

### Authors’ contributions

The authors MLH and CN carried out this work and drafted the manuscript together. Both authors read and approved the final manuscript.

### Competing interests

The authors declare that they have no competing interests.

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Author information

### Corresponding author

Correspondence to Mei Ling Huang.

## Rights and permissions 