import numpy as np
import plotly.express as px
from statsmodels.nonparametric.kernel_regression import KernelReg as kr
import plotly.graph_objs as go
import pandas as pd
This notebook demonstrates how to perform kernel regression manually in Python. While Statsmodels provides a library for kernel regression, doing kernel regression by hand can help us better understand how we arrive at the final result.
First I will show how kernel regression is done using Statsmodels. Next I will show how it is done by hand, and finally overlay both plots to show that the results are the same.
To begin with, let's look at kernel regression with Statsmodels.
We generate y values using a lambda function. You can change the lambda function around to see what happens. The x values, i.e. the independent variable, are controlled by new_x, where we have displaced the x values from a regular grid so that the observations do not sit at evenly spaced intervals.
np.random.seed(1)
# xwidth controls the range of x values.
xwidth = 20
x = np.arange(0,xwidth,1)
# we want to add some noise to the x values so that they don't sit at regular intervals
x_residuals = np.random.normal(scale=0.2, size=[x.shape[0]])
# new_x is the range of x values we will be using all the way through
new_x = x + x_residuals
# We generate residuals for y values since we want to show some variation in the data
num_points = x.shape[0]
residuals = np.random.normal(scale=2.0, size=[num_points])
# We will be using fun_y to generate y values all the way through
fun_y = lambda x: -(x*x) + residuals
Let us plot the data. We are going to be using Plotly Express throughout this article for all the plotting.
# Plot the x and y values
px.scatter(x=new_x,y=fun_y(new_x), title='Figure 1: Visualizing the generated data')
Our goal is to fit a curve to the above data points using regression. How can we go about it? Using Statsmodels it is fairly simple.
bw = [1]
kmodel = kr(endog=fun_y(new_x), exog=new_x, var_type='c', bw=bw, reg_type='lc')
pred_y, marginal_effects = kmodel.fit()
The output of kernel regression in the Statsmodels nonparametric regression module is two arrays:
1) The predicted y values
2) The marginal effects
For a univariate regression problem, the marginal effects are essentially the first derivative of the predicted value with respect to the independent variable. More on marginal effects can be found in the Statsmodels documentation.
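Before plotting, it is worth taking a quick look at what fit() returned; this is just a small inspection using the pred_y, marginal_effects and kmodel objects defined above:
# Both arrays have one entry per observation in new_x
print(pred_y.shape, marginal_effects.shape)
# The bandwidth used by the model (we fixed it to 1 above via bw)
print(kmodel.bw)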
fig = px.scatter(x=new_x,y=fun_y(new_x), title='Figure 2: Statsmodels fit to generated data')
fig.add_trace(go.Scatter(x=new_x, y=pred_y, name='Statsmodels fit', mode='lines'))
To do kernel regression by hand we need to understand a few things. First, here are some of the properties of the kernel.
1) The kernel is symmetric, i.e.
$$ K(x) = K(-x)$$
2) The area under the kernel function is equal to 1, meaning
$$\int\limits_{-\infty}^{\infty} K(x)\, dx = 1 $$
We are going to use a Gaussian kernel to solve this problem. The Gaussian kernel has the form:
$$K(x) = \dfrac{1}{b \sqrt{2\pi}} e^{-\dfrac{(x - x_i)^2}{2b^2}}$$
where $b$ is the bandwidth, $x_i$ are the points from the independent variable, and $x$ is the range of values over which we define the kernel function. In our case $x_i$ comes from new_x.
kernel_x = np.arange(-xwidth,xwidth, 0.1)
bw_manual = 1
def gauss_const(h):
    """
    Returns the normalization constant for a gaussian
    """
    return 1/(h*np.sqrt(np.pi*2))

def gauss_exp(ker_x, xi, h):
    """
    Returns the gaussian function exponent term
    """
    num = -0.5*np.square(xi - ker_x)
    den = h*h
    return num/den

def kernel_function(h, ker_x, xi):
    """
    Returns the gaussian function value. Combines gauss_const and
    gauss_exp to get this result
    """
    const = gauss_const(h)
    gauss_val = const*np.exp(gauss_exp(ker_x, xi, h))
    return gauss_val
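Before using kernel_function, we can numerically check the two kernel properties stated earlier. This is only a quick sketch, approximating the integral with a Riemann sum over a finite range:
# Symmetry: K(x) = K(-x) when the kernel is centred at 0
test_x = np.arange(-10, 10, 0.1)
print(np.allclose(kernel_function(bw_manual, test_x, 0), kernel_function(bw_manual, -test_x, 0)))
# Area under the kernel should be (approximately) 1
print(np.sum(kernel_function(bw_manual, test_x, 0)) * 0.1)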
# We are selecting a single point and calculating the Kernel value
input_x = new_x[0]
col1 = gauss_const(bw_manual)
col2= gauss_exp(kernel_x, input_x, bw_manual)
col3 = kernel_function(bw_manual, kernel_x, input_x)
We build a dataframe of the kernel calculation for a single point $x_i$ and display it below.
# Dataframe for a single observation point x_i. In the code x_i comes from new_x
data = {'Input_x': [input_x for x in range(col2.shape[0])],
        'kernel_x': kernel_x,
        'gaussian_const': [col1 for x in range(col2.shape[0])],
        'gaussian_exp': col2,
        'full_gaussian_value': col3,
        'bw': [bw_manual for x in range(col2.shape[0])],
        }
single_pt_KE = pd.DataFrame(data=data)
single_pt_KE
| | Input_x | kernel_x | gaussian_const | gaussian_exp | full_gaussian_value | bw |
|---|---|---|---|---|---|---|
| 0 | 0.324869 | -20.0 | 0.398942 | -206.550151 | 7.894399e-91 | 1 |
| 1 | 0.324869 | -19.9 | 0.398942 | -204.522665 | 5.995777e-90 | 1 |
| 2 | 0.324869 | -19.8 | 0.398942 | -202.505178 | 4.508467e-89 | 1 |
| 3 | 0.324869 | -19.7 | 0.398942 | -200.497691 | 3.356366e-88 | 1 |
| 4 | 0.324869 | -19.6 | 0.398942 | -198.500204 | 2.473813e-87 | 1 |
| ... | ... | ... | ... | ... | ... | ... |
| 395 | 0.324869 | 19.5 | 0.398942 | -183.842823 | 5.740986e-81 | 1 |
| 396 | 0.324869 | 19.6 | 0.398942 | -185.765336 | 8.395560e-82 | 1 |
| 397 | 0.324869 | 19.7 | 0.398942 | -187.697849 | 1.215542e-82 | 1 |
| 398 | 0.324869 | 19.8 | 0.398942 | -189.640362 | 1.742397e-83 | 1 |
| 399 | 0.324869 | 19.9 | 0.398942 | -191.592875 | 2.472757e-84 | 1 |

400 rows × 6 columns
# Plotting the kernel function for a single input value
px.line(x=kernel_x, y=col3, title='Figure 3: Kernel function for a single input value')
We want to visualize the kernel $K(x)$ for each $x_i$. Below we calculate the kernel function values and store them in a dictionary called kernel_fns, which is converted to a dataframe kernels_df. We then use Plotly Express to plot each kernel function.
## Plotting gaussian for all input x points
kernel_fns = {'kernel_x': kernel_x}
for input_x in new_x:
    input_string = 'x_value_{}'.format(np.round(input_x, 2))
    kernel_fns[input_string] = kernel_function(bw_manual, kernel_x, input_x)
kernels_df = pd.DataFrame(data=kernel_fns)
y_all = kernels_df.drop(columns='kernel_x')
px.line(kernels_df, x='kernel_x', y=y_all.columns, title='Gaussian for all input points', range_x=[-5,20])
We will need to calculate the weight for a single input. The weight is calculated using the expression below:
$$w(x, x_i) = \dfrac{K(x - x_i)}{\sum\limits_{j=1}^{n} K(x - x_j)}$$
The above equation gives the weights for the $i^{th}$ element of new_x, where $x$ ranges over all the elements of new_x. The denominator is summed over all the points in new_x. What is interesting to note here is that you use the kernels of all the input points to calculate the weights. The denominator normalizes the weights so that, for each input point, they lie between 0 and 1 and sum to 1.
The equation above has been implemented in the function weights, which gives us the weights for a single input point. The function takes a single input point and returns a row of weights. It does this by looping over all the input points while applying the above equation.
We get the predicted value for the $i^{th}$ point from:
$$\hat{y}_{i} = y_1 w_{i1} + y_2 w_{i2} + y_3 w_{i3} + \dots + y_n w_{in}$$
This equation is implemented in the function single_y_pred. We take the dot product of the row of weights returned by the weights function and the y values from our generated data; the equation above represents that dot product.
def weights(bw_manual, input_x, all_input_values):
    """
    Returns the row of weights for a single input point
    """
    # Denominator: kernel values summed over all the input points
    ki_sum = np.sum(kernel_function(bw_manual, all_input_values, input_x))
    w_row = []
    for x_i in all_input_values:
        # Kernel value between the input point and x_i, normalized by the sum
        ki = kernel_function(bw_manual, x_i, input_x)
        w_row.append(ki/ki_sum)
    return w_row

def single_y_pred(bw_manual, input_x, new_x):
    """
    Returns the predicted y value for a single input point
    """
    w = weights(bw_manual, input_x, new_x)
    # Dot product between the row of weights and the observed y values
    y_single = np.sum(np.dot(fun_y(new_x), w))
    return y_single
ypred_single = single_y_pred(bw_manual, new_x[0], new_x)
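As a quick check of the normalization noted above (a small sketch reusing the functions just defined), the weights for any single input point should sum to 1, and the manual prediction for the first point should sit close to the corresponding Statsmodels prediction:
# The weights for a single input point sum to 1
print(np.sum(weights(bw_manual, new_x[0], new_x)))
# Manual prediction for the first point next to the statsmodels one
print(ypred_single, pred_y[0])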
The code below loops over all the input points, calculates the predicted values and appends them to Y_pred. Once we have the predicted values, all we need to do is visualize them.
Y_pred = []
for input_x in new_x:
    Y_single = single_y_pred(bw_manual, input_x, new_x)
    Y_pred.append(Y_single)
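As an aside, the same loop can be written as a single matrix product. This is a sketch that stacks the weight rows into an n × n matrix W and multiplies it by the observed y values:
# Vectorized alternative: W[i, j] is the weight of point j in the prediction for point i
W = np.array([weights(bw_manual, xi, new_x) for xi in new_x])
Y_pred_vec = W @ fun_y(new_x)  # expected to match Y_pred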
Now that we have calculated the predicted values manually, we can compare our regression curve to the one we get from Statsmodels. We overlay the fits on top of each other and find that they match perfectly.
data = {'x': new_x, 'y': fun_y(new_x), 'y_manual': np.array(Y_pred)}
fig = px.scatter(x=new_x, y=fun_y(new_x))
fig.add_trace(go.Scatter(x=new_x, y=pred_y, name='Statsmodels KR', mode='lines'))
fig.add_trace(go.Scatter(x=new_x, y=np.array(Y_pred), name='Manual KR', mode='lines'))
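Besides the visual overlay, a one-line numerical check (assuming pred_y and Y_pred are both ordered like new_x) confirms the agreement:
# The manual predictions should agree with the statsmodels fit up to floating point error
print(np.allclose(pred_y, np.array(Y_pred)))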
This article shows how we can understand the inner workings of the kernel regression algorithm with a simple example that uses generated data. If you learned something from this article, please like and share it.
Thank you for reading!