Correlation of Linear Regression with Neural Networks

Swayanshu Shanti Pragnya
3 min read · Jul 29, 2021


As the title suggests, in this article I will illustrate that neural networks (NNs) are essentially extensions of linear regression (LR).

Figure from [1].

Francis Galton first introduced regression analysis in the 19th century. Galton applied it to problems in genetics, but today LR is used in many different fields. The goal is to find the line that best fits a set of data points by minimizing the sum of squared prediction errors. Models that detect relationships among data are known as regression models.

LR models fit the linear function y = mx + c to the data.

Single-variate LR is used when there is only one input variable; multi-variate LR is used when there are several. In this example we consider four features: x1, x2, x3, and x4.

Each of these features is multiplied by its weight in the fit function, and the products are then summed. The c stands for “bias,” which plays the same role as the intercept in the basic linear function y = mx + c.

Four feature inputs for a multi-variate LR. The network was created by [3].

F(x) = mx + c ……(1)

F(x) = w1x1 + w2x2 + w3x3 + w4x4 + c ……(2)
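To make Equation (2) concrete, here is a minimal numeric sketch; the feature values, weights, and bias below are made up purely for illustration:

```python
import numpy as np

# Hypothetical values chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0])   # features x1..x4
w = np.array([0.5, -0.2, 0.8, 0.1])  # weights w1..w4
c = 0.3                              # bias

# Equation (2): F(x) = w1*x1 + w2*x2 + w3*x3 + w4*x4 + c
y_hat = np.dot(w, x) + c
print(y_hat)  # ~3.2
```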

To build the model, we must solve for the weights w and the bias c. Different optimization techniques can be used to accomplish this, such as minimizing the sum of squared errors. Once the weights and bias are found, we have a functional relationship between the inputs and the output.

minimize over w and c: Σi (yi − F(xi))² ……(3)
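As one such optimization technique, the least-squares problem of Equation (3) can be solved directly. A minimal sketch on synthetic data (the “true” weights, noise level, and sample count are assumptions for the demo), using numpy’s built-in solver:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                 # 100 samples, 4 features
true_w, true_c = np.array([0.5, -0.2, 0.8, 0.1]), 0.3
y = X @ true_w + true_c + 0.01 * rng.normal(size=100)  # noisy targets

# Append a column of ones so the bias c is learned as a fifth weight.
X_aug = np.hstack([X, np.ones((100, 1))])
theta, *_ = np.linalg.lstsq(X_aug, y, rcond=None)  # minimizes Eq. (3)
w_hat, c_hat = theta[:4], theta[4]
print(w_hat, c_hat)  # recovers values close to true_w and true_c
```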

The next step is to pass the entire regression model through a function; here, for example, we use the sigmoid function.

Generalized linear model. The network was created by [3].

This is also known as a “generalized linear model” (GLM). John Nelder and Robert Wedderburn developed the GLM in 1972. GLMs are commonly demonstrated with four distributions: normal, binomial, Poisson, and gamma [2]. Unlike ordinary LR analysis, a GLM can be used even when the relationship between the response and the predictors is not linear.

σ(F(x)) = 1 / (1 + e^(−(w1x1 + w2x2 + w3x3 + w4x4 + c))) ……(4)

Equation (4) contains a sigmoid function, so the model can perform a classification task. Our linear model is now sent through a function (here, the sigmoid, giving a generalized linear model), and the result can be used as the input for the next linear model. The difference between a multiple linear regression (MLR) and a perceptron is that a perceptron passes the signal generated by the multiple linear regression through a non-linear activation function.
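In code, the step from Equation (2) to Equation (4) is a single extra line. A minimal sketch of one sigmoid unit, reusing the made-up weights from the earlier example:

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: squashes any real z into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, 3.0, 4.0])   # same made-up features as before
w = np.array([0.5, -0.2, 0.8, 0.1])
c = 0.3

linear_out = np.dot(w, x) + c         # multiple linear regression, Eq. (2)
perceptron_out = sigmoid(linear_out)  # Eq. (4): one sigmoid unit
print(perceptron_out)  # ~0.96, interpretable as a class probability
```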

NNs use activation functions to add non-linearity to the model, allowing it to compute non-linear functions of the input. Without activation functions, a neural network is simply a sequence of matrix multiplications of weights with the input data, which collapses to a single linear transformation.
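This last point is easy to verify numerically: stacking two linear layers with no activation in between is equivalent to one matrix multiplication, while inserting a non-linearity (ReLU here, as an arbitrary choice) breaks that equivalence. A small sketch with random weights and arbitrarily chosen layer sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=4)           # an input with 4 features
W1 = rng.normal(size=(5, 4))     # "hidden layer" weights
W2 = rng.normal(size=(3, 5))     # "output layer" weights

two_layers = W2 @ (W1 @ x)       # two linear layers, no activation
one_layer = (W2 @ W1) @ x        # a single equivalent weight matrix
print(np.allclose(two_layers, one_layer))  # True: depth added nothing

relu = lambda z: np.maximum(z, 0)     # insert a non-linearity
with_activation = W2 @ relu(W1 @ x)   # no single matrix reproduces this
```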

Another great article (a must-read: https://joshuagoings.com/2020/05/05/neural-network/) explains the mathematical correlation between LR and NNs; this article is inspired by that work.

References:

[1] https://stats.stackexchange.com/questions/265009/neural-net-vs-multiple-linear-regression?noredirect=1&lq=1

[2] J. A. Nelder and R. W. M. Wedderburn, “Generalized Linear Models,” Journal of the Royal Statistical Society, Series A (General), vol. 135, pp. 370–384, 1972.

[3] http://alexlenail.me/NN-SVG/index.html
