Regression Algorithm
What is Regression:
Regression is a
statistical method used to understand the relationship between a dependent
variable and one or more independent variables. By fitting a line or curve to
the data points, regression analysis allows us to predict the value of the
dependent variable based on the known values of the independent variables. This
technique is widely used for forecasting and predicting outcomes, identifying
trends, and determining the strength and nature of relationships between
variables in various fields such as finance, economics, medicine, and social
sciences.
There are several types of regression. This article covers the most basic one:
1. Simple Linear Regression:
Simple linear regression
is a statistical method used to model the relationship between a single independent
variable (predictor) and a dependent variable (outcome) by fitting a linear
equation to observed data.
The linear equation has
the form
y=β0+β1x
where y is the dependent
variable, x is the independent variable, β0 is the intercept, and β1 is the slope
of the line. The intercept represents the expected value of y when x is zero,
while the slope indicates the change in y for a one-unit change in x. This
method is commonly used for prediction and forecasting, providing insights into
how changes in the independent variable are associated with changes in the
dependent variable.
For example, suppose we want to predict a student's exam performance from the
number of hours they studied. The following table has two variables: Hours
Studied is the x variable and Exam Score is the y variable.
Hours Studied | Exam Score
1 | 50
2 | 55
3 | 65
4 | 70
5 | 75
6 | 85
7 | 90
Simple Linear Regression
We will
use simple linear regression, which fits a straight line to the data. The
formula for the line (regression equation) is:
y=β0+β1x
Where:
- y is the dependent variable (Exam Score)
- x is the independent variable (Hours Studied)
- β0 is the intercept
- β1 is the slope
Two concepts are central to the regression equation:
Intercept (β0): This is the value of
the dependent variable y when the independent variable x is zero. It represents
the baseline value of y before x makes any contribution. In practical terms, it
is the starting point of the line, and it is only meaningful to interpret when
x = 0 is a plausible value for the data.
Slope (β1): This indicates the rate of
change in the dependent variable y for each unit increase in the independent
variable x. It quantifies the direction and steepness of the relationship
between x and y. A positive slope indicates that y tends to increase as x
increases, while a negative slope indicates that y tends to decrease.
Together, β0 and β1 form
the linear equation y=β0+β1x, which is used to predict y based on the value of x
observed in the data. This model assumes a linear relationship between the
variables and is foundational in statistical analysis for understanding and
predicting outcomes based on continuous variables.
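To make the roles of the intercept and slope concrete, here is a minimal Python sketch. The function name `predict` and the coefficient values 10.0 and 2.0 are hypothetical, chosen only for illustration; they are not fitted to the exam data.

```python
def predict(x, b0, b1):
    """Return the value of the line y = b0 + b1 * x at x."""
    return b0 + b1 * x


# Hypothetical coefficients, for illustration only.
b0, b1 = 10.0, 2.0

# At x = 0 the prediction equals the intercept.
print(predict(0, b0, b1))  # 10.0

# Increasing x by one unit changes the prediction by exactly the slope.
print(predict(4, b0, b1) - predict(3, b0, b1))  # 2.0
```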
Finding the Regression Line
To find
the regression line, we need to calculate the intercept (β0) and the slope (β1).
These are calculated using the following formulas:
β1 = (n∑xy − (∑x)(∑y)) / (n∑x² − (∑x)²)
β0 = (∑y − β1∑x) / n
Where:
- n is the number of data points
- ∑xy is the sum of the products of x and y
- ∑x is the sum of the x values
- ∑y is the sum of the y values
- ∑x² is the sum of the squares of the x values
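The two formulas translate almost line for line into code. The following is a minimal sketch in plain Python, not a reference implementation; the function name `fit_line` and its error handling are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    # The four sums that appear in the formulas.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b1 = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b0 = (sum_y - b1 * sum_x) / n                    # intercept
    return b0, b1


# Points that lie exactly on y = 1 + 2x, so the fit should recover
# an intercept of 1 and a slope of 2.
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # 1.0 2.0
```

The denominator check matters because the slope is undefined when every x value is the same: no single line can be fitted through a vertical stack of points.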
Calculations
- Sum of Hours Studied (x): 1+2+3+4+5+6+7= 28
- Sum of Exam Scores (y): 50+55+65+70+75+85+90=490
- Sum of Products of Hours and Scores (xy): (1×50)+(2×55)+(3×65)+(4×70)+(5×75)+(6×85)+(7×90) = 50+110+195+280+375+510+630 = 2150
- Sum of Squares of Hours Studied (x²): 1²+2²+3²+4²+5²+6²+7² = 140
- Number of data points (n): 7
Now, plug
these values into the formulas:
β1 = (7(2150) − (28)(490)) / (7(140) − (28)²)
= (15050 − 13720) / (980 − 784)
= 1330 / 196 ≈ 6.79
β0 = (490 − (6.7857 × 28)) / 7
= (490 − 190.00) / 7 ≈ 42.86
(The unrounded slope 6.7857 is used here to avoid rounding error in the intercept.)
So the
regression equation is:
y=β0+β1x
y = 42.86 + 6.79x
Using the Regression Formula to Predict Scores
Now you can use this formula to predict a student's exam score based on
hours studied. For example, if a student studies for 5 hours:
y = 42.86 + (6.79 × 5)
y = 42.86 + 33.95
y = 76.81
This is close to the score of 75 observed in the table for 5 hours of study.
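As a cross-check on the hand calculation, the sketch below refits the line directly from the table and predicts a score for 5 hours of study. It uses the mean-centered form of the slope, b1 = ∑(x − x̄)(y − ȳ) / ∑(x − x̄)², which is algebraically equivalent to the sums formula. The names `hours`, `scores`, and `predict_score` are assumptions made for this example.

```python
# Data from the Hours Studied / Exam Score table.
hours = [1, 2, 3, 4, 5, 6, 7]
scores = [50, 55, 65, 70, 75, 85, 90]

mean_x = sum(hours) / len(hours)
mean_y = sum(scores) / len(scores)

# Mean-centered least-squares estimates of slope and intercept.
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
      / sum((x - mean_x) ** 2 for x in hours))
b0 = mean_y - b1 * mean_x


def predict_score(hours_studied):
    """Predict an exam score from hours studied using the fitted line."""
    return b0 + b1 * hours_studied


print(f"y = {b0:.2f} + {b1:.2f}x")
print(f"Predicted score for 5 hours: {predict_score(5):.2f}")

# Compare each observed score with the fitted value.
for x, y in zip(hours, scores):
    fitted = predict_score(x)
    print(f"hours={x}  observed={y}  fitted={fitted:.2f}  residual={y - fitted:+.2f}")
```

Printing the residuals is a quick sanity check: for a least-squares fit they should be small relative to the scores and sum to approximately zero.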
Download Score Data set from here (Score)