Questions tagged [regression]

This tag is for questions on (linear or nonlinear) regression, a way of describing how one variable, the outcome, is numerically related to predictor variables. The outcome is also referred to as the dependent or response variable, usually denoted $Y$, and is plotted on the vertical axis (ordinate) of a graph.

Regression is a statistical method used in finance, investing, and other disciplines that attempts to determine the strength of the relationship between one dependent variable (usually denoted $Y$) and a series of other, changing variables (known as independent variables).

Types of regression:

  • Linear regression
  • Logistic regression
  • Polynomial regression
  • Stepwise regression
  • Ridge regression
  • Lasso regression
  • ElasticNet regression
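Ridge regression, for example, differs from ordinary least squares only by an $L_2$ penalty on the coefficients. A minimal closed-form sketch in NumPy (the tiny data set and the `lam` value are made up purely for illustration):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Tiny illustrative example
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

w_ols = ridge_fit(X, y, lam=0.0)    # lam = 0 recovers ordinary least squares
w_ridge = ridge_fit(X, y, lam=1.0)  # penalty shrinks coefficients toward zero
```

Setting `lam = 0` recovers the ordinary least-squares solution; increasing it shrinks the coefficient vector toward zero, which is the defining behavior of the penalized variants in the list above.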

The two basic types of regression are simple linear regression and multiple linear regression.

The general form of each type of regression is:

  • Linear regression: $Y = a + bX + u$
  • Multiple regression: $Y = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_t X_t + u$

Where:

  • $Y =$ the variable that you are trying to predict (the dependent variable).
  • $X =$ the variable that you are using to predict $Y$ (the independent variable).
  • $a =$ the intercept.
  • $b =$ the slope.
  • $u =$ the regression residual.
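The simple linear form above can be fit by ordinary least squares. A minimal sketch in NumPy using the same symbol names ($a$ = intercept, $b$ = slope, $u$ = residuals); the synthetic data are made up for the example:

```python
import numpy as np

# Synthetic data from a "true" model y = 2 + 3x + noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=x.size)

# Design matrix: a column of ones (for the intercept a) next to x
X = np.column_stack([np.ones_like(x), x])

# Least squares solves min ||y - Xw||^2 for w = (a, b)
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)

u = y - (a + b * x)  # the regression residuals
# a comes out close to 2 and b close to 3; with an intercept in the
# model, the residuals u average to (numerically) zero.
```

The same computation generalizes directly to multiple regression: add one column per predictor $X_1, \dots, X_t$ to the design matrix.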

There are several benefits to using regression analysis:

1. It indicates whether there are significant relationships between the dependent variable and the independent variables.

2. It indicates the strength of the impact of multiple independent variables on the dependent variable.

Reference:

https://en.wikipedia.org/wiki/Regression_analysis


2545 questions
138 votes, 11 answers

What is the difference between regression and classification?

What is the difference between regression and classification, when we try to generate output for a training data set $x$?
Bober02
117 votes, 8 answers

Derivative of cost function for Logistic Regression

I am going over the lectures on Machine Learning at Coursera. I am struggling with the following. How can the partial derivative of $$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{i}\log(h_\theta(x^{i}))+(1-y^{i})\log(1-h_\theta(x^{i}))\right]$$ where…
69 votes, 11 answers

Why do we use a Least Squares fit?

I've been wondering for a while now if there's any deep mathematical or statistical significance to finding the line that minimizes the square of the errors between the line and the data points. If we use a less common method like LAD, where we…
tom
59 votes, 5 answers

Why divide by $2m$

I'm taking a machine learning course. The professor has a model for linear regression, where $h_\theta$ is the hypothesis (the proposed model: linear regression, in this case), $J(\theta_1)$ is the cost function, $m$ is the number of elements in the…
37 votes, 3 answers

Computational complexity of least square regression operation

In a least square regression algorithm, I have to do the following operations to compute regression coefficients: Matrix multiplication, complexity: $O(C^2N)$ Matrix inversion, complexity: $O(C^3)$ Matrix multiplication, complexity:…
Andree
31 votes, 5 answers

Why get the sum of squares instead of the sum of absolute values?

I'm self-studying machine learning and getting into the basics of linear regression models. From what I understand so far, a good regression model minimizes the sum of the squared differences between predicted values $h(x)$ and actual values…
user153085
23 votes, 3 answers

Finding the intersection point of many lines in 3D (point closest to all lines)

I have many lines (let's say 4) which are supposed to intersect. (Each line is represented as a pair of points.) I want to find the point in space which minimizes the sum of the square distances to all of the lines, or in other…
niro
21 votes, 3 answers

Derivation of the formula for Ordinary Least Squares Linear Regression

How was the formula for Ordinary Least Squares Linear Regression arrived at? Note I am not only looking for the proof, but also the derivation. Where did the formula come from?
user26649
21 votes, 4 answers

Correlation Coefficient and Determination Coefficient

I'm new to linear regression and am trying to teach myself. In my textbook there's a problem that asks "why is $R^{2}$ in the regression of $Y$ on $X$ equal to the square of the sample correlation between $X$ and $Y$?" I've been scratching my head…
Scubadiver
20 votes, 1 answer

On the integral $\int_{-\pi/2}^{\pi/2}\sin(x/\sin(x/\sin(x/\sin\cdots)))\,dx$

This question is the final one out of the set (see I and II), I promise! Consider $f_1(x)=\sin(x)$ and $f_2(x)=\sin\left(\frac x{f_1(x)}\right)$ such that $f_n$ satisfies the relation $$f_n(x)=\sin\left(\frac x{f_{n-1}(x)}\right).$$ To what value…
TheSimpliFire
18 votes, 4 answers

Deriving cost function using MLE: Why use log function?

I am learning machine learning from Andrew Ng's open-class notes and coursera.org. I am trying to understand how the cost function for the logistic regression is derived. I will start with the cost function for linear regression and then get to my…
cmelan
18 votes, 2 answers

Linear regression: degrees of freedom of SST, SSR, and RSS

I'm trying to understand the concept of degrees of freedom in the specific case of the three quantities involved in a linear regression solution, i.e. $SST = SSR + SSE$, i.e. total sum of squares = sum of squares due to regression + sum of squared…
Jarris
17 votes, 3 answers

Prove $SST=SSE+SSR$

Prove $$SST=SSE+SSR$$ I start with $$SST= \Sigma (y_i-\bar{y})^2=\dots=SSE+SSR+ \Sigma 2( y_i-y_i^*)(y_i^*-\bar{y} )$$ and I don't know how to prove that $\Sigma 2( y_i-y_i^*)(y_i^*-\bar{y} )=0$. A note on notation: the residual $e_i$ is…
jacob
17 votes, 5 answers

Polynomial fitting where polynomial must be monotonically increasing

Given a set of monotonically increasing data points (in 2D), I want to fit a polynomial to the data which is monotonically increasing over the domain of the data. If the highest x value is 100, I don't care what the slope of the polynomial is at…
splicer
17 votes, 5 answers

Given a data set, how do you do a sinusoidal regression on paper? What are the equations, algorithms?

Most regressions are easy, trivial once you know how to do them. Most of them involve substitutions which transform the data into a linear regression. But I have yet to figure out how to do a sinusoidal regression. I'm looking for the concept…
CogitoErgoCogitoSum