(StatQuest) (ML Specialization) (Lab)

Using the idea of regression, draw a best-fit plane to predict a response variable using several features (predictors):

$$\hat{y} = \vec{w} \cdot \vec{x} + b$$

where:

  • $\hat{y}$: estimated/predicted response variable
  • vector $\vec{w}$: weights or parameters of the model
  • vector $\vec{x}$: features
  • $b$: bias (the part of the response unexplained by the model)
  • $\cdot$: dot product
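
A minimal sketch of that prediction in NumPy (the numbers are made up for illustration):

```python
import numpy as np

# Model: y_hat = w · x + b, for one example with three features
w = np.array([0.5, -1.2, 3.0])   # weight vector (one weight per feature)
x = np.array([2.0, 1.0, 0.5])    # feature vector for a single example
b = 4.0                          # bias term

y_hat = np.dot(w, x) + b         # predicted response
print(y_hat)                     # 1.0 - 1.2 + 1.5 + 4.0 = 5.3
```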

Regression coefficients vs. beta weights

A regression coefficient in multiple regression is the slope of the linear relationship between the dependent variable and the part of a predictor variable that is independent of all other predictor variables; it is therefore called a partial slope. It is difficult to compare the coefficients of different variables directly because they are measured on different scales. We can standardize the variables so that each has a standard deviation of 1. A regression weight for standardized variables is called a “beta weight”. It represents the change, in standard deviations, in the dependent variable associated with a one-standard-deviation change in the predictor, holding the other predictors constant.
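
To see the difference concretely, here is a sketch on synthetic data, with a hypothetical `fit_ols` helper (names and numbers are my own, for illustration): the raw coefficients reflect each predictor’s scale, while the beta weights from standardized variables are directly comparable.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) * np.array([1.0, 100.0])  # predictors on very different scales
y = 3.0 * X[:, 0] + 0.02 * X[:, 1] + rng.normal(size=100)

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via np.linalg.lstsq."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

raw = fit_ols(X, y)                          # raw regression coefficients
Xs = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each predictor
ys = (y - y.mean()) / y.std()                # standardize the response
beta = fit_ols(Xs, ys)                       # beta weights

print(raw[1:])   # scale-dependent, hard to compare across predictors
print(beta[1:])  # comparable: change in SDs of y per SD of each predictor
```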

Process

  1. Check assumptions
  2. Feature scaling (see the z-score sketch after this list)
  3. Compute the loss and cost functions
  4. Construct a best-fit plane using gradient descent (with regularization if needed)
  5. Calculate adjusted R-squared
  6. Calculate the p-value
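
Step 2, feature scaling, can be done with z-score normalization; a minimal sketch (the housing-style numbers are illustrative):

```python
import numpy as np

def zscore_normalize(X):
    """Rescale each feature column to mean 0 and standard deviation 1."""
    mu = np.mean(X, axis=0)      # per-feature mean
    sigma = np.std(X, axis=0)    # per-feature standard deviation
    return (X - mu) / sigma, mu, sigma

X = np.array([[2104.0, 5.0],     # e.g. house size (sq ft), bedrooms
              [1416.0, 3.0],
              [852.0,  2.0]])
X_norm, mu, sigma = zscore_normalize(X)
```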

Loss and cost functions

Mean squared error

The loss function used in regression is called mean squared error. For a single training example, the loss is the squared error:

$$L^{(i)} = \left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)^2$$

where:

  • $f_{\vec{w},b}(\vec{x}^{(i)}) = \vec{w} \cdot \vec{x}^{(i)} + b$ is the model’s prediction for the $i$-th example
  • $y^{(i)}$ is the corresponding target value

Incorporating that loss function into the cost function $J$, averaged over all $m$ examples:

$$J(\vec{w},b) = \frac{1}{2m} \sum_{i=1}^{m} \left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)^2$$
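
This cost translates directly into code; a minimal NumPy sketch (the function name is my own):

```python
import numpy as np

def compute_cost(X, y, w, b):
    """J(w, b) = (1 / 2m) * sum((w · x_i + b - y_i)^2) over all m examples."""
    m = X.shape[0]
    errors = X @ w + b - y          # f_wb(x_i) - y_i for every example at once
    return np.sum(errors ** 2) / (2 * m)
```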


Gradient descent

Gradient descent for multiple variables repeats until convergence:

$$w_j = w_j - \alpha \frac{\partial J(\vec{w},b)}{\partial w_j} \qquad \text{for } j = 0, \dots, n-1$$

$$b = b - \alpha \frac{\partial J(\vec{w},b)}{\partial b}$$

where $n$ is the number of features, the parameters $w_j$, $b$ are updated simultaneously, and

$$\frac{\partial J(\vec{w},b)}{\partial w_j} = \frac{1}{m} \sum_{i=1}^{m} \left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right) x_j^{(i)}$$

$$\frac{\partial J(\vec{w},b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)$$

  • $m$ is the number of training examples in the data set
  • $f_{\vec{w},b}(\vec{x}^{(i)})$ is the model’s prediction, while $y^{(i)}$ is the target value
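
A sketch of these simultaneous updates in NumPy (function names are for illustration):

```python
import numpy as np

def compute_gradient(X, y, w, b):
    """Partial derivatives of J(w, b) with respect to w and b."""
    m = X.shape[0]
    errors = X @ w + b - y           # shape (m,): prediction minus target
    dj_dw = X.T @ errors / m         # shape (n,): one entry per feature
    dj_db = np.sum(errors) / m
    return dj_dw, dj_db

def gradient_descent(X, y, w, b, alpha=0.01, num_iters=1000):
    """Apply the update rules for a fixed number of iterations."""
    for _ in range(num_iters):
        dj_dw, dj_db = compute_gradient(X, y, w, b)
        w = w - alpha * dj_dw        # all w_j updated simultaneously
        b = b - alpha * dj_db
    return w, b
```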

Regularized linear regression

Regularization

After adding regularization to the cost function,

$$J(\vec{w},b) = \frac{1}{2m} \sum_{i=1}^{m} \left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$

the gradient descent update for each weight becomes:

$$w_j = w_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} \left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right) x_j^{(i)} + \frac{\lambda}{m} w_j \right]$$

Rearranging shows why the regularized part shrinks each weight a little on every iteration: the update multiplies $w_j$ by $\left(1 - \alpha \frac{\lambda}{m}\right)$, a factor slightly less than 1, before subtracting the usual (unregularized) gradient step. The bias $b$ is not regularized, so its update is unchanged.
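
In code, only the weight gradient changes; a sketch reusing the shapes from the gradient descent example above:

```python
import numpy as np

def compute_gradient_reg(X, y, w, b, lambda_=1.0):
    """Gradients of the L2-regularized cost; w is regularized, b is not."""
    m = X.shape[0]
    errors = X @ w + b - y
    dj_dw = X.T @ errors / m + (lambda_ / m) * w   # extra shrinkage term
    dj_db = np.sum(errors) / m
    return dj_dw, dj_db
```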
