The goal is to choose the parameters w and b so that the model's predictions are as close as possible to the target values in the dataset. In other words, the objective is to minimize the cost function J(w, b).
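Concretely, the cost implemented in the code below is the average squared error over the m training examples, with an extra factor of one half that simplifies the gradient later:

$$J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)^2, \qquad f_{w,b}(x^{(i)}) = w\,x^{(i)} + b$$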
```python
# Sample cost function for linear regression
import numpy as np

def compute_cost(x, y, w, b):
    """
    Computes the cost function for linear regression.

    Args:
      x (ndarray (m,)): Data, m examples
      y (ndarray (m,)): target values
      w, b (scalar)   : model parameters

    Returns:
      total_cost (float): The cost of using w, b as the parameters for
      linear regression to fit the data points in x and y
    """
    # number of training examples
    m = x.shape[0]

    cost_sum = 0
    for i in range(m):
        f_wb = w * x[i] + b           # model prediction for example i
        cost = (f_wb - y[i]) ** 2     # squared error for example i
        cost_sum = cost_sum + cost
    total_cost = (1 / (2 * m)) * cost_sum

    return total_cost

# X and y hold the training inputs and targets, assumed to be loaded earlier as 1-D arrays
J = compute_cost(X, y, w=0.0, b=0.0)
print('With w = 0, b = 0 \nCost computed = %.2f' % J)
print('Expected cost value (approximately) 32.07\n')
```
Increasing the regularization parameter λ reduces overfitting by shrinking the size of the parameters. For parameters that are driven close to zero, this effectively removes the influence of the associated features. However, an extremely large λ can lead to underfitting.
In contrast, a very small λ can leave the overfitting problem unsolved.
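To make the role of λ concrete, here is a minimal sketch of a regularized squared-error cost for linear regression with multiple features. The helper name `compute_cost_reg`, the `lambda_` argument, and the multi-feature shapes are illustrative choices, not taken from the code above:

```python
import numpy as np

def compute_cost_reg(X, y, w, b, lambda_=1.0):
    """
    Squared-error cost plus an L2 penalty on the weights.

    Args:
      X (ndarray (m, n)): m examples with n features
      y (ndarray (m,))  : target values
      w (ndarray (n,))  : feature weights
      b (scalar)        : bias
      lambda_ (scalar)  : regularization strength; larger values shrink w harder

    Returns:
      total_cost (float): data-fit cost plus regularization cost
    """
    m = X.shape[0]

    # usual squared-error term
    err = X @ w + b - y
    data_cost = (err @ err) / (2 * m)

    # L2 penalty: discourages large weights (the bias b is typically not regularized)
    reg_cost = (lambda_ / (2 * m)) * np.sum(w ** 2)

    return data_cost + reg_cost
```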
Try different values for λ, each doubling the previous:
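One way to run that sweep, as a sketch: loop over λ values that each double the previous one and evaluate the regularized cost with the `compute_cost_reg` helper sketched above. The starting value 0.01, the number of steps, and the toy data are illustrative only:

```python
import numpy as np

# Toy data so the sketch runs on its own; substitute the real X, y, w, b.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
w = rng.normal(size=3)
b = 0.5

lambda_ = 0.01
for _ in range(10):
    cost = compute_cost_reg(X, y, w, b, lambda_=lambda_)
    print(f"lambda = {lambda_:7.2f}  cost = {cost:.4f}")
    lambda_ *= 2   # each candidate doubles the previous one
```

In practice you would refit the model for each λ and compare training and validation error to see where underfitting or overfitting sets in; the loop above only illustrates the doubling schedule.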