(ML Specialization) (Code Lab)

Softmax regression is a generalization of binomial logistic regression to problems with multiple output classes.

Softmax function

The softmax function is used both in softmax regression and in neural networks when solving multi-class classification problems: it converts a vector of real values into a probability distribution. In simple terms, it turns raw scores into probabilities.

Given $N$ possible classes for the output $y$, the probability that $y$ belongs to class $j$ is computed through the activation function as:

$$a_j = \frac{e^{z_j}}{\sum_{k=1}^{N} e^{z_k}}$$

where

$$z_j = \mathbf{w}_j \cdot \mathbf{x} + b_j, \qquad j = 1, \dots, N$$

import numpy as np

def softmax(z):
    """ Softmax converts a vector of values to a probability distribution.
    Args:
      z (ndarray (N,))  : input data, N features
    Returns:
      a (ndarray (N,))  : softmax of z
    """
    # Subtracting max(z) before exponentiating avoids overflow for large
    # inputs and does not change the result (the factor cancels in the ratio).
    e_z = np.exp(z - np.max(z))
    a = e_z / e_z.sum(axis=0)

    return a
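As a quick usage check (the input vector here is arbitrary), the outputs are non-negative and sum to 1:

z = np.array([1., 2., 3.])
a = softmax(z)
print(a)         # [0.09003057 0.24472847 0.66524096]
print(a.sum())   # 1.0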

Cost function

The loss function used with softmax, called cross-entropy, is computed as:

$$L(\mathbf{a}, y) = -\log a_j \quad \text{when } y = j$$

That is, only the probability the model assigns to the true class contributes to the loss for that example (a minimal sketch follows the list below).

  1. When $N = 2$, softmax regression does the same thing as binomial logistic regression: $a_1 = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}}$, which is the sigmoid applied to the difference of the logits.
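As a minimal NumPy sketch of this loss (the logit values and the `cross_entropy` helper are illustrative, reusing the `softmax` defined above):

# Cross-entropy loss for one example: keep only the -log
# of the probability assigned to the true class.
def cross_entropy(z, y):
    """ z (ndarray (N,)): logits; y (int): index of the true class. """
    a = softmax(z)
    return -np.log(a[y])

print(cross_entropy(np.array([1., 2., 3.]), y=2))   # -log(0.665...) ~= 0.408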

Numerical Roundoff Errors

When we compute the loss in two separate steps, we use the intermediate value $a = \text{softmax}(z)$ and then evaluate $L = -\log a_j$. However, substituting the softmax expression for this term directly into the loss function lets the terms be rearranged, which leads to a more numerically accurate output layer:

$$L = -\log \frac{e^{z_j}}{\sum_{k=1}^{N} e^{z_k}}$$

Note that in this case a linear activation function is used for the output layer, so the network emits raw logits $z$ and the softmax is folded into the loss computation.
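To see the roundoff problem concretely, here is a minimal NumPy sketch (the logit values are arbitrary): the naive two-step computation overflows for large logits, while the rearranged log-sum-exp form stays exact.

z = np.array([1000., 0.])            # np.exp(1000.) overflows to inf
j = 0                                # true class

# Naive two-step computation: softmax first, then -log.
a = np.exp(z) / np.exp(z).sum()      # [nan, 0.] after inf/inf
print(-np.log(a[j]))                 # nan

# Rearranged: -log(e^{z_j} / sum_k e^{z_k}) = -(z_j - logsumexp(z))
m = z.max()
logsumexp = m + np.log(np.exp(z - m).sum())
print(-(z[j] - logsumexp))           # 0.0, the correct loss

TensorFlow performs this rearrangement internally when the loss is built with `from_logits=True`, as in the model below.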

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.optimizers import Adam

model = Sequential(
    [
        # tf.keras.Input(shape=(400,)),
        Dense(25, activation='relu'),
        Dense(15, activation='relu'),
        Dense(4, activation='linear')   # <-- Note: linear output, raw logits
    ]
)
model.compile(
    loss=SparseCategoricalCrossentropy(from_logits=True),  # <-- Note: softmax is applied inside the loss
    optimizer=Adam(0.001),
)

model.fit(
    X_train, y_train,   # training data with integer class labels 0..3
    epochs=10
)

# The model outputs logits, not probabilities;
# apply softmax explicitly to recover the probability distribution.
logits = model.predict(X_pred)
probs = tf.nn.softmax(logits)
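Because `from_logits=True` moves the softmax inside the loss, the network itself never produces probabilities; the `tf.nn.softmax` call above is needed whenever the predictions are to be read as a distribution. If only the predicted class is required, `np.argmax(logits, axis=1)` gives the same answer either way, since softmax is monotonic.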