Hey Jordan, looking at problem 2.4, how do you want us to implement the neural network? Should we use one of these:
Method #1
from tensorflow import keras
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))  # output layer (assuming 10 classes, e.g. MNIST)
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=10, verbose=1, validation_data=(x_test, y_test))
Method #2
import numpy as np  # X and one-hot Y are assumed to be loaded already

alpha = 0.01  # learning rate
theta_1 = np.random.normal(0, .1, size=(2, 3)); b1 = np.zeros((1, 3))  # init weights and biases
theta_2 = np.random.normal(0, .1, size=(3, 2)); b2 = np.zeros((1, 2))
J = []
for i in range(10000):
    # forward pass
    l1 = relu(np.dot(X, theta_1) + b1)         # hidden layer: relu(X @ theta_1 + b1)
    y_hat = softmax(np.dot(l1, theta_2) + b2)  # output layer: softmax(l1 @ theta_2 + b2)
    cost = -np.sum(Y * np.log(y_hat))          # categorical cross-entropy (matches the softmax output)
    J.append(cost)                             # store cost
    # backward pass
    dJ_dZ2 = d_softmax(y_hat, Y)               # softmax + cross-entropy gradient: y_hat - Y
    dJ_dtheta2 = np.dot(l1.T, dJ_dZ2)
    dJ_db2 = np.sum(dJ_dZ2, axis=0, keepdims=True)
    dJ_dZ1 = np.dot(dJ_dZ2, theta_2.T) * d_relu(l1)
    dJ_dtheta1 = np.dot(X.T, dJ_dZ1)
    dJ_db1 = np.sum(dJ_dZ1, axis=0, keepdims=True)
    # gradient-descent updates
    theta_2 -= alpha * dJ_dtheta2
    b2 -= alpha * dJ_db2
    theta_1 -= alpha * dJ_dtheta1
    b1 -= alpha * dJ_db1
    if J[-1] == 0 or J[-1] > 10:               # stop if the cost bottoms out or blows up
        break
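In case it helps, here is a minimal sketch of the helper functions Method #2 assumes (relu, d_relu, softmax, and d_softmax are my own definitions; in particular I'm taking d_softmax to return the combined softmax-plus-cross-entropy gradient y_hat - Y):

def relu(z):
    return np.maximum(0, z)                       # elementwise max(0, z)

def d_relu(a):
    return (a > 0).astype(float)                  # 1 where the unit fired, 0 elsewhere

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # shift each row for numerical stability
    return e / e.sum(axis=1, keepdims=True)

def d_softmax(y_hat, Y):
    return y_hat - Y                              # dJ/dZ2 for softmax + cross-entropy

With those in place the loop should run end to end on any X of shape (n, 2) with one-hot Y of shape (n, 2), given the (2, 3) and (3, 2) weight shapes above.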
The issue with Method #1 is that it doesn't let us write out the explicit learning-rate update you wanted us to use, so I'm assuming you mean Method #2, but I wanted to confirm with you first. Please let me know.