will change when apply_gradients is called:
with tf.GradientTape() as tape:          # record forward-pass ops for autodiff
    y_pred = model(x_batch)
    y_pred = tf.squeeze(y_pred, 1)       # drop the trailing dim of shape (batch, 1)
    loss = keras.losses.mean_squared_error(y_batch, y_pred)
    metric(y_batch, y_pred)              # update the streaming metric
grads = tape.gradient(loss, model.variables)   # d(loss)/d(variable), outside the with block
grads_and_vars = zip(grads, model.variables)   # pair each gradient with its variable
optimizer.apply_gradients(grads_and_vars)      # apply one update step
grads_and_vars stores each variable together with the gradient computed for it; apply_gradients then updates every variable in place. For a plain SGD optimizer the rule is:

variable = variable - learning_rate * gradient

Both w and b are variables, so both get updated.
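To make the update rule concrete, here is a minimal sketch with a toy one-variable model (the names w, x, y, and lr are invented for illustration, not part of the example above), showing that apply_gradients with plain SGD produces exactly variable - learning_rate * gradient:

import tensorflow as tf

w = tf.Variable(2.0)                     # a single trainable variable
x, y = 3.0, 7.0                          # one toy data point
with tf.GradientTape() as tape:
    loss = (w * x - y) ** 2              # squared error
grad = tape.gradient(loss, w)            # d(loss)/dw = 2 * x * (w*x - y) = -6.0

lr = 0.01
manual = w - lr * grad                   # the hand-written update rule: 2.0 - 0.01 * (-6.0)
tf.keras.optimizers.SGD(learning_rate=lr).apply_gradients([(grad, w)])
print(float(manual), float(w))           # both print 2.06: the same update

For optimizers with extra state (momentum, Adam, etc.) the update differs from this simple rule, but the interface is the same: apply_gradients consumes (gradient, variable) pairs and mutates the variables.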