Hi, I'm not sure exactly which snippet you mean, but I'm guessing it's this one:
# Sets up the fully-connected layer.
fc_init = tf.uniform_unit_scaling_initializer(factor=1.0)
with tf.variable_scope('fc', initializer=fc_init):
    rnn_outputs_2d = tf.reshape(rnn_outputs, [-1, hps.num_lstm_nodes[-1]])
    fc1 = tf.layers.dense(rnn_outputs_2d, hps.num_fc_nodes, name='fc1')
    fc1_dropout = tf.contrib.layers.dropout(fc1, keep_prob)
    fc1_dropout = tf.nn.relu(fc1_dropout)
    logits = tf.layers.dense(fc1_dropout, vocab_size, name='logits')

with tf.variable_scope('loss'):
    sentence_flatten = tf.reshape(sentence, [-1])
    mask_flatten = tf.reshape(mask, [-1])
    mask_sum = tf.reduce_sum(mask_flatten)
    softmax_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=sentence_flatten)
    weighted_softmax_loss = tf.multiply(softmax_loss,
                                        tf.cast(mask_flatten, tf.float32))
    prediction = tf.argmax(logits, 1, output_type=tf.int32)
    correct_prediction = tf.equal(prediction, sentence_flatten)
    correct_prediction_with_mask = tf.multiply(
        tf.cast(correct_prediction, tf.float32),
        mask_flatten)
    accuracy = tf.reduce_sum(correct_prediction_with_mask) / mask_sum
    loss = tf.reduce_sum(weighted_softmax_loss) / mask_sum
    tf.summary.scalar('loss', loss)

In this snippet, rnn_outputs_2d has shape [batch_size * num_timestamps, hps.num_lstm_nodes[-1]], i.e. a 2-D matrix. Passing it through the fully-connected layer changes only the second dimension, giving [batch_size * num_timestamps, hps.num_fc_nodes], which is still a 2-D matrix.
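The shape bookkeeping above can be checked with a small NumPy sketch (the concrete sizes here are hypothetical, chosen just for illustration; the dense layer is mimicked with a plain matrix multiply):

```python
import numpy as np

# Hypothetical sizes, not from the course code.
batch_size, num_timestamps, lstm_units, fc_units = 4, 5, 8, 16

rnn_outputs = np.zeros((batch_size, num_timestamps, lstm_units))

# reshape(..., [-1, lstm_units]) collapses the batch and time axes into one.
rnn_outputs_2d = rnn_outputs.reshape(-1, lstm_units)
print(rnn_outputs_2d.shape)  # (batch_size * num_timestamps, lstm_units) = (20, 8)

# A dense layer only changes the last dimension.
W = np.zeros((lstm_units, fc_units))
fc1 = rnn_outputs_2d @ W
print(fc1.shape)             # (batch_size * num_timestamps, fc_units) = (20, 16)
```

Because the first dimension stays batch_size * num_timestamps throughout, logits ends up with one row per token, which is exactly what the flattened labels below line up with.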
sentence originally has shape [batch_size, num_timestamps]; after flattening it becomes a 1-D vector of length batch_size * num_timestamps, so each label lines up with one row of logits.
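The point of the mask is that padded positions contribute neither to the loss nor to the accuracy. A minimal NumPy sketch with made-up per-token values (batch_size * num_timestamps = 6, where mask value 0 marks padding) mirrors the masked averaging in the loss scope:

```python
import numpy as np

# Toy flattened tensors, hypothetical values for illustration only.
per_token_loss = np.array([0.5, 1.0, 2.0, 0.5, 1.0, 2.0])  # softmax_loss
prediction     = np.array([3, 1, 4, 1, 5, 9])               # argmax of logits
sentence_flat  = np.array([3, 1, 0, 1, 5, 0])               # flattened labels
mask_flat      = np.array([1., 1., 0., 1., 1., 0.])         # 0 marks padding

mask_sum = mask_flat.sum()                                  # 4 real tokens

# Zero out padded positions, then average over real tokens only.
loss = (per_token_loss * mask_flat).sum() / mask_sum
correct = (prediction == sentence_flat).astype(np.float32)
accuracy = (correct * mask_flat).sum() / mask_sum

print(loss)      # 0.75  -- the two masked-out 2.0 losses are ignored
print(accuracy)  # 1.0   -- both wrong predictions fall on padding
```

Dividing by mask_sum rather than the full vector length is what keeps sentences of different lengths comparable: padding never dilutes the average.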