Python: loss and accuracy stuck at constant values for multi-class time-series classification

Tags: python, machine-learning, deep-learning, time-series, data-science

I have variable-length time-series data for a multi-class classification problem. The head of the data looks like this:

0   DR_24526    1   -0.261916   0.377803    1.617511    0.311707    -0.055292   0   0.740317    0   4   1.810690    -0.375699   -1.025374   0   0.806782    0.529635    -0.577077
1   DR_24526    1   0.484744    -0.262327   -0.447281   -0.497518   -0.526008   0   0.740317    0   4   1.810690    -0.618167   -1.353477   0   0.806782    0.529635    -0.577077
2   DR_24526    1   0.484744    0.484492    2.415695    1.882432    -0.565707   0   0.740317    0   4   1.810690    -0.618167   -1.353477   0   0.806782    0.529635    -0.577077
3   DR_24526    2   0.058081    0.591180    -0.415251   -0.512043   0.131860    0   0.740317    0   4   1.810690    -0.618167   -1.353477   0   0.806782    0.529635    -0.577077
4   DR_24526    1   0.591409    0.484492    1.185172    2.287045    -0.350199   0   0.740317    0   4   1.810690    -0.618167   -1.353477   0   0.806782    0.529635    -0.577077
The first column is an ID, and the groups have different lengths. I have padded and truncated the sequences so that they all have equal length:

import numpy as np
import pandas as pd
import tensorflow as tf
from tqdm import tqdm

# Collect one (length, 17) array per ID, dropping the ID column itself
sequences = list()
for name, group in tqdm(train_df.groupby(['ID'])):
    sequences.append(group.drop(columns=['ID']).values)

# Pad each sequence to the maximum length (112) by repeating its last row
to_pad = 112
new_seq = []
for one_seq in sequences:
    last_val = one_seq[-1]
    n = to_pad - len(one_seq)

    # Build an (n, 17) block whose rows are all copies of the last row
    to_concat = np.repeat(last_val, n).reshape(17, n).transpose()
    new_one_seq = np.concatenate([one_seq, to_concat])
    new_seq.append(new_one_seq)
final_seq = np.stack(new_seq)

# Truncate each padded sequence down to length 16
seq_len = 16
final_seq = tf.keras.preprocessing.sequence.pad_sequences(
    final_seq, maxlen=seq_len, padding='post', dtype='float', truncating='post')
In another DataFrame there is a target column with 3 classes: 0, 1 and 2, with one row per ID:

# One-hot encode the 3 classes
target = pd.get_dummies(train['DrivingStyle'])
target = np.asarray(target)
This is my model code:

L = tf.keras.layers

model = tf.keras.models.Sequential()
model.add(L.Bidirectional(L.LSTM(64, dropout=0.2, return_sequences=True),
                          input_shape=(seq_len, 17)))
model.add(L.Bidirectional(L.LSTM(64, dropout=0.2)))
model.add(L.Dense(3, activation='softmax'))

# adam = tf.optimizers.Adam(lr=0.1, clipvalue=0.5)
# adam = tf.keras.optimizers.Adam(lr=0.001, clipvalue=0.8)
# sgd = tf.keras.optimizers.SGD(lr=1)
sgd = tf.keras.optimizers.SGD(lr=1e-4, decay=1e-6, momentum=0.9, nesterov=True)

model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

model.fit(
    final_seq,
    target,
    epochs=10,
    batch_size=84,
    callbacks=[
        tf.keras.callbacks.ReduceLROnPlateau(patience=5),
    ]
)

But my loss and accuracy level off at a constant value:

Epoch 1/10
155/155 [==============================] - 2s 11ms/step - loss: 1.1425 - accuracy: 0.3136
Epoch 2/10
155/155 [==============================] - 2s 11ms/step - loss: 1.0670 - accuracy: 0.4461
Epoch 3/10
155/155 [==============================] - 2s 11ms/step - loss: 1.0505 - accuracy: 0.4810
Epoch 4/10
155/155 [==============================] - 2s 10ms/step - loss: 1.0463 - accuracy: 0.4882
Epoch 5/10
155/155 [==============================] - 2s 11ms/step - loss: 1.0451 - accuracy: 0.4889
Epoch 6/10
155/155 [==============================] - 2s 14ms/step - loss: 1.0437 - accuracy: 0.4904
Epoch 7/10
155/155 [==============================] - 2s 11ms/step - loss: 1.0438 - accuracy: 0.4905
Epoch 8/10
155/155 [==============================] - 2s 11ms/step - loss: 1.0426 - accuracy: 0.4920
Epoch 9/10
155/155 [==============================] - 2s 13ms/step - loss: 1.0433 - accuracy: 0.4911
Epoch 10/10
155/155 [==============================] - 2s 11ms/step - loss: 1.0419 - accuracy: 0.4909
I have tried other fixes from similar questions. I also tried 3 hidden LSTM layers with 256 units each, but nothing worked.
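A sketch of that deeper variant (the exact layer arrangement is an assumption reconstructed from the description above):

# Hypothetical deeper variant: 3 stacked LSTM layers with 256 units each.
# return_sequences=True feeds the full sequence into the next recurrent layer.
deep_model = tf.keras.models.Sequential([
    L.LSTM(256, dropout=0.2, return_sequences=True, input_shape=(seq_len, 17)),
    L.LSTM(256, dropout=0.2, return_sequences=True),
    L.LSTM(256, dropout=0.2),
    L.Dense(3, activation='softmax'),
])
deep_model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])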

Data shapes:

print(final_seq.shape)
print(target.shape)
(12994, 16, 17)
(12994, 3)

Updated answer

Your data shapes look fine. There are a few things I would change that might improve the results:

  • Lower the batch size to 16 or 32, since a lower batch size can improve accuracy
  • Use Adam as the optimizer again; I don't know whether your custom SGD settings had a bad effect on the model (see the sketch after this list)
  • Check the data before it goes into the model. There may be preprocessing problems that we cannot see here
  • Perhaps the data you have is simply not enough for a correct classification. If one class is causing the problem, you could consider reducing the number of classes you test with, since its features may be mixed up with those of the other classes
  • Since some features do not change over time, you could also try a CNN (see the second sketch below). Possibly not all features are relevant for this classification, so using fewer features might work better
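A minimal sketch of the first two suggestions, assuming the model, final_seq and target objects from the question:

# Recompile with default Adam (learning rate 1e-3) and train with a smaller batch
model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])
model.fit(final_seq, target, epochs=10, batch_size=32)

And for the CNN suggestion, a hypothetical Conv1D baseline (layer sizes are illustrative, not tuned):

# Convolve over the time axis of the (seq_len, 17) inputs
cnn = tf.keras.models.Sequential([
    L.Conv1D(64, kernel_size=3, activation='relu', input_shape=(seq_len, 17)),
    L.MaxPooling1D(pool_size=2),
    L.Conv1D(64, kernel_size=3, activation='relu'),
    L.GlobalAveragePooling1D(),
    L.Dense(3, activation='softmax'),
])
cnn.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])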

  • Why do you add a Bidirectional layer for the LSTM? Also, what is the shape of the input data? Are you sure you have enough data points for this classification to work?
  • I tried both bidirectional and unidirectional.. same result. I added the data shapes before training. target = pd.get_dummies(train['DrivingStyle']); target = np.asarray(target) converts the labels to one-hot, as shown above.
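A minimal sketch of the unidirectional variant mentioned in this comment, assuming it simply drops the Bidirectional wrappers (the exact code is a guess):

# Same stack as in the question, without the Bidirectional wrappers
uni_model = tf.keras.models.Sequential([
    L.LSTM(64, dropout=0.2, return_sequences=True, input_shape=(seq_len, 17)),
    L.LSTM(64, dropout=0.2),
    L.Dense(3, activation='softmax'),
])
uni_model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])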