Python: neural network building issue for a regression problem
While building an ANN to predict video game sales, I ran into problems with the accuracy and the loss. The loss is very high, at 4.3, and the accuracy stays at 0. Any help would be greatly appreciated.
import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model
dataset = pd.read_csv('Train.csv')
#dropping one outlier
dataset = dataset.drop(dataset[(dataset['SalesInMillions']>60)].index)
X = dataset.iloc[:,3:8]
Y = dataset['SalesInMillions'].values
dataset.drop('SalesInMillions', axis=1, inplace=True)
#getting dummy variables for categorical values - Rating, Category
print(dataset.shape) #pre-dummies shape
dataset = pd.get_dummies(data=dataset, columns=['CATEGORY', 'RATING'])
print(dataset.shape) #post-dummies shape
dataset.head() #Check to verify that dummies are ok
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
dataset['le_publisher'] = le.fit_transform(dataset['PUBLISHER'])
dataset.head()
X = dataset.iloc[:,4:]
"""#Model Building"""
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,Y, test_size=0.33, random_state=42)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
y_train = y_train.reshape(-1, 1)
y_test = y_test.reshape(-1, 1)
model = tf.keras.models.Sequential([
    Dense(32, input_shape=X_train[0].shape, activation='relu'),
    Dense(64, activation='relu'),
    Dense(128, activation='relu'),
    Dense(1)
])
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
model.summary()
r = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
Results:
Epoch 1/10
74/74 [==============================] - 0s 3ms/step - loss: 5.3924 - accuracy: 0.0000e+00 - val_loss: 3.1689 - val_accuracy: 0.0000e+00
Epoch 2/10
74/74 [==============================] - 0s 3ms/step - loss: 4.7189 - accuracy: 0.0000e+00 - val_loss: 3.1634 - val_accuracy: 0.0000e+00
Epoch 3/10
74/74 [==============================] - 0s 3ms/step - loss: 4.6166 - accuracy: 0.0000e+00 - val_loss: 3.0874 - val_accuracy: 0.0000e+00
Epoch 4/10
74/74 [==============================] - 0s 2ms/step - loss: 4.5860 - accuracy: 0.0000e+00 - val_loss: 3.0585 - val_accuracy: 0.0000e+00
Epoch 5/10
74/74 [==============================] - 0s 2ms/step - loss: 4.5070 - accuracy: 0.0000e+00 - val_loss: 3.1005 - val_accuracy: 0.0000e+00
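Accuracy is not a valid metric for regression problems, because it looks for an exact match between the actual and predicted values. If your samples look like this: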
<tf.Tensor: shape=(3, 3, 3), dtype=float32, numpy=
array([[[9.522715 , 6.7740774 , 7.953182 ],
[7.5578175 , 4.759556 , 6.3101482 ],
[1.8602037 , 1.1430776 , 3.3622181 ]],
[[7.2333503 , 2.1919966 , 8.573376 ],
[8.239203 , 5.9541273 , 0.02962708],
[2.4725473 , 5.0607405 , 3.6158872 ]],
[[0.44838428, 9.721661 , 8.283884 ],
[4.1458406 , 6.0166597 , 3.3958685 ],
[5.731027 , 2.3625553 , 6.7478456 ]]], dtype=float32)>
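The chance of an exact match is about one in a million. Continuous loss functions and metrics include mae, mse, mape, and so on.
Thanks for the reply! I understand that, but it still doesn't explain why the loss isn't decreasing. I changed the loss to 'mae', and the best loss I got is 1.18.
That is not necessarily a problem. Neural networks aren't always perfect. Adding more layers and more neurons may reduce the loss, but the model may then overfit.
Since you want a regression output, set the final layer's activation to 'linear'.
Would it matter if I leave out the activation function for regression? The default is None, which doesn't change the value. I tried 'linear' anyway, and it didn't help.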
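To make the point about accuracy concrete, here is a minimal sketch that compares an exact-match score with continuous error metrics on made-up data. The targets, the noise level, and the names y_true, y_pred, exact_match, mae and mse are all assumptions for illustration; none of them come from the Train.csv dataset in the question, and the exact-match score here is a hand-written stand-in for the behaviour the answer describes, not Keras' own metric.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical continuous targets (e.g. sales in millions) and noisy predictions.
y_true = [random.uniform(0.0, 10.0) for _ in range(1000)]
y_pred = [t + random.gauss(0.0, 0.5) for t in y_true]

n = len(y_true)

# Exact-match score: the fraction of predictions that equal the target exactly.
exact_match = sum(p == t for p, t in zip(y_pred, y_true)) / n

# Continuous error metrics, which are meaningful for regression.
mae = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / n
mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n

print(f"exact-match score: {exact_match:.4f}")
print(f"MAE: {mae:.4f}")
print(f"MSE: {mse:.4f}")
```

Even though every prediction is close to its target, the exact-match score stays at or near zero, while MAE and MSE reflect how close the predictions actually are. In Keras, the corresponding change would be to pass a continuous metric such as metrics=['mae'] to model.compile instead of metrics=['accuracy'].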