Problem building a neural network for a regression problem in Python

python machine-learning keras neural-network

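While building an ANN to predict video game sales, I ran into a problem with the accuracy and the loss. The loss is very high, at 4.3, and the accuracy stays at 0. Any help would be appreciated.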

import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model

dataset = pd.read_csv('Train.csv')
#dropping one outlier
dataset = dataset.drop(dataset[(dataset['SalesInMillions']>60)].index)


X = dataset.iloc[:,3:8]
Y = dataset['SalesInMillions'].values

dataset.drop('SalesInMillions', axis=1, inplace=True)

#getting dummy variables for categorical values - Rating, Category
print(dataset.shape) #pre-dummies shape
dataset = pd.get_dummies(data=dataset, columns=['CATEGORY', 'RATING'])
print(dataset.shape) #post-dummies shape
dataset.head() #Check to verify that dummies are ok

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
dataset['le_publisher'] = le.fit_transform(dataset['PUBLISHER'])
dataset.head()

X = dataset.iloc[:,4:]

"""#Model Building"""

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,Y, test_size=0.33, random_state=42)

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

y_train = y_train.reshape(-1, 1)
y_test = y_test.reshape(-1, 1)

model = tf.keras.models.Sequential([
                                    Dense(32, input_shape=X_train[0].shape, activation='relu'),
                                    Dense(64, activation='relu'),
                                    Dense(128, activation='relu'),
                                    Dense(1)
])

model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

model.summary()

r = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)

Result:

Epoch 1/10
74/74 [==============================] - 0s 3ms/step - loss: 5.3924 - accuracy: 0.0000e+00 - val_loss: 3.1689 - val_accuracy: 0.0000e+00
Epoch 2/10
74/74 [==============================] - 0s 3ms/step - loss: 4.7189 - accuracy: 0.0000e+00 - val_loss: 3.1634 - val_accuracy: 0.0000e+00
Epoch 3/10
74/74 [==============================] - 0s 3ms/step - loss: 4.6166 - accuracy: 0.0000e+00 - val_loss: 3.0874 - val_accuracy: 0.0000e+00
Epoch 4/10
74/74 [==============================] - 0s 2ms/step - loss: 4.5860 - accuracy: 0.0000e+00 - val_loss: 3.0585 - val_accuracy: 0.0000e+00
Epoch 5/10
74/74 [==============================] - 0s 2ms/step - loss: 4.5070 - accuracy: 0.0000e+00 - val_loss: 3.1005 - val_accuracy: 0.0000e+00

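Accuracy is not a valid metric for regression problems, because it counts only exact matches between the actual and predicted values. If your samples look like this: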

<tf.Tensor: shape=(3, 3, 3), dtype=float32, numpy=
array([[[9.522715  , 6.7740774 , 7.953182  ],
        [7.5578175 , 4.759556  , 6.3101482 ],
        [1.8602037 , 1.1430776 , 3.3622181 ]],
       [[7.2333503 , 2.1919966 , 8.573376  ],
        [8.239203  , 5.9541273 , 0.02962708],
        [2.4725473 , 5.0607405 , 3.6158872 ]],
       [[0.44838428, 9.721661  , 8.283884  ],
        [4.1458406 , 6.0166597 , 3.3958685 ],
        [5.731027  , 2.3625553 , 6.7478456 ]]], dtype=float32)>

The odds of a perfect match are about one in a million. Loss functions and metrics suited to continuous values include mae, mse, mape, and so on.
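To make the point above concrete, here is a minimal pure-Python sketch comparing an exact-match accuracy with the continuous metrics the answer names. The helper functions and the sample sales figures are hypothetical illustrations, and `exact_match_accuracy` is a simplified stand-in for the idea of "perfect match", not the exact computation Keras performs for `metrics=['accuracy']`.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of predictions that equal the target exactly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no target is 0)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions (sales in millions), for illustration only.
y_true = [2.0, 0.5, 4.0, 1.0]
y_pred = [1.9, 0.6, 3.5, 1.2]

# Every prediction is close, yet none is an exact match.
print("accuracy:", exact_match_accuracy(y_true, y_pred))
print("mae:", mae(y_true, y_pred))
print("mse:", mse(y_true, y_pred))
print("mape:", mape(y_true, y_pred))
```

In the question's Keras code, the equivalent change would presumably be to pass a continuous metric to `compile`, for example `metrics=['mae']` instead of `metrics=['accuracy']`, while keeping `loss='mse'`.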

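Thanks for the reply! I understand that, but it still doesn't explain why the loss isn't decreasing. I changed the loss to "mae", and the best loss I got was 1.18.

That's not necessarily a problem. Neural networks are not always perfect. If you add more layers and increase the number of neurons, the loss may decrease, but the model may overfit.

Since you want a regression value, set the activation of the final layer to "linear".

Leaving out the activation function for regression shouldn't matter, because the default is None, which leaves the value unchanged. But I tried "linear" anyway, and it didn't help.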
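The last comment's point, that an output layer with no activation already behaves like one with a "linear" activation, can be sketched with a toy single-unit layer in plain Python. The `dense` function below is a hypothetical stand-in written for illustration, not the Keras `Dense` layer, and the inputs and weights are made-up values.

```python
def linear(z):
    """Identity activation: returns its input unchanged."""
    return z


def relu(z):
    """Rectified linear unit, shown only for contrast."""
    return max(0.0, z)


def dense(inputs, weights, bias, activation=None):
    """One output unit: weighted sum plus bias, then an optional activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return z if activation is None else activation(z)


# Made-up inputs and parameters, for illustration only.
x = [0.2, 0.7, 0.1]
w = [1.5, -2.0, 0.5]
b = -0.3

out_none = dense(x, w, b)            # no activation
out_linear = dense(x, w, b, linear)  # explicit identity activation
out_relu = dense(x, w, b, relu)      # clips the negative pre-activation to 0

print(out_none == out_linear)  # the two outputs are identical
print(out_relu)
```

Because the identity activation changes nothing, switching the final layer from no activation to "linear" would not be expected to affect the loss, which is consistent with what the commenter observed.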