Python: Why is my Keras model not training at all?


My code is:

from keras.models import Sequential
from keras.layers import Dense, Dropout, Masking
import numpy as np
import pandas as pd

dataset = pd.read_csv("data/train.csv", header=0)
dataset = dataset.fillna(0)

X = dataset.drop(columns=['YearRemodAdd', "Id", "SalePrice"], axis=1)
Y = dataset[['SalePrice']]

X = pd.get_dummies(X, columns=["MSSubClass", "MSZoning",
                               "Street", "Alley", "LotShape",
                               "LandContour", "Utilities", "LotConfig",
                               "LandSlope", "Neighborhood", "Condition1",
                               "Condition2", "BldgType", "HouseStyle",
                               "YearBuilt", "RoofStyle", "RoofMatl",
                               "Exterior1st", "Exterior2nd", "MasVnrType",
                               "ExterQual", "ExterCond", "Foundation",
                               "BsmtQual", "BsmtCond", "BsmtExposure",
                               "BsmtFinType1", "BsmtFinType2", "Heating",
                               "HeatingQC", "CentralAir", "Electrical",
                               "KitchenQual", "Functional", "FireplaceQu",
                               "GarageType", "GarageFinish", "GarageQual",
                               "GarageCond", "PavedDrive", "PoolQC",
                               "Fence", "MiscFeature", "MoSold",
                               "YrSold", "SaleType", "SaleCondition"])

Ymax = Y['SalePrice'].max()
Y = Y['SalePrice'].apply(lambda x: float(x) / Ymax)

input_units = X.shape[1]
print(X)
print(Y)

model = Sequential()
model.add(Dense(input_units, input_dim=input_units, activation='relu'))
model.add(Dense(input_units, activation='relu'))
model.add(Dense(input_units, activation='relu'))

model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='adam', metrics=['mse'])
model.fit(X, Y, epochs=250, batch_size=50,
          shuffle=True, validation_split=0.05, verbose=2)

scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

My data looks like this:

Id,MSSubClass,MSZoning,LotFrontage,LotArea,Street,Alley,LotShape,LandContour,Utilities,LotConfig,LandSlope,Neighborhood,Condition1,Condition2,BldgType,HouseStyle,OverallQual,OverallCond,YearBuilt,YearRemodAdd,RoofStyle,RoofMatl,Exterior1st,Exterior2nd,MasVnrType,MasVnrArea,ExterQual,ExterCond,Foundation,BsmtQual,BsmtCond,BsmtExposure,BsmtFinType1,BsmtFinSF1,BsmtFinType2,BsmtFinSF2,BsmtUnfSF,TotalBsmtSF,Heating,HeatingQC,CentralAir,Electrical,1stFlrSF,2ndFlrSF,LowQualFinSF,GrLivArea,BsmtFullBath,BsmtHalfBath,FullBath,HalfBath,BedroomAbvGr,KitchenAbvGr,KitchenQual,TotRmsAbvGrd,Functional,Fireplaces,FireplaceQu,GarageType,GarageYrBlt,GarageFinish,GarageCars,GarageArea,GarageQual,GarageCond,PavedDrive,WoodDeckSF,OpenPorchSF,EnclosedPorch,3SsnPorch,ScreenPorch,PoolArea,PoolQC,Fence,MiscFeature,MiscVal,MoSold,YrSold,SaleType,SaleCondition,SalePrice
1,60,RL,65,8450,Pave,NA,Reg,Lvl,AllPub,Inside,Gtl,CollgCr,Norm,Norm,1Fam,2Story,7,5,2003,2003,Gable,CompShg,VinylSd,VinylSd,BrkFace,196,Gd,TA,PConc,Gd,TA,No,GLQ,706,Unf,0,150,856,GasA,Ex,Y,SBrkr,856,854,0,1710,1,0,2,1,3,1,Gd,8,Typ,0,NA,Attchd,2003,RFn,2,548,TA,TA,Y,0,61,0,0,0,0,NA,NA,NA,0,2,2008,WD,Normal,208500
2,20,RL,80,9600,Pave,NA,Reg,Lvl,AllPub,FR2,Gtl,Veenker,Feedr,Norm,1Fam,1Story,6,8,1976,1976,Gable,CompShg,MetalSd,MetalSd,None,0,TA,TA,CBlock,Gd,TA,Gd,ALQ,978,Unf,0,284,1262,GasA,Ex,Y,SBrkr,1262,0,0,1262,0,1,2,0,3,1,TA,6,Typ,1,TA,Attchd,1976,RFn,2,460,TA,TA,Y,298,0,0,0,0,0,NA,NA,NA,0,5,2007,WD,Normal,181500

My results are:

Epoch 123/250
 - 0s - loss: 3.8653 - mean_squared_error: 0.0687 - val_loss: 3.8064 - val_mean_squared_error: 0.0639
Epoch 124/250

After about two epochs, it just gets stuck there. What can I do to keep it from getting stuck so quickly?

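You seem to be working on a regression problem (i.e. predicting continuous values). There are at least two things you need to consider: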

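  • As @Mitiku noted in the comments, the data contains some NA (i.e. missing) values. This is one of the reasons the loss becomes nan. Either drop the rows that have NA values or replace the NA values with a specific value (e.g. 0). For more information on handling missing data, see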

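  • Using accuracy as a metric for a regression problem makes no sense, because it is only valid for classification tasks. Use a regression metric instead, such as mse (mean squared error) or mae (mean absolute error).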


Please apply the two points above in your code and then report how training goes; I will update this answer as needed.
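
The two points above can be sketched as follows. This is only an illustration under stated assumptions: the three-row frame stands in for data/train.csv, the "Missing" label and the prediction values are made up, and the metrics are computed by hand with NumPy so that their definitions are visible.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for data/train.csv; the values are illustrative only.
df = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 80.0],
    "Alley": [np.nan, "Grvl", np.nan],
    "SalePrice": [208500, 181500, 223500],
})

# Point 1, option a: drop rows with any missing value. Here that removes
# every row, because each row has a gap in some column.
dropped = df.dropna()

# Point 1, option b: keep all rows and replace the missing values instead.
filled = df.copy()
filled["LotFrontage"] = filled["LotFrontage"].fillna(0)
filled["Alley"] = filled["Alley"].fillna("Missing")  # hypothetical label

# Point 2: regression metrics on made-up targets and predictions.
y_true = np.array([0.28, 0.24, 0.30])
y_pred = np.array([0.25, 0.27, 0.30])
mse = np.mean((y_true - y_pred) ** 2)   # mean squared error
mae = np.mean(np.abs(y_true - y_pred))  # mean absolute error

print(len(dropped), int(filled.isna().sum().sum()))
print(mse, mae)
```

Dropping rows is only reasonable when few rows are affected. In a dataset where a column such as Alley is mostly empty, filling is probably the safer choice. In Keras the same metrics can be requested by name, for example metrics=['mse', 'mae'] in model.compile.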

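First, you should decide whether this is a regression problem or a classification problem. Then look at your target variable: it has very large values (208500 and 181500). Then look at the sigmoid activation on the output, which means the network will predict values in [0, 1]. The network cannot learn to predict such large values with this setup. You need to normalize the targets.

@MatiasValdenegro I added normalization and edited the code above. Same problem.

There are some None-type values in the dataset. Clean them up before feeding the data to the neural network.

@Mitiku There is no None in my data. What is the problem?

Updated to switch to mse and to replace the NA values.

@Shamoon So by "stuck" you mean the accuracy does not improve? Have you tried adding more layers or increasing the number of units in the existing layers? After doing that you may also need to add regularization to prevent overfitting.

I have added more layers and units. Still no dice. I even added dropout.

@Shamoon You need to consider two points:
1) Adding a lot of layers does not necessarily produce a better model. You need to be systematic in designing the model and preparing the data. For example, a good approach is to first try a very basic model, such as a simple linear regression model or a neural network with only one layer, and see how it performs. That serves as the baseline. Then increase the capacity of the model step by step, by adding more layers or trying different architectures, and compare each result with the baseline to assess the effect of every change you make.
2) You may have modeled the problem incorrectly and need to formulate it differently, or preprocess and prepare the data in a different format. The problem itself may also be hard to solve, because there may not be even a complex function that maps the inputs to the outputs perfectly or reasonably well, as in stock market prediction.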
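
A minimal sketch of the baseline-first approach from the last comment. It rests on assumptions that are not in the thread: the data is synthetic (the features, weights and noise level are invented), and the "simple linear regression" is ordinary least squares via NumPy. The target is divided by its maximum, as in the question, and the errors are converted back to price units so they can be compared.

```python
import numpy as np

# Synthetic stand-in for the housing data; every number here is made up.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([30000.0, -12000.0, 5000.0])
y = 180000.0 + X @ true_w + rng.normal(scale=1000.0, size=200)

# Scale the target by its maximum, as the question does with SalePrice.
y_max = y.max()
y_scaled = y / y_max

# Simple hold-out split: first 150 rows for fitting, the rest for validation.
X_train, X_val = X[:150], X[150:]
y_train, y_val = y_scaled[:150], y_scaled[150:]


def add_bias(a):
    """Prepend a column of ones so the fit includes an intercept."""
    return np.hstack([np.ones((a.shape[0], 1)), a])


# Baseline 1: always predict the training mean.
naive_mae = np.mean(np.abs(y_train.mean() - y_val)) * y_max

# Baseline 2: linear regression fitted by least squares.
coef, *_ = np.linalg.lstsq(add_bias(X_train), y_train, rcond=None)
pred_val = add_bias(X_val) @ coef
baseline_mae = np.mean(np.abs(pred_val - y_val)) * y_max

print("naive MAE (price units):", naive_mae)
print("linear MAE (price units):", baseline_mae)
```

Any neural network tried afterwards can be judged against these two numbers. If a deeper model cannot beat the linear baseline on the validation rows, the extra layers are probably not the missing ingredient.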