Python scikit-learn GridSearchCV: why do I get a data type error when calling grid.fit()?

python machine-learning scikit-learn

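I have been working on a machine learning project in Python. After getting a basic neural network to run well, I tried to set up a grid search with the GridSearchCV function from sklearn to tune its parameters. The call grid.fit(X, Y) throws this error: TypeError: only size-1 arrays can be converted to Python scalars. My interpretation is that fit does not like the format of the X and Y I am giving it. This confuses me, because the network ran fine without the grid search and I have not touched the network or the data at all. Can someone explain what is happening here and how I can fix it?

This code creates the network and the grid search:
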
#Creating the neural network
def create_model():
  model=Sequential()

  model.add(Dense(512, activation='relu',input_shape=(2606,)))
  model.add(Dense(256, activation='relu'))
  model.add(Dense(128, activation='relu'))
  model.add(Dense(64, activation='relu'))
  model.add(Dense(32, activation='relu'))
  model.add(Dense(16, activation='relu'))
  model.add(Dense(1, activation='relu'))

  opt=optimizers.Adam(lr=learn_rate)
  model.compile(optimizer=opt, loss='mean_squared_error', metrics=['accuracy'])

  #I commented this out because I believe it is delegated to the grid.fit() fn later on.
  #model.fit(X_train, Y_train, batch_size=30, epochs=6000, verbose=1)

  return model

#Now setting up the grid search
model=KerasClassifier(build_fn=create_model())

learn_rate=np.arange(.00001,.001,.00002).tolist()
batch_size=np.arange(10,2606,2).tolist()
epochs=np.arange(1000,10000,100).tolist()

param_grid=dict(learn_rate=learn_rate, batch_size=batch_size, epochs=epochs)

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)

grid_results=grid.fit(X_train,Y_train) #This is the line referenced in the error message.

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

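Any suggestions would be greatly appreciated.

Edit: The X_train data has shape (167, 2606). Each of the 167 elements is an array of length 2606, which is why the network's input shape is (2606,). Y_train has shape (167,).

So, the problem is that GridSearchCV creates a new model for each parameter combination, using that combination's values. You are passing an already-built model (create_model() is called before its result is handed to KerasClassifier) together with lists of parameters. I believe this is the source of the array-versus-scalar error. Below, I have modified your code so that it runs (with some junk sample data).

The main change to note is that I changed the signature of create_model to accept the parameter values passed in by the grid search. I also removed the assignment of the KerasClassifier instance to the variable model and put that call directly into GridSearchCV as the estimator. Note that build_fn now receives the function create_model itself rather than the result of calling it.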

import numpy as np
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import optimizers
from sklearn.model_selection import GridSearchCV


#Creating the neural network
def create_model(learn_rate, batch_size, epochs):
    model=Sequential()

    model.add(Dense(512, activation='relu',input_shape=(2606,)))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(128, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(1, activation='relu'))

    opt=optimizers.Adam(lr=learn_rate)
    model.compile(optimizer=opt, loss='mean_squared_error', metrics=['accuracy'])

    #I commented this out because I believe it is delegated to the grid.fit() fn later on.
    #model.fit(X_train, Y_train, batch_size=30, epochs=6000, verbose=1)

    return model

#Now setting up the grid search
X_train = np.empty((167,2606), dtype=float, order='C')
Y_train = np.empty((167,), dtype=float, order='C')

learn_rate=np.arange(.00001,.001,.00002).tolist()
batch_size=np.arange(10,2606,2).tolist()
epochs=np.arange(1000,10000,100).tolist()

param_grid=dict(learn_rate=learn_rate, batch_size=batch_size, epochs=epochs)

grid = GridSearchCV(estimator=KerasClassifier(build_fn=create_model),
                    param_grid=param_grid, n_jobs=-1, cv=3)

grid_result=grid.fit(X_train,Y_train) #This is the line referenced in the error message.

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
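The difference between handing over a builder function and handing over its result can be illustrated without Keras. The following is only a minimal sketch of the idea: `expand_grid` is a hypothetical helper that mimics how a grid search turns lists into one scalar-valued dict per combination, and this `create_model` is a stand-in that returns a plain dict rather than a real network.

```python
import itertools


def expand_grid(param_grid):
    """Yield one dict per parameter combination; every value is a single scalar."""
    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def create_model(learn_rate):
    # Hypothetical stand-in for the Keras builder: it needs ONE number,
    # just as an optimizer needs a single learning rate.
    return {"learn_rate": float(learn_rate)}


param_grid = {"learn_rate": [1e-5, 3e-5], "epochs": [1000, 1100]}

# Passing the builder itself: it is called once per combination with a scalar.
models = [create_model(params["learn_rate"]) for params in expand_grid(param_grid)]

# Building with the whole list at once fails, because a list is not a scalar.
try:
    create_model(param_grid["learn_rate"])
    failed_with_list = False
except TypeError:
    failed_with_list = True

print(len(models), failed_with_list)
```

In the original question the global learn_rate was the full list, so the optimizer was given a list where it needed one number, which matches the kind of failure shown in the last step.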

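What are the shapes of your X_train and Y_train inputs? You are getting this error because somewhere under the hood of fit() a scalar value is expected and an array is received instead.

I added the shapes as an edit to the question. I would like to know where it expects a scalar but does not receive one. I had assumed the problem was with how fit() receives the data. I had not thought about looking further into the execution, as you suggest.

Thank you! I had completely missed the fact that I forgot to give create_model() parameters to accept. It seems to be running now. I expect it to take a long time, so I will try to post an update when it finishes. Thanks again!

A few weeks later, I have been able to get the grid search to run to completion. The biggest lesson learned is that if you are too ambitious with the parameter grid, it can take a very, very long time. I was using a Google Colab GPU, but my first 10 attempts were too ambitious to finish in a reasonable time. That was certainly a valuable lesson!

Thanks for the update. I am glad you got it running. Don't forget to mark the answer as accepted so that future users know it is reliable.

I had been wondering how to do that (noob, can you tell?), but I just found the button! Thanks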
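To see why the original grid was too ambitious, it helps to count the fits before launching the search. This is a rough sketch using only the standard library; it assumes the list lengths produced by the np.arange calls in the code above and the cv=3 setting, and it counts model fits only, not the epochs inside each fit.

```python
import math

# Sizes of the three parameter lists from the param_grid in the code above.
n_learn_rates = math.ceil((0.001 - 0.00001) / 0.00002)  # np.arange(.00001, .001, .00002)
n_batch_sizes = len(range(10, 2606, 2))                  # np.arange(10, 2606, 2)
n_epochs = len(range(1000, 10000, 100))                  # np.arange(1000, 10000, 100)

cv_folds = 3  # cv=3 in GridSearchCV

# An exhaustive grid search fits every combination once per fold.
n_combinations = n_learn_rates * n_batch_sizes * n_epochs
n_fits = n_combinations * cv_folds

print(f"{n_combinations:,} combinations, {n_fits:,} model fits")
# prints "5,841,000 combinations, 17,523,000 model fits"
```

With each fit training for at least 1000 epochs, a grid of this size cannot finish in practical time, so trimming each list to a handful of values is the first thing to try.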