Python: Can't save pipeline estimator

Tags: python, scikit-learn, neural-network, keras, pipeline

I am trying to train a simple pipeline:

pipeline = Pipeline(
    [
        ('scaler', StandardScaler()),
        ('deepnc', deepnc),
    ])
where deepnc is a Keras classifier:

def create_spec_model(n_col=115, density_value=2, init='normal', dropout=0.2, learning_rate=0.005, decay=0.001,
                      momentum=0.9):
    # create model
    model = Sequential()
    model.add(Dropout(dropout, input_shape=(n_col,)))
    model.add(Dense(50 * density_value, init=init, activation='relu', W_constraint=maxnorm(2),
                    W_regularizer=l1l2(l1=0, l2=1e-4)))
    model.add(Dropout(dropout))
    model.add(Dense(30 * density_value, init=init, activation='relu', W_constraint=maxnorm(2),
                    W_regularizer=l1l2(l1=0, l2=1e-4)))
    model.add(Dropout(dropout))
    model.add(Dense(1, init=init, activation='sigmoid'))
    # load weights
    try:
        model.load_weights(spec_model_path)
    except:
        pass
    # Compile model
    sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay, nesterov=True)
    model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model
I put the pipeline into a randomized search and check a few parameters:

deepnc = KerasClassifier(build_fn=create_spec_model, validation_split=0.1, dropout=0.2, learning_rate=0.005,
                         decay=0.001, verbose=2)
# grid search epochs, batch size and optimizer
optimizers = ['adam']
init = ['uniform', 'normal']
epochs = np.array([20, 40])
batches = np.array([20, 50, 100])
learning_rate = [0.005, 0.01]
dropout = [0.2, 0.3, 0.5]
decay = [0, 0.001, 0.005, 0.01]
density_value = [1, 2, 4]
param_grid = dict(deepnc__nb_epoch=epochs, deepnc__batch_size=batches, deepnc__init=init, deepnc__dropout=dropout,
                  deepnc__learning_rate=learning_rate,
                  deepnc__density_value=density_value)
grid = RandomizedSearchCV(estimator=pipeline, param_distributions=param_grid, n_iter=100, cv=5, verbose=1,
                          scoring='accuracy', fit_params={'deepnc__callbacks': [earlyStopping, modelCheck]})

grid.fit(np.array(X_train.iloc[:, :115]), y_train)
After that, I want to save the best estimator and the best parameters:

joblib.dump(grid.best_estimator_, 'models/deepn_spec_model.pkl')
joblib.dump(grid.best_params_, 'models/deepn_spec_model_best_params.pkl')
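For some reason, the former does not work. Luckily, I ran the script in a console, so I was able to run the latter on its own and save the best parameters. However, I am still trying to figure out how to save the model. My guess is that combining Keras's scikit wrapper with Pipeline and RandomizedSearchCV is what causes the problem.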

I also tried the following code:

path = 'models/deepn_spec_model.pkl'
pickle.dump(grid.best_estimator_, open(path, 'wb'))
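But it produces the same error traceback. I have pasted a shortened version below, because the full one is extremely long and consists of the same snippet repeated over and over. Googling the error did not help. Any ideas?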

  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 606, in save_list
    self._batch_appends(iter(obj))
  File "/usr/lib/python2.7/pickle.py", line 621, in _batch_appends
    save(x)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 568, in save_tuple
    save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 655, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 669, in _batch_setitems
    save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 754, in save_global
    (obj, module, name))
PicklingError: Can't pickle <function start_console_server at 0x7f0c22d08a28>: it's not found as __main__.start_console_server

Bonus question: am I using randomized search correctly? I am not getting much improvement over my initial efforts.
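On the bonus question, it may help to compare the size of the search space with n_iter. The sketch below uses only the standard library and copies the candidate values from the question's param_grid; it is a rough count under those assumptions, not a description of how RandomizedSearchCV samples internally.

```python
from itertools import product

# Candidate values copied from the question's param_grid.
param_grid = {
    "deepnc__nb_epoch": [20, 40],
    "deepnc__batch_size": [20, 50, 100],
    "deepnc__init": ["uniform", "normal"],
    "deepnc__dropout": [0.2, 0.3, 0.5],
    "deepnc__learning_rate": [0.005, 0.01],
    "deepnc__density_value": [1, 2, 4],
}

# Number of distinct parameter combinations in the grid.
n_combinations = len(list(product(*param_grid.values())))

# Values passed to RandomizedSearchCV in the question.
n_iter, cv = 100, 5
cv_fits = n_iter * cv  # cross-validation fits, excluding the final refit

print("distinct combinations:", n_combinations)
print("cross-validation fits:", cv_fits)
```

Note that the lists optimizers and decay are defined in the question but never added to param_grid, so the search does not vary them.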

You cannot save a Keras model using joblib or pickle. Use the model's save method instead.

Or, in your case:

path = 'models/deepn_spec_model.pkl'
grid.best_estimator_.save(path)
And to load the model:

from keras.models import load_model
path = 'models/deepn_spec_model.pkl'
model = load_model(path)

Also note that, strictly speaking, this is not a pickle file but an HDF5 file, so you had better change the file extension to ".h5".
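As background on the traceback in the question: pickle stores functions by reference (module plus name), not by value, so a function that cannot be looked up again under that name cannot be pickled. Here is a minimal standard-library sketch of that failure mode. The names make_callback and start_server are hypothetical stand-ins, not the actual Keras internals.

```python
import pickle


def make_callback():
    # A nested function has no importable module-level name,
    # so pickle has nothing it can reference when saving it.
    def start_server():
        return "started"
    return start_server


# An object graph shaped like the one in the traceback: list -> tuple -> dict -> function.
state = [("callbacks", {"on_start": make_callback()})]

try:
    pickle.dumps(state)
    error = None
except (pickle.PicklingError, AttributeError) as exc:
    # The exact exception type and message vary between Python versions.
    error = exc

print(type(error).__name__, error)
```

Anything that keeps such a function somewhere in its attributes fails in the same way, no matter how deeply it is nested in containers.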

I know this is not exactly what you want, but... here goes nothing:

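Given that the callback object has already saved the best weights, all you need is the model object. The model lives in your function create_spec_model(). The only other thing you need is the best parameters. So: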

# Save parameters of the best estimator.
pickle.dump(grid.best_estimator_.named_steps['deepnc'].get_params(),open('params.pkl','wb'))
When loading, assuming the function create_spec_model() is still in your code:

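import inspect
import pickle

def load_model(params_path, weights_path):
    # Load the pickled parameter dict saved above.
    with open(params_path, 'rb') as f:
        params = pickle.load(f)
    # Keep only the arguments that create_spec_model() accepts.
    # (inspect.getargspec is the Python 2 spelling; on Python 3 use inspect.getfullargspec.)
    params = {k: params[k] for k in inspect.getargspec(create_spec_model)[0] if k in params}
    model = create_spec_model(**params)
    model.load_weights(weights_path)
    return model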

Does this help?
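The parameter-filtering step above can be sketched on its own with inspect.signature, which is available on current Python 3 versions. In this sketch create_spec_model is a stand-in that returns its arguments instead of building a Keras model, filter_build_params is a hypothetical helper name, and saved_params only imitates the kind of dict that get_params() might return.

```python
import inspect


def create_spec_model(n_col=115, density_value=2, init="normal", dropout=0.2,
                      learning_rate=0.005, decay=0.001, momentum=0.9):
    # Stand-in for the question's builder: echoes its arguments
    # instead of constructing and compiling a Keras model.
    return dict(n_col=n_col, density_value=density_value, init=init,
                dropout=dropout, learning_rate=learning_rate,
                decay=decay, momentum=momentum)


def filter_build_params(build_fn, params):
    """Keep only the entries of params that build_fn accepts as arguments."""
    accepted = inspect.signature(build_fn).parameters
    return {name: value for name, value in params.items() if name in accepted}


# Illustrative saved parameters: builder arguments mixed with fit-time settings.
saved_params = {"dropout": 0.3, "density_value": 4, "verbose": 2,
                "batch_size": 50, "nb_epoch": 40}

build_params = filter_build_params(create_spec_model, saved_params)
model_config = create_spec_model(**build_params)
print(build_params)
```

Filtering first matters because passing fit-time settings such as batch_size straight into the builder would raise a TypeError for an unexpected keyword argument.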

It is not possible to save a Keras model while it is wrapped in the Scikit classifier.

However, there is a workaround (or two)! One is to use ModelCheckpoint to save the best weights and "re-create" the model.

In my case, after doing some research, I decided to do something different and arguably easier. After training, simply do:

grid.best_estimator_.model.save(path)
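This (i.e., adding .model before save) makes sure you access the underlying Keras model, on which the save method works correctly. Now you can simply do: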

deepnc_cont = keras.models.load_model(path)
And it works, at least for me :)


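Note that if, for some reason, I needed a KerasClassifier object (the scikit wrapper), this would not work, because its constructor needs a function that builds the model, so I would probably have to follow Nassim's route? I am not sure, though.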

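Just adding another workaround for the Keras pickle problem.

It is a bit hacky, but it allows Keras models to be stored with pickle, so it works for both the model and any objects that contain it.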
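The workaround referred to above is not reproduced here. As a general illustration only, one common way to make an object picklable when it can already save itself to a file is to route pickle through that save method using __getstate__ and __setstate__. The sketch below uses a hypothetical FakeModel class in place of a Keras model and should be read as an assumption about the technique, not as the original code.

```python
import os
import pickle
import tempfile


class FakeModel:
    """Hypothetical stand-in: can save itself to a file, but holds an unpicklable attribute."""

    def __init__(self, weights):
        self.weights = weights
        self._session = lambda: None  # lambdas cannot be pickled

    def save(self, path):
        with open(path, "w") as fh:
            fh.write(",".join(str(w) for w in self.weights))

    @classmethod
    def load(cls, path):
        with open(path) as fh:
            return cls([float(w) for w in fh.read().split(",")])

    def __getstate__(self):
        # Serialize through save() into a temporary file and keep only the raw bytes.
        fd, path = tempfile.mkstemp()
        os.close(fd)
        try:
            self.save(path)
            with open(path, "rb") as fh:
                return {"blob": fh.read()}
        finally:
            os.remove(path)

    def __setstate__(self, state):
        # Write the bytes back to a temporary file and rebuild the object via load().
        fd, path = tempfile.mkstemp()
        os.close(fd)
        try:
            with open(path, "wb") as fh:
                fh.write(state["blob"])
            rebuilt = type(self).load(path)
        finally:
            os.remove(path)
        self.__dict__.update(rebuilt.__dict__)


# The model now survives pickling even when nested inside another object.
container = {"scaler_mean": 0.5, "model": FakeModel([0.1, 0.2, 0.3])}
restored = pickle.loads(pickle.dumps(container))
print(restored["model"].weights)
```

The trade-off is an extra round trip through a temporary file on every dump and load, in exchange for the container object being picklable as a whole.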


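Comments:

Unfortunately, it is not as easy as that :( Actually, grid.best_estimator_ is a Pipeline object... so I guess the real problem is that Keras wants save while the Pipeline wants dump, and there is no way to reconcile the two?

As a workaround, you can get all the estimators from named_steps ('scaler' and 'deepnc') and try to pickle and save them separately, via dump and save respectively.

How do I get them separately? Pipeline does not support indexing.

Call Pipeline.named_steps['deepnc'].save() and joblib.dump(Pipeline.named_steps['deepnc'],…path.). Use the named_steps attribute to access the inner estimators.

Now this is weird... grid.best_estimator_.named_steps['deepnc'].save("models/deepn_spec_model.h5") returns "AttributeError: 'KerasClassifier' object has no attribute 'save'". That is strange, because I have saved models in this project before.

I am upvoting because it works, and for the effort. However, it seems to me there is a better way to do this, which I will post later :)

Cool, can't wait to see it!

OK, here it is :) Since you have invested in researching this problem, please let me know what you think?

I had seen the .model solution somewhere, but I thought it would not work in your case, hence the "bad" hack. Cool that you found a way to access the keras model.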