Python: saving an sklearn model with a custom distance metric

I have built a kNN model with a custom distance metric, namely cosine distance:

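import numpy as np
from scipy.sparse import load_npz
from sklearn.externals import joblib  # newer scikit-learn releases: import joblib
from sklearn.neighbors import NearestNeighbors

def cosine_distance(x, y):
    # 1 - cosine similarity of two 1-D vectors
    x_module = np.sqrt(np.sum(x**2))
    y_module = np.sqrt(np.sum(y**2))
    return 1 - np.dot(x, y) / (x_module * y_module)

# load data (CSVHelper is the asker's own helper; its import is not shown)
x_feature = load_npz('data/movie_features.npz').toarray()
movies = CSVHelper.read_movie('data/IMDB_Movies_Master_data.csv')

neigh = NearestNeighbors(n_neighbors=5, metric=cosine_distance)
neigh.fit(x_feature)

# save the fitted nearest-neighbors model
joblib.dump(neigh, 'knn.pkl')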
Now, in a second script, I load the model with joblib:

knn_classifier = joblib.load('knn.pkl')
However, it raises the following error:

File "<stdin>", line 1, in <module>
  File "/home/A/deeplearning_env/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 578, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/A/deeplearning_env/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 508, in _unpickle
    obj = unpickler.load()
  File "/usr/lib/python3.5/pickle.py", line 1039, in load
    dispatch[key[0]](self)
  File "/usr/lib/python3.5/pickle.py", line 1334, in load_global
    klass = self.find_class(module, name)
  File "/usr/lib/python3.5/pickle.py", line 1388, in find_class
    return getattr(sys.modules[module], name)
AttributeError: module '__main__' has no attribute 'cosine_distance'

How do I tell joblib that I am using a custom metric? I tried adding the function cosine_distance to the same script, but it does not work.

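You need to define the function, or import it, in the script that loads the file. You can look at the alternatives here: (link not preserved)

Can you share the two scripts that save and load the pickled model?

It turns out this may be an issue with the django framework. I am actually building a django web application, and the code that loads the model lives in views.py. I added the function cosine_distance to views.py, but it does not work. However, it does work when I add the function to manage.py. Is there a better way to solve this?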
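
A possible explanation, offered as a sketch rather than a confirmed diagnosis: joblib is built on pickle, and pickle stores a plain function only as a reference (module name plus function name), not as code. A model dumped from a script run directly therefore records `__main__.cosine_distance`, as the traceback shows. Under django, `__main__` is manage.py and not views.py, which would explain why only manage.py works. The example below uses only the standard library to show the mechanism: a function that lives in an importable module is pickled under that module's name and can be restored from any entry point. The module name `custom_metrics_demo` and the pure-Python `cosine_distance` are stand-ins invented for this illustration.

```python
import importlib
import pickle
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module name used only for this demo. In a real project this
# would be an ordinary file that both the training script and views.py import.
MODULE_NAME = "custom_metrics_demo"

MODULE_SOURCE = textwrap.dedent('''
    import math

    def cosine_distance(x, y):
        """Return 1 - cosine similarity of two equal-length sequences."""
        dot = sum(a * b for a, b in zip(x, y))
        norm_x = math.sqrt(sum(a * a for a in x))
        norm_y = math.sqrt(sum(b * b for b in y))
        return 1.0 - dot / (norm_x * norm_y)
''')


def demo():
    """Pickle a function from an importable module, then restore it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, MODULE_NAME + ".py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            metrics = importlib.import_module(MODULE_NAME)

            # "Saving" side: the payload holds only the module and function
            # names, not the function body.
            payload = pickle.dumps(metrics.cosine_distance)

            # "Loading" side: simulate a fresh process that has not imported
            # the module yet. pickle re-imports it by name.
            del sys.modules[MODULE_NAME]
            restored = pickle.loads(payload)
            return payload, restored
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)


if __name__ == "__main__":
    payload, restored = demo()
    print(restored.__module__)
    print(restored([1, 0], [0, 1]))
```

Applied to the question, this suggests moving cosine_distance into a module of its own (for example a metrics.py inside the app, a name assumed here), importing it from there in the training script, and saving the model again so the pickle refers to that module instead of `__main__`. If the custom function is not strictly required, passing the built-in metric='cosine' to NearestNeighbors avoids pickling a user-defined function altogether.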