I can't save (serialize) a zip file using Scikit-Learn with MLeap in Python
I tried:
#Generate data
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 5), columns=['a', 'b', 'c', 'd', 'e'])
df["y"] = (df['a'] > 0.5).astype(int)
df.head()
from mleap.sklearn.ensemble.forest import RandomForestClassifier
forestModel = RandomForestClassifier()
forestModel.mlinit(input_features='a',
                   feature_names='a',
                   prediction_column='e_binary')
forestModel.fit(df[['a']], df[['y']])
forestModel.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleaptestmodelforestpysparkzip", "randomforest.zip")
I got this error:
No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleaptestmodelforestpysparkzip/randomforest.zip.node'
I also tried: forestModel.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleaptestmodelsforestpysparkzip/randomforest.zip")
and got an error saying that the 'model_name' argument is missing.
Can you help me, please?
Here is everything I tried and the results I got:
Pipeline to zip:
1.
pipeline.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest")
=> FileNotFoundError: [Errno 2] No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip/model.json'
2.
pipeline.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest", init=True)
=> FileNotFoundError: [Errno 2] No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip/forest'
3.
pipeline.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest", init=True)
Create "/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip/forest"
=> FileNotFoundError: [Errno 2] No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip/forest'
4.
pipeline.serialize_to_bundle("/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest", init=True)
=> FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip/forest'
5.
pipeline.serialize_to_bundle("/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest", init=True)
=> OSError: [Errno 95] Operation not supported - but it saves something
6.
pipeline.serialize_to_bundle("jar:dbfs:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest", init=True)
7.
pipeline.serialize_to_bundle("jar:dbfs:/FileStore/tables/lifttruck_mleap/pipeline_zip2/1/model.zip", model_name="forest", init=True)
=> FileNotFoundError: [Errno 2] No such file or directory: 'jar:dbfs:/FileStore/tables/mleap/pipeline_zip/1/model.zip/forest'
8.
pipeline.serialize_to_bundle("dbfs:/FileStore/tables/lifttruck_mleap/pipeline_zip2/1/model.zip", model_name="forest", init=True)
=> FileNotFoundError: [Errno 2] No such file or directory: 'dbfs:/FileStore/tables/mleap/pipeline_zip2/1/model.zip/forest'
Model to zip:
forest.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/random_forest_zip/1/model.zip", model_name="forest")
forest.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/random_forest_zip/1", model_name="model.zip")
forest.serialize_to_bundle("/dbfs/FileStore/tables/mleap/random_forest_zip/1", model_name="model.zip")
=> These do not save a zip; they save a bundle instead.

I found the problem and a workaround.
Random writes are no longer possible with Databricks. A workaround is to write the zip file to the local file system and then copy it to DBFS. So:
dbutils.fs.cp(source, destination)
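
The "write locally, then copy" pattern can be sketched with the standard library alone. This is only an illustration under assumptions: the helper name write_zip_locally_then_copy, the member name bundle.json and the temporary paths are hypothetical, Python's zipfile stands in for the MLeap serialization step, and shutil.copy stands in for dbutils.fs.cp so that the sketch runs outside Databricks.

```python
import os
import shutil
import tempfile
import zipfile


def write_zip_locally_then_copy(files, destination, copy=shutil.copy):
    """Build a zip on local disk, then copy the finished file to `destination`.

    `files` maps archive member names to bytes (a stand-in for the real bundle).
    `copy` is any callable taking (source, destination); shutil.copy is a
    local placeholder for a copy such as dbutils.fs.cp on Databricks.
    """
    with tempfile.TemporaryDirectory() as local_dir:
        local_zip = os.path.join(local_dir, "model.zip")
        # Writing a zip may seek back to patch headers, which is fine on local disk.
        with zipfile.ZipFile(local_zip, "w") as archive:
            for name, payload in files.items():
                archive.writestr(name, payload)
        # The completed archive is transferred in a single copy operation.
        copy(local_zip, destination)
    return destination


# Hypothetical usage: the target directory stands in for a DBFS location.
target_dir = tempfile.mkdtemp()
target = os.path.join(target_dir, "model.zip")
write_zip_locally_then_copy({"bundle.json": b"{}"}, target)
```

On Databricks, the zip-writing step would presumably be the serialize_to_bundle call with init=True pointed at a local directory such as /tmp, and the copy step would be dbutils.fs.cp with a file:/ source and a dbfs:/ destination; the exact paths depend on your workspace.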