Is it possible to load a Scala/Spark PipelineModel in PySpark?
Suppose you train a PipelineModel in Scala/Spark and save it via:
model.save("path_to_my_pipeline_model")
For some reason, you then want to load this model into PySpark via:
from pyspark.ml import Pipeline
from pyspark.ml import PipelineModel
my_model = PipelineModel.load("path_to_my_pipeline_model")
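Is this possible? If so, am I missing something in the code above?
I get an exception that I don't think provides much meaningful information, but in case it helps, I am pasting it here: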
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-15-a07e2ca0900d> in <module>
----> 1 my_model = PipelineModel.load("path_to_my_pipeline_model")
~/anaconda3/envs/env_/lib/python3.8/site-packages/pyspark/ml/util.py in load(cls, path)
328 def load(cls, path):
329 """Reads an ML instance from the input path, a shortcut of `read().load(path)`."""
--> 330 return cls.read().load(path)
331
332
~/anaconda3/envs/env_/lib/python3.8/site-packages/pyspark/ml/pipeline.py in load(self, path)
287
288 def load(self, path):
--> 289 metadata = DefaultParamsReader.loadMetadata(path, self.sc)
290 if 'language' not in metadata['paramMap'] or metadata['paramMap']['language'] != 'Python':
291 return JavaMLReader(self.cls).load(path)
...
~/anaconda3/envs/env_/lib/python3.8/site-packages/pyspark/java_gateway.py in launch_gateway(conf, popen_kwargs)
103
104 if not os.path.isfile(conn_info_file):
--> 105 raise Exception("Java gateway process exited before sending its port number")
106
107 with open(conn_info_file, "rb") as info:
Exception: Java gateway process exited before sending its port number
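Note that the traceback ends in launch_gateway, so the failure happens while PySpark is starting its JVM, before any model files are read. This message often appears when no Java runtime can be found from the Python process, though that is an assumption about the cause here, not a confirmed diagnosis. A minimal sketch for checking it with the standard library only; the helper name check_java_environment is hypothetical and not part of PySpark:

```python
import os
import shutil


def check_java_environment(env=None):
    """Report whether a Java executable looks reachable from this process.

    Hypothetical helper for illustration; it only inspects JAVA_HOME and PATH.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")

    # Look for a `java` executable on the PATH given in `env`.
    java_on_path = shutil.which("java", path=env.get("PATH", ""))

    # Look for $JAVA_HOME/bin/java (on Windows the file would be java.exe,
    # which this sketch does not check).
    java_in_home = None
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate):
            java_in_home = candidate

    return {
        "JAVA_HOME": java_home,
        "java_in_JAVA_HOME": java_in_home,
        "java_on_PATH": java_on_path,
        "looks_ok": bool(java_in_home or java_on_path),
    }


if __name__ == "__main__":
    for key, value in check_java_environment().items():
        print(f"{key}: {value}")
```

If looks_ok is False in the same environment (for example the same notebook kernel) that raised the exception, fixing the Java installation or JAVA_HOME is probably the first step before retrying PipelineModel.load.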