Python: Can't get attribute 'ClassName' on module '__main__' from Airflow

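I am using Airflow to orchestrate my data pipeline. In one of the tasks, I am trying to load a pickled object (a RouteModel instance) from S3 by calling pickle.loads on the downloaded bytes.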

This gives me an error:

Traceback (most recent call last):
  File "/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/env/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 926, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/env/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 113, in execute
    return_value = self.execute_callable()
  File "/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/env/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 118, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/inference/predict.py", line 43, in get_pred_for_flight
    pred_state, pred_state_prob, pred_dt = tst_pipeline.get_prediction(format_pred_od)
  File "/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/inference/pipeline.py", line 174, in get_prediction
    route_model = self.rm_loader.get_model(self.rm_dict[r_key]['rm_key'])
  File "/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/inference/dataloader.py", line 40, in get_model
    route_model = read_file_from_s3(self.loc, fname)
  File "/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/inference/dataloader.py", line 96, in read_file_from_s3
    data = pickle.loads(buffer.read())
AttributeError: Can't get attribute 'RouteModel' on <module '__main__' from '/Users/cyrusghazanfar/Desktop/startup-studio/pilota_project/pilota_ml/env/bin/airflow'>
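As I understand it, when a custom class is pickled, the class must be available in the namespace of the process that reads the pickle, under the same module name that was recorded when the pickle was written. In this case that namespace is the __main__ module of the airflow executable shown in the last line of the traceback.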
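The lookup failure described above can be reproduced without Airflow or S3. The following is a minimal sketch, not the asker's code: the module name writer_script and the helper reproduce_lookup_failure are hypothetical stand-ins for the script that wrote the pickle and for the process that reads it.

```python
import pickle
import sys
import types


def reproduce_lookup_failure():
    """Pickle an object from one module, then load it where that name is absent."""
    # Hypothetical stand-in for the script that created the pickle.
    writer = types.ModuleType("writer_script")
    sys.modules[writer.__name__] = writer

    # A class that records writer_script as the module it was defined in.
    route_model_cls = type("RouteModel", (), {"__module__": writer.__name__})
    writer.RouteModel = route_model_cls

    payload = pickle.dumps(route_model_cls())

    # The payload stores the class by reference (module name plus class name),
    # not the class body itself.
    stored_by_name = b"writer_script" in payload and b"RouteModel" in payload

    # Simulate the reading process: the module exists but has no RouteModel.
    del writer.RouteModel
    try:
        pickle.loads(payload)
        error = None
    except AttributeError as exc:
        error = exc
    finally:
        del sys.modules[writer.__name__]
    return stored_by_name, error


stored_by_name, error = reproduce_lookup_failure()
print(stored_by_name)
print(type(error).__name__)
```

In the question, the role of writer_script is played by __main__: the pickle was presumably written by a script run directly, so at load time pickle looks for RouteModel on whatever __main__ happens to be, which here is the airflow entry point.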

Note:

I can't change the way the file is pickled.


Please help :)

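To solve this, I needed to write my own custom unpickler, in which I explicitly return the custom class for the specific instance that the pickle file references: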

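import io
import pickle


class CustomUnpickler(pickle.Unpickler):

    def find_class(self, module, name):
        # The pickle refers to RouteModel on __main__; return the importable class instead.
        if name == 'RouteModel':
            from inference.route_model import RouteModel
            return RouteModel
        # Every other class is resolved the normal way.
        return super().find_class(module, name)


# buffer is the object read in read_file_from_s3 (see the traceback).
data = CustomUnpickler(io.BytesIO(buffer.read())).load()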
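The same idea can be generalized so that the mapping is not hard-coded in find_class. The sketch below is one possible variation, not part of the original answer: RemappingUnpickler, loads_with_remap, make_legacy_payload and the module name old_training_script are all hypothetical names, and the RouteModel class here is a small stand-in for the real one in inference.route_model.

```python
import io
import pickle
import sys
import types


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that redirects selected (module, name) references."""

    def __init__(self, file, remap):
        super().__init__(file)
        self._remap = remap  # dict: (module, name) -> class to use instead

    def find_class(self, module, name):
        target = self._remap.get((module, name))
        if target is not None:
            return target
        # Fall back to the default lookup for everything else.
        return super().find_class(module, name)


def loads_with_remap(data, remap):
    """Like pickle.loads, but consults remap before importing anything."""
    return RemappingUnpickler(io.BytesIO(data), remap).load()


class RouteModel:
    """Stand-in for the importable class the reader wants to end up with."""


def make_legacy_payload():
    """Build a pickle whose class reference points at a module that is gone."""
    legacy = types.ModuleType("old_training_script")
    legacy_cls = type("RouteModel", (), {"__module__": legacy.__name__})
    legacy.RouteModel = legacy_cls
    sys.modules[legacy.__name__] = legacy
    try:
        obj = legacy_cls()
        obj.route = "JFK-LAX"
        return pickle.dumps(obj)
    finally:
        # After this, old_training_script can no longer be imported.
        del sys.modules[legacy.__name__]


payload = make_legacy_payload()
model = loads_with_remap(
    payload, {("old_training_script", "RouteModel"): RouteModel}
)
print(type(model).__name__, model.route)
```

Keying the mapping on both module and name, rather than on the name alone, avoids accidentally redirecting an unrelated class that happens to share the name. For the situation in the question, the key would be ("__main__", "RouteModel").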