ImportError: No module named tensorflow_transform.beam


When I submit a Dataflow job to GCP, I get the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 766, in run
    self._load_main_session(self.local_staging_directory)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 482, in _load_main_session
    pickler.load_session(session_file)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 266, in load_session
    return dill.load_session(file_path)
  File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 402, in load_session
    module = unpickler.load()
  File "/usr/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1139, in load_reduce
    value = func(*args)
  File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 818, in _import_module
    return __import__(import_name)
ImportError: No module named tensorflow_transform

My assumption was that dependencies such as tensorflow-transform and Apache Beam come preinstalled on the workers, and this same job was working a few months ago.

Here is the solution, for anyone facing the same problem.

You need to put a setup.py file in the same directory as the file you are running, assuming that file contains all of the Beam steps:

import setuptools

# Packages listed in install_requires are installed on each Dataflow worker.
setuptools.setup(
    name='whatever-name',
    version='0.0.1',
    install_requires=[
        'apache-beam==2.10.0',
        'tensorflow-transform==0.12.0',
    ],
    packages=setuptools.find_packages(),
)

In my Python file, this line:

options = PipelineOptions()

had to be changed to:

options = PipelineOptions(setup_file="./setup.py")
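
As far as I understand it, the traceback comes from the worker unpickling the main session and failing to import tensorflow_transform, so the setup file is what tells Dataflow to install it there. If you prefer to pass the option as a command-line style flag instead of a keyword argument, here is a minimal sketch. The helper name dataflow_args_with_setup_file is hypothetical (it is not part of Apache Beam); it only builds the flag list and checks that setup.py really sits next to the pipeline script. The --setup_file flag itself is a standard Beam pipeline option.

```python
import os


def dataflow_args_with_setup_file(pipeline_script, extra_args=None):
    """Return pipeline flags that include --setup_file.

    Hypothetical helper, not part of Apache Beam: it assumes setup.py
    lives in the same directory as the pipeline script.
    """
    script_dir = os.path.dirname(os.path.abspath(pipeline_script))
    setup_path = os.path.join(script_dir, "setup.py")
    if not os.path.isfile(setup_path):
        # Fail early and locally instead of on the Dataflow worker.
        raise IOError("expected setup.py at %s" % setup_path)

    args = list(extra_args or [])
    # Do not add the flag twice if the caller already supplied one.
    if not any(a.startswith("--setup_file") for a in args):
        args.append("--setup_file=" + setup_path)
    return args
```

The resulting list can then be handed to Beam, for example options = PipelineOptions(dataflow_args_with_setup_file(__file__, sys.argv[1:])), which should have the same effect as the setup_file keyword above.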

Great solution! Google Cloud should include this "setup.py" step in their tutorials.