Google Cloud Dataflow: Dataflow fails to set up workers

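I tested my pipeline on DirectRunner and everything worked. Now I want to run it on DataflowRunner, and it does not work. It fails before it even enters my pipeline code, and I am completely overwhelmed by the logs in Stackdriver: I do not understand what they mean and have no idea what is actually wrong.

  • The execution graph looks fine
  • The worker pool starts, and 1 worker tries to run through the setup process, but it never seems to succeed
  • Some logs that I guess might provide useful information for debugging: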

    AttributeError: 'module' object has no attribute 'NativeSource'
    /usr/bin/python failed with exit status 1
    Back-off 20s restarting failed container=python pod=dataflow-fiona-backlog-clean-test2-06140817-1629-harness-3nxh_default(50a3915d6501a3ec74d6d385f70c8353)
    checking backoff for container "python" in pod "dataflow-fiona-backlog-clean-test2-06140817-1629-harness-3nxh"
    INFO SSH key is not a complete entry: ………

How should I solve this?

Edit: my setup.py, in case it helps, was copied from an existing file; I only modified the REQUIRED_PACKAGES and setuptools.setup parts.

The worker startup log ends with the following exception:

I  /usr/bin/python failed with exit status 1 
I  /usr/bin/python failed with exit status 1 
I  AttributeError: 'module' object has no attribute 'NativeSource' 
I      class ConcatSource(iobase.NativeSource): 
I    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/concat_reader.py", line 26, in <module> 
I      from dataflow_worker import concat_reader 
I    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/maptask.py", line 31, in <module> 
I      from dataflow_worker import maptask 
I    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 26, in <module> 
I      from dataflow_worker import executor 
I    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 63, in <module> 
I      from dataflow_worker import batchworker 
I    File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/start.py", line 26, in <module> 
I      exec code in run_globals 
I    File "/usr/lib/python2.7/runpy.py", line 72, in _run_code 
I      "__main__", fname, loader, pkg_name) 
I    File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main 
I  AttributeError: 'module' object has no attribute 'NativeSource' 
I      class ConcatSource(iobase.NativeSource): 

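You appear to be using incompatible requirements in your REQUIRED_PACKAGES directive: you specify "apache-beam==2.0.0" and "google-cloud-dataflow==0.6.0", which conflict with each other. Can you try removing/uninstalling the "apache-beam" package and installing/including the "google-cloud-dataflow==2.0.0" package instead?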
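As a rough illustration of the conflict described above, the following sketch checks a REQUIRED_PACKAGES-style list for more than one SDK distribution. The helper names (`parse_pins`, `conflicting_sdk_pins`) and the `SDK_DISTRIBUTIONS` set are hypothetical and are not part of setuptools or the Beam SDK; the sketch also assumes every requirement is a plain `name==version` pin.

```python
# Hypothetical helpers for illustration only; not part of setuptools or Beam.
# Assumption: each requirement string is a simple "name==version" pin.

# The two distributions the answer identifies as conflicting when pinned together.
SDK_DISTRIBUTIONS = ("apache-beam", "google-cloud-dataflow")


def parse_pins(required_packages):
    """Map each distribution name to its pinned version (None if unpinned)."""
    pins = {}
    for requirement in required_packages:
        name, _, version = requirement.partition("==")
        pins[name.strip().lower()] = version.strip() or None
    return pins


def conflicting_sdk_pins(required_packages):
    """Return the SDK pins if more than one SDK distribution is listed."""
    pins = parse_pins(required_packages)
    found = {name: pins[name] for name in SDK_DISTRIBUTIONS if name in pins}
    return found if len(found) > 1 else {}


# Combination the answer reports for the question's setup.py.
reported = ["apache-beam==2.0.0", "google-cloud-dataflow==0.6.0"]
# Combination the answer suggests trying instead.
suggested = ["google-cloud-dataflow==2.0.0"]

print(conflicting_sdk_pins(reported))   # non-empty: both SDK distributions pinned
print(conflicting_sdk_pins(suggested))  # empty: only one SDK distribution listed
```

A check like this only flags that both distributions are listed; it does not know which version combinations are actually compatible.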

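Can you share the job ID? In Cloud Logging, some of the other logs (such as worker, worker_startup, etc.) may contain more details about why the worker is not starting.

@BenChambers JobID: 2017-06-13_14_00_46-14992846121311670805942017-06-14_08_17_20-100619994; 0805167645. I can see the exception in the worker startup log. I have added it to my post.

Working on Dataflow is really frustrating. Something that runs on my local runner can still fail on the Cloud runner. It is really confusing (not your answer, but the documentation). Every Google Dataflow doc says it is now based on Apache Beam and points me to the Beam website. Also, if I look for the GitHub project, I see that the Google Dataflow project is empty and everything goes to the Apache Beam repo. Does that mean google-cloud-dataflow is no longer the preferred package and I should install apache-beam instead?

Also, I tried uninstalling apache-beam and keeping only google-cloud-dataflow, and DirectRunner fails with: ImportError: No module named options.pipeline_options. It seems that under google-cloud-dataflow the beam package is still on an old version, but I cannot find any supporting API documentation from Google that explains which version to choose……

Thanks for the feedback. This is an important issue that we need to address, and we realize it can be confusing. We plan to address it in the next release.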
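Since the comments turn on which SDK distribution and version is actually installed, one way to check the local environment is sketched below. It assumes Python 3.8 or later for `importlib.metadata`, whereas the workers in this thread run Python 2.7; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report both SDK distributions discussed in this thread.
for name in ("apache-beam", "google-cloud-dataflow"):
    print(name, installed_version(name) or "not installed")
```

This only reports what is installed where the script runs; the packages on the Dataflow workers are determined by the job's setup.py and are not inspected here.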