Google Cloud Dataflow: TypeError: argument of type 'NoneType' is not iterable when running a Dataflow job from Airflow
I want to run a self-executing Dataflow jar from Airflow. I get an exception when I run the following command:
"airflow test test-dag hello-dag 2018-03-26"
What am I missing? I could not find any further information about this.
Any help is greatly appreciated.
Some versions:
python 2.7.10
Airflow 1.9.0
pandas 0.22.0
Exception:
Traceback (most recent call last):
File "/Users/henry/Documents/workspace/py27venv/bin/airflow", line 27, in <module>
args.func(args)
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/bin/cli.py", line 528, in test
ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in wrapper
result = func(*args, **kwargs)
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/models.py", line 1584, in run
session=session)
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in wrapper
result = func(*args, **kwargs)
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/models.py", line 1493, in _run_raw_task
result = task_copy.execute(context=context)
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/contrib/operators/dataflow_operator.py", line 121, in execute
hook.start_java_dataflow(self.task_id, dataflow_options, self.jar)
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 149, in start_java_dataflow
task_id, variables, dataflow, name, ["java", "-jar"])
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 143, in _start_dataflow
self.get_conn(), variables['project'], name).wait_for_done()
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 31, in __init__
self._job = self._get_job()
File "/Users/henry/Documents/workspace/py27venv/lib/python2.7/site-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 49, in _get_job
if 'currentState' in job:
TypeError: argument of type 'NoneType' is not iterable
Answer:
Based on the options set for task_3, Airflow executes the following command:
java -jar helloairflow-0.0.1-SNAPSHOT.jar --autoscalingAlgorithm=BASIC --maxNumWorkers=50 --start= --partitionType=DAY ...
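Roughly, the hook turns every key/value pair in the operator's options dict into a `--key=value` flag appended after `java -jar`. The sketch below is a simplified, hypothetical reimplementation for illustration (the function name `build_dataflow_command` is made up), not the actual code in `gcp_dataflow_hook.py`:

```python
# Simplified sketch: every entry in the options dict becomes a --key=value
# flag on the java command line. Hypothetical helper, for illustration only.

def build_dataflow_command(jar_path, options):
    """Build the java invocation from a jar path and an options dict."""
    cmd = ["java", "-jar", jar_path]
    for key, value in sorted(options.items()):
        cmd.append("--{}={}".format(key, value))
    return cmd

cmd = build_dataflow_command(
    "helloairflow-0.0.1-SNAPSHOT.jar",
    {"autoscalingAlgorithm": "BASIC", "maxNumWorkers": 50},
)
```

This is why every key in the dict ends up on the jar's command line, whether or not the pipeline actually declares that option.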
However, I had not defined the properties "start" and "partitionType" in the main class packaged in helloairflow-0.0.1-SNAPSHOT.jar. I then ran the command above in another terminal and got the following exception:
java.lang.IllegalArgumentException: Class interface com.henry.cloud.dataflow.connector.MariaDBConnector$MariaDBConnOptions missing a property named 'start'.
at org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1579)
at org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:104)
at org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:291)
at com.henry.cloud.dataflow.connector.MariaDBConnector.main(MariaDBConnector.java:90)
In the end I removed those two properties from task_3's options and everything worked. The root cause: the options passed to the operator must match the pipeline options defined in the Java code. If the .jar itself throws an exception, the Dataflow job is never created, so when Airflow's hook polls for the job's state it gets back None, which produces the "argument of type 'NoneType' is not iterable" error.
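One way to catch this earlier is a pre-flight check that rejects option keys the pipeline does not declare, so the failure surfaces in the DAG with a clear message instead of the opaque NoneType error. A minimal sketch, assuming you maintain the set of declared options by hand (the names in KNOWN_OPTIONS and the helper itself are hypothetical):

```python
# Hypothetical pre-flight check: fail fast on option keys that the jar's
# PipelineOptions interface does not declare. The set below is an example;
# keep it in sync with the getters/setters in your Java options interface.

KNOWN_OPTIONS = {"project", "region", "autoscalingAlgorithm", "maxNumWorkers"}

def validate_options(options, known=KNOWN_OPTIONS):
    """Raise ValueError if any option key is not declared by the pipeline."""
    unknown = set(options) - known
    if unknown:
        raise ValueError(
            "Options not defined by the pipeline: %s" % sorted(unknown)
        )
    return options
```

Calling `validate_options({"start": "2018-03-26"})` raises immediately, whereas passing the same dict straight to the operator would only fail later inside the hook.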