Airflow: Passing PyFiles and arguments to DataProcPySparkOperator (airflow, google-cloud-composer)


I am trying to pass arguments and a zipped pyfile to an ephemeral Dataproc cluster in Composer:

spark_args = {
    'conn_id': 'spark_default',
    'num_executors': 2,
    'executor_cores': 2,
    'executor_memory': '2G',
    'driver_memory': '2G',
}

task = dataproc_operator.DataProcPySparkOperator(
    task_id='spark_preprocess_{}'.format(name),
    project_id=PROJECT_ID,
    cluster_name=CLUSTER_NAME,
    region='europe-west4',
    main='gs://my-bucket/dist/main.py',
    pyfiles='gs://my-bucket/dist/jobs.zip',
    dataproc_pyspark_properties=spark_args,
    arguments=['--name', 'test', '--date', self.date_exec],
    dag=subdag
)
But I get the error below. Any idea how to format the arguments correctly?

Invalid value at 'job.pyspark_job.properties[1].value' (TYPE_STRING)

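As has been pointed out, the problem is that spark_args contains non-string values, whereas according to the error message it should contain only strings: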

Invalid value at 'job.pyspark_job.properties[1].value' (TYPE_STRING)
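
A minimal sketch of one way to apply this fix, assuming the values can simply be converted with str(). The helper name stringify_properties is hypothetical and not part of Airflow or Dataproc; the dictionary is the one from the question.

```python
def stringify_properties(properties):
    """Return a copy of `properties` in which every value is a string.

    Hypothetical helper for illustration; it is not an Airflow API.
    """
    return {key: str(value) for key, value in properties.items()}


# The dictionary from the question, still with integer values.
spark_args = {
    'conn_id': 'spark_default',
    'num_executors': 2,
    'executor_cores': 2,
    'executor_memory': '2G',
    'driver_memory': '2G',
}

# Every value is now a str, e.g. 2 becomes '2'.
spark_properties = stringify_properties(spark_args)
```

The converted dictionary would then be passed to the operator as dataproc_pyspark_properties=spark_properties. Writing the values as quoted strings directly in spark_args ('2' instead of 2) should have the same effect.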

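This is probably because you are using integers in spark_args. Properties are strictly strings, so just add quotes around the values.

That comment makes sense: conn_id is properties[0], and num_executors is properties[1], which has an integer value.

@kwn, can you confirm whether this resolves the error?

It did resolve the error. Thanks a lot, @tix!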