Google Cloud Platform: Unable to create a Dataproc cluster; NodeInitializationAction must specify executable

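I get the error below when creating a Dataproc cluster on Google Cloud Platform. We are using the Mercury plugin with Airflow. I would like to understand what the problem is. I have tried many options but have not reached any conclusion so far.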

[2020-03-26 06:04:14,612] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:14,612] {models.py:1352} INFO - Executing <Task(GoogleCloudCreateDataprocCluster): create_medax_cluster> on 2020-03-26 06:03:40.963562
[2020-03-26 06:04:14,649] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:14,649] {gcp_api_base_hook.py:73} INFO - Getting connection using `gcloud auth` user, since no key file is defined for hook.
[2020-03-26 06:04:14,655] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:14,655] {discovery.py:267} INFO - URL being requested: GET https://www.googleapis.com/discovery/v1/apis/dataproc/v1beta2/rest
[2020-03-26 06:04:14,655] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:14,655] {transport.py:157} INFO - Attempting refresh to obtain initial access_token
[2020-03-26 06:04:14,732] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:14,731] {discovery.py:866} INFO - URL being requested: GET https://dataproc.googleapis.com/v1beta2/projects/dh-xxxxxxxx-xxxxxxxx-72410/regions/xxxxxxxx/clusters?alt=json
[2020-03-26 06:04:15,455] {base_task_runner.py:95} INFO - Subtask: File gs://xxxxxxxx-xxxxxxxx-dpl-artif/dataproc/dataproc-init.sh will not be executed on dataproc startup.
[2020-03-26 06:04:15,455] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:15,455] {discovery.py:866} INFO - URL being requested: POST https://dataproc.googleapis.com/v1beta2/projects/dh-xxxxxxxx-xxxxxxxx-72410/regions/xxxxxxxx/clusters?alt=json
[2020-03-26 06:04:20,490] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:20,490] {discovery.py:866} INFO - URL being requested: GET https://dataproc.googleapis.com/v1beta2/projects/dh-xxxxxxxx-xxxxxxxx-72410/regions/xxxxxxxx/clusters?alt=json
[2020-03-26 06:04:20,534] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:20,533] {models.py:1427} ERROR - <HttpError 400 when requesting https://dataproc.googleapis.com/v1beta2/projects/dh-xxxxxxxx-xxxxxxxx-72410/regions/xxxxxxxx/clusters?alt=json returned "Multiple validation errors:
[2020-03-26 06:04:20,535] {base_task_runner.py:95} INFO - Subtask:  - NodeInitializationAction must specify executable
[2020-03-26 06:04:20,535] {base_task_runner.py:95} INFO - Subtask:  - Object URI '' is not a valid GCS URI">
[2020-03-26 06:04:20,535] {base_task_runner.py:95} INFO - Subtask: Traceback (most recent call last):
[2020-03-26 06:04:20,535] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1384, in run
[2020-03-26 06:04:20,535] {base_task_runner.py:95} INFO - Subtask:     result = task_copy.execute(context=context)
[2020-03-26 06:04:20,535] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/airflow/plugins/mercury_plugins.py", line 1516, in execute
[2020-03-26 06:04:20,535] {base_task_runner.py:95} INFO - Subtask:     raise e
[2020-03-26 06:04:20,536] {base_task_runner.py:95} INFO - Subtask: HttpError: <HttpError 400 when requesting https://dataproc.googleapis.com/v1beta2/projects/dh-xxxxxxxx-xxxxxxxx-72410/regions/xxxxxxxx/clusters?alt=json returned "Multiple validation errors:
[2020-03-26 06:04:20,536] {base_task_runner.py:95} INFO - Subtask:  - NodeInitializationAction must specify executable
[2020-03-26 06:04:20,536] {base_task_runner.py:95} INFO - Subtask:  - Object URI '' is not a valid GCS URI">
[2020-03-26 06:04:20,536] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:20,534] {models.py:1451} INFO - Marking task as FAILED.
[2020-03-26 06:04:20,537] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:20,537] {configuration.py:609} WARNING - section/key [smtp/smtp_user] not found in config
[2020-03-26 06:04:20,538] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:20,538] {models.py:1466} ERROR - Failed at executing callback
[2020-03-26 06:04:20,539] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:20,538] {models.py:1467} ERROR - [Errno 99] Cannot assign requested address
[2020-03-26 06:04:20,539] {base_task_runner.py:95} INFO - Subtask: Traceback (most recent call last):
[2020-03-26 06:04:20,539] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1464, in handle_failure
[2020-03-26 06:04:20,539] {base_task_runner.py:95} INFO - Subtask:     task.on_failure_callback(context)
[2020-03-26 06:04:20,539] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/airflow/dags/notification.py", line 35, in on_failure_callback
[2020-03-26 06:04:20,539] {base_task_runner.py:95} INFO - Subtask:     return operator.execute(context=context)
[2020-03-26 06:04:20,540] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/operators/email_operator.py", line 64, in execute
[2020-03-26 06:04:20,540] {base_task_runner.py:95} INFO - Subtask:     send_email(self.to, self.subject, self.html_content, files=self.files, cc=self.cc, bcc=self.bcc, mime_subtype=self.mime_subtype)
[2020-03-26 06:04:20,540] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/utils/email.py", line 44, in send_email
[2020-03-26 06:04:20,540] {base_task_runner.py:95} INFO - Subtask:     return backend(to, subject, html_content, files=files, dryrun=dryrun, cc=cc, bcc=bcc, mime_subtype=mime_subtype)
[2020-03-26 06:04:20,540] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/utils/email.py", line 86, in send_email_smtp
[2020-03-26 06:04:20,541] {base_task_runner.py:95} INFO - Subtask:     send_MIME_email(SMTP_MAIL_FROM, recipients, msg, dryrun)
[2020-03-26 06:04:20,541] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/utils/email.py", line 104, in send_MIME_email
[2020-03-26 06:04:20,541] {base_task_runner.py:95} INFO - Subtask:     s = smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT) if SMTP_SSL else smtplib.SMTP(SMTP_HOST, SMTP_PORT)
[2020-03-26 06:04:20,541] {base_task_runner.py:95} INFO - Subtask:   File "/usr/lib/python2.7/smtplib.py", line 256, in __init__
[2020-03-26 06:04:20,542] {base_task_runner.py:95} INFO - Subtask:     (code, msg) = self.connect(host, port)
[2020-03-26 06:04:20,542] {base_task_runner.py:95} INFO - Subtask:   File "/usr/lib/python2.7/smtplib.py", line 316, in connect
[2020-03-26 06:04:20,542] {base_task_runner.py:95} INFO - Subtask:     self.sock = self._get_socket(host, port, self.timeout)
[2020-03-26 06:04:20,542] {base_task_runner.py:95} INFO - Subtask:   File "/usr/lib/python2.7/smtplib.py", line 291, in _get_socket
[2020-03-26 06:04:20,542] {base_task_runner.py:95} INFO - Subtask:     return socket.create_connection((host, port), timeout)
[2020-03-26 06:04:20,543] {base_task_runner.py:95} INFO - Subtask:   File "/usr/lib/python2.7/socket.py", line 575, in create_connection
[2020-03-26 06:04:20,543] {base_task_runner.py:95} INFO - Subtask:     raise err
[2020-03-26 06:04:20,543] {base_task_runner.py:95} INFO - Subtask: error: [Errno 99] Cannot assign requested address
[2020-03-26 06:04:20,553] {base_task_runner.py:95} INFO - Subtask: [2020-03-26 06:04:20,553] {models.py:1472} ERROR - <HttpError 400 when requesting https://dataproc.googleapis.com/v1beta2/projects/dh-xxxxxxxx-xxxxxxxx-72410/regions/xxxxxxxx/clusters?alt=json returned "Multiple validation errors:
[2020-03-26 06:04:20,553] {base_task_runner.py:95} INFO - Subtask:  - NodeInitializationAction must specify executable
[2020-03-26 06:04:20,554] {base_task_runner.py:95} INFO - Subtask:  - Object URI '' is not a valid GCS URI">
[2020-03-26 06:04:20,554] {base_task_runner.py:95} INFO - Subtask: Traceback (most recent call last):
[2020-03-26 06:04:20,554] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/bin/airflow", line 28, in <module>
[2020-03-26 06:04:20,554] {base_task_runner.py:95} INFO - Subtask:     args.func(args)
[2020-03-26 06:04:20,554] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 422, in run
[2020-03-26 06:04:20,554] {base_task_runner.py:95} INFO - Subtask:     pool=args.pool,
[2020-03-26 06:04:20,555] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 53, in wrapper
[2020-03-26 06:04:20,555] {base_task_runner.py:95} INFO - Subtask:     result = func(*args, **kwargs)
[2020-03-26 06:04:20,555] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1384, in run
[2020-03-26 06:04:20,555] {base_task_runner.py:95} INFO - Subtask:     result = task_copy.execute(context=context)
[2020-03-26 06:04:20,555] {base_task_runner.py:95} INFO - Subtask:   File "/usr/local/airflow/plugins/mercury_plugins.py", line 1516, in execute
[2020-03-26 06:04:20,555] {base_task_runner.py:95} INFO - Subtask:     raise e
[2020-03-26 06:04:20,556] {base_task_runner.py:95} INFO - Subtask: googleapiclient.errors.HttpError: <HttpError 400 when requesting https://dataproc.googleapis.com/v1beta2/projects/dh-xxxxxxxx-xxxxxxxx-72410/regions/xxxxxxxx/clusters?alt=json returned "Multiple validation errors:
[2020-03-26 06:04:20,556] {base_task_runner.py:95} INFO - Subtask:  - NodeInitializationAction must specify executable
[2020-03-26 06:04:20,556] {base_task_runner.py:95} INFO - Subtask:  - Object URI '' is not a valid GCS URI">
[2020-03-26 06:04:22,646] {jobs.py:2107} INFO - Task exited with return code 1
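
One possible reading of the two validation errors in the log ("NodeInitializationAction must specify executable" and "Object URI '' is not a valid GCS URI") is that the cluster request body contained an initialization action whose `executableFile` was an empty string. The sketch below is only an illustration under that assumption: it checks a request body shaped like the Dataproc `clusters.create` payload before it is sent. The function name `find_invalid_init_actions`, the cluster name, and the bucket path are hypothetical, and nothing here reflects how the Mercury plugin actually builds its request.

```python
def find_invalid_init_actions(cluster_body):
    """Return (index, uri) pairs for initialization actions without a gs:// URI."""
    # Dataproc expects config.initializationActions to be a list of objects,
    # each with an "executableFile" field holding a Cloud Storage URI.
    actions = cluster_body.get("config", {}).get("initializationActions", [])
    invalid = []
    for index, action in enumerate(actions):
        uri = (action.get("executableFile") or "").strip()
        if not uri.startswith("gs://"):
            invalid.append((index, uri))
    return invalid


# Hypothetical request body: the second action has an empty executableFile,
# which is the kind of entry the API would reject.
body = {
    "clusterName": "example-cluster",
    "config": {
        "initializationActions": [
            {"executableFile": "gs://example-bucket/dataproc/dataproc-init.sh"},
            {"executableFile": ""},
        ]
    },
}

print(find_invalid_init_actions(body))
```

If a check like this reports an entry, the place to look would be wherever the list of init action URIs is assembled in the DAG or plugin configuration, for example a trailing comma or an unset variable that yields an empty string.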