Google Cloud Dataflow: DataflowRunner keeps logging "Refreshing due to a 401"
I am running a pipeline on DataflowRunner (Google Cloud Dataflow SDK for Python 0.5.5). The pipeline:
(p
 | 'Read trip from BigQuery' >> beam.io.Read(beam.io.BigQuerySource(query=known_args.input))
 | 'Convert' >> beam.Map(lambda row: (row['HardwareId'], row))
 | 'Group devices' >> beam.GroupByKey()
 | 'Pull way info from mapserver' >> beam.FlatMap(get_osm_way)
 | 'Map way info to dictionary' >> beam.FlatMap(convert_to_dict)
 | 'Save to BQ' >> beam.io.Write(beam.io.BigQuerySink(
     known_args.output, schema=schema_string,
     create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
     write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE))
)
The job is set to autoscale, and the runner scaled up to 15 workers.
More detailed code:
After running for about 2 hours, it reported:
19:41:19.908
Attempting refresh to obtain initial access_token
{
insertId: "jf9yr4g1sv0qku"
jsonPayload: {
message: "Attempting refresh to obtain initial access_token"
worker: "beamapp-root-0216221014-5-02161410-29cb-harness-xqx2"
logger: "oauth2client.client:client.py:new_request"
thread: "110:140052132222720"
job: "2017-02-16_14_10_18-17481182243152998182"
}
resource: {…}
timestamp: "2017-02-17T00:41:19.908143997Z"
severity: "INFO"
labels: {…}
logName: "projects/fiona-zhao/logs/dataflow.googleapis.com%2Fworker"
}
It then started continuously reporting "Refreshing due to a 401". One of these entries:
21:45:12.886
Refreshing due to a 401 (attempt 1/2)
{
insertId: "zsorfgg1urhvty"
jsonPayload: {
worker: "beamapp-root-0216221014-5-02161410-29cb-harness-xqx2"
logger: "oauth2client.client:client.py:new_request"
thread: "110:140052273633024"
job: "2017-02-16_14_10_18-17481182243152998182"
message: "Refreshing due to a 401 (attempt 1/2)"
}
resource: {…}
timestamp: "2017-02-17T02:45:12.886137962Z"
severity: "INFO"
labels: {
compute.googleapis.com/resource_name: "dataflow-beamapp-root-0216221014-5-02161410-29cb-harness-xqx2"
dataflow.googleapis.com/job_id: "2017-02-16_14_10_18-17481182243152998182"
dataflow.googleapis.com/job_name: "beamapp-root-0216221014-530646"
dataflow.googleapis.com/region: "global"
compute.googleapis.com/resource_type: "instance"
compute.googleapis.com/resource_id: "2301951363070532306"
}
logName: "projects/fiona-zhao/logs/dataflow.googleapis.com%2Fworker"
}
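What can I do?

These log messages are a normal part of execution and do not, by themselves, indicate an error. My suggestion is to add extra logging to debug the external API call or execution step that is hanging.
While we cannot comment on the execution details of a specific job in this open forum, the Cloud Dataflow team can provide further support on the dataflow-feedback@google.com mailing list.

Have you run this job successfully before? It does look like it has been running for a long time. Is that unusual?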
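
The advice to add extra logging around the external call could look roughly like the sketch below. It uses only the Python standard library; log_call_duration and fetch_way_info are hypothetical names introduced for illustration, and fetch_way_info merely stands in for whatever request get_osm_way sends to the map server (that code is not shown in the question).

```python
import functools
import logging
import time


def log_call_duration(fn):
    """Log when fn starts, when it finishes or fails, and how long it took."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.time()
        logging.info('Starting %s', fn.__name__)
        try:
            result = fn(*args, **kwargs)
        except Exception:
            # logging.exception records the traceback along with the message.
            logging.exception('%s failed after %.1f s',
                              fn.__name__, time.time() - start)
            raise
        logging.info('Finished %s in %.1f s', fn.__name__, time.time() - start)
        return result
    return wrapper


@log_call_duration
def fetch_way_info(hardware_id):
    # Hypothetical stand-in for the external map server request.
    return {'HardwareId': hardware_id, 'way': None}
```

A "Starting" entry with no matching "Finished" entry in the worker logs would point at a call that never returned. Note that wrapping a generator function this way only times the creation of the generator, so the decorator is best applied to the function that actually performs the request. If the HTTP client in use accepts a timeout, setting one may also help turn a silent hang into a visible error.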