Google Cloud Dataflow - error when assigning a custom timestamp
I am trying to assign a custom timestamp and check how allowed lateness works. When I run the code below with the InteractiveRunner it works fine, but when I switch to the DataflowRunner it starts throwing an error at this step:
Map(lambda x: window.TimestampedValue(x, x["timestamp"]))
The input data is the same in both cases, i.e. {'name': 'rou', 'score': 50, 'timestamp': 1618295060}. In the Dataflow UI I can see that there is an error, but not the error details. I have included logging and exception handling, and I don't understand why the error is not being logged.
At first glance, it looks like you are sending the timestamp field as a string, whereas it should be an integer/float. Also, you get a better view of the logs by going directly to Cloud Logging or by clicking on the step itself. This looks like a bug. Have you tried opening the Logs Explorer to look for error logs? Have you checked the worker logs?
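If the string type is indeed the problem, one option is to coerce the field to a number right after decoding the Pub/Sub message, before it is handed to window.TimestampedValue (which expects Unix seconds as a numeric value). A minimal stdlib-only sketch, assuming messages arrive as JSON bytes with the shape shown in the question (the function name is made up for illustration):

```python
import json

def parse_with_numeric_timestamp(message: bytes) -> dict:
    """Decode a Pub/Sub payload and coerce the timestamp field to float.

    window.TimestampedValue expects a numeric Unix-seconds timestamp, so a
    value published as the string "1618295060" must be cast before use.
    """
    record = json.loads(message)
    record["timestamp"] = float(record["timestamp"])
    return record

# The same element is produced whether timestamp arrives as int or string.
print(parse_with_numeric_timestamp(
    b'{"name": "rou", "score": 50, "timestamp": "1618295060"}'))
```

In the pipeline this would replace the bare `beam.Map(json.loads)` in the "To Dict" step, so the "with timestamp" step always receives a float.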
class BuildRecordFn(beam.DoFn):
    """Attach the end of the current window to each (name, score) pair."""

    def __init__(self):
        super(BuildRecordFn, self).__init__()

    def process(self, s, window=beam.DoFn.WindowParam):
        # window_start = window.start.to_utc_datetime()
        window_end = window.end.to_utc_datetime()
        return [dict(name=s[0], score=s[1], timestamp=str(window_end))]
windowed_words = (
    words_source
    | "read" >> beam.io.ReadFromPubSub(topic="projects/{}/topics/beambasics".format(project))
    | "To Dict" >> beam.Map(json.loads)
    | "with timestamp" >> Map(lambda x: window.TimestampedValue(x, x["timestamp"]))
    | "Map" >> Map(lambda x: (x['name'], x['score']))
    | "window" >> beam.WindowInto(window.FixedWindows(60),
                                  # trigger=Repeatedly(AfterProcessingTime(1 * 10)),
                                  # accumulation_mode=AccumulationMode.ACCUMULATING,
                                  allowed_lateness=Duration(seconds=1 * 50))
    | "Group" >> CombinePerKey(sum)
    | "convert to dict" >> ParDo(BuildRecordFn())
    | "Write To BigQuery" >> WriteToBigQuery(table=table, schema=schema,
                                             create_disposition=BigQueryDisposition.CREATE_IF_NEEDED,
                                             write_disposition=BigQueryDisposition.WRITE_APPEND)
)
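For reference, the conversion that BuildRecordFn performs with `window.end.to_utc_datetime()` can be reproduced with the standard library alone; Beam window boundaries are Unix-seconds timestamps, and `to_utc_datetime()` yields a naive UTC datetime. A stdlib-only sketch of the equivalent formatting (the function name is made up for illustration):

```python
from datetime import datetime, timezone

def window_end_to_string(end_seconds: float) -> str:
    """Format a window-end Unix timestamp (seconds) as a UTC string,
    mirroring str(window.end.to_utc_datetime()) in the DoFn above."""
    return datetime.fromtimestamp(end_seconds, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S")

print(window_end_to_string(1618295100))
```

Note that this produces a string; if the BigQuery column is of type TIMESTAMP, this format is accepted, but the schema passed to WriteToBigQuery must match.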