Google cloud platform: Execute certain steps only when previous steps have executed in the same pipeline in Apache Dataflow

Tags: google-cloud-platform, google-cloud-dataflow, dataflow

I want to execute a few steps only after a few initial steps have completed. In my case, I want to execute 3 steps first, and then execute the final 2 steps.

import apache_beam as beam
from apache_beam.io import ReadFromText

with beam.Pipeline(options=pipeline_options) as p1:
    # Steps 1-3: read the CSV, parse each line into a dict, write to the staging table.
    data_csv = p1 | 'Read CSV file' >> ReadFromText(known_args.input_csv_file)
    dict1 = data_csv | 'Format to json' >> beam.ParDo(Split())
    dict1 | 'Write to BigQuery' >> beam.io.WriteToBigQuery(
        known_args.output_stage_bq,
        schema=product_revenue_schema)
    # Steps 4-5: read the staging table back and copy it to another dataset
    # (these currently run in parallel with steps 1-3, not after them).
    fullTable = p1 | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(table_spec))
    fullTable | 'writeToBQ another dataset' >> beam.io.WriteToBigQuery(
        known_args.output_target_bq,
        schema=product_revenue_schema)
I want the last 2 steps to start only once these 3 steps have finished executing.

Expected: one chain: Step1 -> Step2 -> Step3 -> Step4 -> Step5

Actual: two separate chains: 1: Step1 -> Step2 -> Step3; 2: Step4 -> Step5

In the Beam Java SDK, the Wait.on transform is what you need.


In the Beam Python SDK, there is currently no such transform. You should use two separate pipelines and synchronize them manually (e.g., wait for the first pipeline to finish before starting the second, or have the first pipeline send a Pub/Sub message to signal the second that the write has completed).
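
A minimal sketch of the "wait for the first pipeline" approach, reusing the Split DoFn, schemas, and known_args from the question; run().wait_until_finish() blocks until the first pipeline's BigQuery write has completed before the second pipeline is launched:

import apache_beam as beam
from apache_beam.io import ReadFromText

# Pipeline 1: steps 1-3 (read CSV, format, write to the staging table).
p1 = beam.Pipeline(options=pipeline_options)
data_csv = p1 | 'Read CSV file' >> ReadFromText(known_args.input_csv_file)
dict1 = data_csv | 'Format to json' >> beam.ParDo(Split())
dict1 | 'Write to BigQuery' >> beam.io.WriteToBigQuery(
    known_args.output_stage_bq, schema=product_revenue_schema)

# Block here: steps 4-5 must not start until the staging write is done.
p1.run().wait_until_finish()

# Pipeline 2: steps 4-5 (read the staging table, copy it to the other dataset).
p2 = beam.Pipeline(options=pipeline_options)
fullTable = p2 | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(table_spec))
fullTable | 'writeToBQ another dataset' >> beam.io.WriteToBigQuery(
    known_args.output_target_bq, schema=product_revenue_schema)
p2.run().wait_until_finish()

This keeps everything in one Python program; the Pub/Sub alternative decouples the two jobs so they can be deployed and scheduled independently.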

Thanks @ihji, one doubt: does Apache Beam support creating multiple pipelines in Python?