How to limit the number of DoFn threads in a streaming job (Apache Beam, Dataflow backend, Python SDK)

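I am running a streaming job in Apache Beam (Dataflow backend, Python SDK) and have run into a problem with a very large number of parallel workers.

Initializing SDKHarness with unbounded number of workers.

Beam appears to spawn hundreds of DoFn instances within a few seconds, starting from a single VM/worker.

I could not find anywhere in the source code where this "unbounded" number can be limited.

I need to limit them because I make external calls in process() and setup(), and I need to reduce the outgoing RPS.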

If you are using Runner v2, which is enabled with:

--experiments=use_runner_v2

you can set the number of threads per process with the following option:

--number_of_worker_harness_threads
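
As a minimal sketch of how the two flags above might be passed together, assuming the pipeline builds its options from a list of command-line style arguments. The helper name `dataflow_thread_flags` and the value 4 are illustrative and not part of the answer; `--runner=DataflowRunner` and `--streaming` are added only because the question describes a streaming job on Dataflow.

```python
def dataflow_thread_flags(threads_per_process):
    """Return the flags from the answer as a list of CLI-style arguments.

    Hypothetical helper: the name and the validation are illustrative only.
    """
    if threads_per_process < 1:
        raise ValueError("threads_per_process must be at least 1")
    return [
        "--runner=DataflowRunner",      # assumed from the question (Dataflow backend)
        "--streaming",                  # assumed from the question (streaming job)
        "--experiments=use_runner_v2",  # flag quoted in the answer
        f"--number_of_worker_harness_threads={threads_per_process}",
    ]


flags = dataflow_thread_flags(4)
print(" ".join(flags))

# With apache_beam installed, the list can be handed to the pipeline options:
#   from apache_beam.options.pipeline_options import PipelineOptions
#   options = PipelineOptions(flags)
```

Whether this setting actually caps the number of DoFn instances in streaming should be checked against the running job, for example by counting setup() calls in the worker logs.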

I would say this variable is now used for the number of worker threads, not for the hundreds of threads seen in streaming:

    if self.debug_options.number_of_worker_harness_threads:
        pool.numThreadsPerWorker = (
            self.debug_options.number_of_worker_harness_threads)

numThreadsPerWorker: The number of threads per worker harness. If empty or unspecified, the service will choose a number of threads (according to the number of cores on the selected machine type for batch, or 1 by convention for streaming).
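
If the thread option does not bound the DoFn instances in streaming, one alternative is to cap the external calls themselves rather than the threads. The following is a rough sketch of that different technique, not something proposed in the thread: a module-level limiter shared by every DoFn instance in the same Python process, combining a semaphore (maximum concurrent calls) with simple call spacing (maximum calls per second). The class name `CallLimiter`, the `LIMITER` instance, and the numbers are all hypothetical.

```python
import threading
import time


class CallLimiter:
    """Caps concurrent calls and spaces call starts to at most max_rps per second."""

    def __init__(self, max_concurrent, max_rps):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._min_interval = 1.0 / max_rps
        self._lock = threading.Lock()
        self._next_allowed = 0.0  # monotonic time at which the next call may start

    def call(self, fn, *args, **kwargs):
        with self._slots:  # blocks while max_concurrent calls are in flight
            with self._lock:
                # Reserve the next start time so callers are spaced evenly.
                now = time.monotonic()
                start = max(now, self._next_allowed)
                self._next_allowed = start + self._min_interval
            delay = start - now
            if delay > 0:
                time.sleep(delay)
            return fn(*args, **kwargs)


# One instance per Python process, shared by all DoFn instances and threads in it.
LIMITER = CallLimiter(max_concurrent=2, max_rps=50)
```

Inside process() or setup(), the external call would then be wrapped as `LIMITER.call(some_client_function, element)`, where `some_client_function` stands for whatever performs the request. The limit applies per SDK process only, so the total outgoing rate still scales with the number of processes and VMs in the job.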