Spring Batch Admin running a maximum of 8 threads for a remote partitioning step even though concurrency is 10?

Tags: spring, spring-batch, spring-integration, spring-batch-admin

I am using Spring Batch remote partitioning for my batch processing, and I launch the jobs using Spring Batch Admin.

I have set the inbound gateway consumer concurrency to 10, but the maximum number of partitions running in parallel is 8.

I want to increase the consumer concurrency to 15 later on.

Below is my configuration:

<task:executor id="taskExecutor" pool-size="50" />

<rabbit:template id="computeAmqpTemplate"
    connection-factory="rabbitConnectionFactory" routing-key="computeQueue"
    reply-timeout="${compute.partition.timeout}">
</rabbit:template>

<int:channel id="computeOutboundChannel">
    <int:dispatcher task-executor="taskExecutor" />
</int:channel>

<int:channel id="computeInboundStagingChannel" />

<amqp:outbound-gateway request-channel="computeOutboundChannel"
    reply-channel="computeInboundStagingChannel" amqp-template="computeAmqpTemplate"
    mapped-request-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS"
    mapped-reply-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS" />


<beans:bean id="computeMessagingTemplate"
    class="org.springframework.integration.core.MessagingTemplate"
    p:defaultChannel-ref="computeOutboundChannel"
    p:receiveTimeout="${compute.partition.timeout}" />


<beans:bean id="computePartitionHandler"
    class="org.springframework.batch.integration.partition.MessageChannelPartitionHandler"
    p:stepName="computeStep" p:gridSize="${compute.grid.size}"
    p:messagingOperations-ref="computeMessagingTemplate" />

<int:aggregator ref="computePartitionHandler"
    send-partial-result-on-expiry="true" send-timeout="${compute.step.timeout}"
    input-channel="computeInboundStagingChannel" />

<amqp:inbound-gateway concurrent-consumers="${compute.consumer.concurrency}"
    request-channel="computeInboundChannel" 
    reply-channel="computeOutboundStagingChannel" queue-names="computeQueue"
    connection-factory="rabbitConnectionFactory"
    mapped-request-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS"
    mapped-reply-headers="correlationId, sequenceNumber, sequenceSize, STANDARD_REQUEST_HEADERS" />


<int:channel id="computeInboundChannel" />

<int:service-activator ref="stepExecutionRequestHandler"
    input-channel="computeInboundChannel" output-channel="computeOutboundStagingChannel" />

<int:channel id="computeOutboundStagingChannel" />

<beans:bean id="computePartitioner"
    class="org.springframework.batch.core.partition.support.MultiResourcePartitioner"
    p:resources="file:${spring.tmp.batch.dir}/#{jobParameters[batch_id]}/shares_rics/shares_rics_*.txt"
    scope="step" />



<beans:bean id="computeFileItemReader"
    class="org.springframework.batch.item.file.FlatFileItemReader"
    p:resource="#{stepExecutionContext[fileName]}" p:lineMapper-ref="stLineMapper"
    scope="step" />

<beans:bean id="computeItemWriter"
    class="com.st.batch.foundation.writers.ComputeItemWriter"
    p:symfony-ref="symfonyStepScoped" p:timeout="${compute.item.timeout}"
    p:batchId="#{jobParameters[batch_id]}" scope="step" />


<step id="computeStep">
    <tasklet transaction-manager="transactionManager">
        <chunk reader="computeFileItemReader" writer="computeItemWriter"
            commit-interval="${compute.commit.interval}" />
    </tasklet>
</step>

<flow id="computeFlow">
    <step id="computeStep.master">
        <partition partitioner="computePartitioner"
            handler="computePartitionHandler" />
    </step>
</flow>

<job id="computeJob" restartable="true">
    <flow id="computeJob.computeFlow" parent="computeFlow" />
</job>



compute.grid.size = 112
compute.consumer.concurrency = 10

Input files are split into 112 equal parts = compute.grid.size = total number of partitions

Number of servers = 4.

There are two issues:

i) Even though I have set the concurrency to 10, the maximum number of threads running is 8.

ii) Some executions run slower and some faster, so I want to make sure step executions are distributed fairly: if the faster servers finish their executions, the remaining executions waiting in the queue should go to them. They should not be distributed to each server in round-robin fashion.

I know RabbitMQ has a prefetch-count setting and an ack mode to control how work is distributed. With Spring Integration, the prefetch count defaults to 1 and the ack mode defaults to AUTO. Still, some servers keep running more partitions even after other servers have been finished for a long time. Ideally, no server should sit idle.
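If you want to rely on explicit settings rather than the defaults, both can be declared directly on the gateway. This is a sketch based on the gateway definition above; `prefetch-count` and `acknowledge-mode` are standard listener-container attributes in Spring Integration AMQP:

```xml
<!-- Sketch, not from the original post: with prefetch-count="1" each consumer
     holds only one unacked message at a time, so remaining partition requests
     stay in the queue and can be picked up by idle consumers on other servers. -->
<amqp:inbound-gateway concurrent-consumers="${compute.consumer.concurrency}"
    prefetch-count="1" acknowledge-mode="AUTO"
    request-channel="computeInboundChannel"
    reply-channel="computeOutboundStagingChannel" queue-names="computeQueue"
    connection-factory="rabbitConnectionFactory" />
```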

Update:

One more thing I have observed now: for some steps that run in parallel using a split (not distributed via remote partitioning), a maximum of 8 also run in parallel. It looks like a thread-pool limit issue, but as you can see the taskExecutor has a pool size of 50.

Is there anything in Spring Batch / Spring Batch Admin that limits the number of concurrently running steps?
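One thing worth double-checking (a sketch, not a confirmed cause; the step ids below are hypothetical): a `<split>` runs its flows in parallel only through the task executor it references, so its parallelism is bounded by that executor's pool size:

```xml
<!-- Hypothetical step ids, for illustration only. Point the split at an
     executor whose pool is at least as large as the number of flows. -->
<split id="parallelFlows" task-executor="taskExecutor">
    <flow>
        <step id="stepA" parent="computeStep" />
    </flow>
    <flow>
        <step id="stepB" parent="computeStep" />
    </flow>
</split>
```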

Second update:

Also, Spring Batch Admin does not load if 8 or more threads are running and processing items in parallel; it just hangs. If I lower the concurrency, Spring Batch Admin loads. I even tested with a concurrency of 4 on one server and 8 on another: Spring Batch Admin did not load using the URL of the server running 8 threads, but it worked on the server running 4 threads.

The Spring Batch Admin manager has the following jobLauncher configuration:

<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor" ref="jobLauncherTaskExecutor" />
</bean>

<task:executor id="jobLauncherTaskExecutor" pool-size="6" rejection-policy="ABORT" />

The pool size there is 6; is that related to the issue above?

Or is there something in Tomcat 7 that limits the number of running threads to 8?
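As far as I can tell, that pool only caps how many jobs can be launched concurrently (a seventh simultaneous launch would be rejected under `rejection-policy="ABORT"`); it does not bound the threads inside a single partitioned step. If several jobs are launched at once, it could be raised, for example (illustrative value):

```xml
<!-- Illustrative sketch: raises the cap on concurrent job launches only. -->
<task:executor id="jobLauncherTaskExecutor" pool-size="16" rejection-policy="ABORT" />
```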

Confused - you say "I have set the concurrency to 10" but then show
compute.consumer.concurrency=8
. So it is working as configured. It is simply not possible to have only 8 consumer threads if that property is set to 10.

From Rabbit's perspective, all consumers are equal - if there are 10 consumers on a slow box and 10 on a fast box, and you only have 10 partitions, it is possible that all 10 partitions end up on the slow box.

RabbitMQ does not distribute work across servers; it only distributes it across consumers.


You will likely get better distribution by reducing the concurrency. You should also set a lower concurrency on the slower boxes.

Are you using a database for the JobRepository?

During execution, the batch framework persists step executions, and the number of connections to the JobRepository database may interfere with parallel step executions.


A concurrency limit of 8 makes me think you might be using a
BasicDataSource
? If so, switch to something like
DriverManagerDataSource
and see.
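For what it's worth, Commons DBCP's BasicDataSource defaults maxActive to 8, which matches the observed cap: each concurrent step execution needs a JobRepository connection to persist its state, so a ninth thread would block waiting on the pool. An alternative to switching data sources is raising the pool size (a sketch; the JDBC property placeholders are hypothetical):

```xml
<!-- Sketch with hypothetical placeholders. maxActive defaults to 8 in DBCP;
     set it above the total step concurrency you expect on this server. -->
<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource"
    p:driverClassName="${jdbc.driver}" p:url="${jdbc.url}"
    p:username="${jdbc.username}" p:password="${jdbc.password}"
    p:maxActive="30" />
```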

Sorry, that was a mistake in the question; the value in my configuration is actually 10. Yes, I do set the concurrency lower. But ideally, if the consumers on the slower server are busy, consumers on the other servers should receive the messages. It looks like the consumers on the slower server keep receiving messages even while the consumers on the faster servers sit idle. I have added more information about the Spring Batch / Spring Batch Admin issue to the question.

If
concurrent-consumers
is 10, there will be 10 threads. Period. There is nothing in SI/SB that limits it to 8. As I said, Rabbit has no idea whether the next consumer is on a busy or an idle server. It sounds like you have too many consumers for your needs. If some batch jobs need more partitions than others, consider using different configurations.

Ideally it should run 10, but it does not. I have added one more observation.

Hi Vishal, I am facing the same issue. Was it ever resolved? If so, may I know your solution?