Java: alternative patterns to Runnable or Callable for massive numbers of tasks

java multithreading

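For massively parallel computation I tend to use executors and callables. When I have thousands of objects to compute, I do not feel good about instantiating thousands of Runnables, one for each object.

So I have two approaches to solve this:

I. Split the workload among a small number x of workers, each given a share of the y objects (that is, split the object list into x partitions of size y/x each).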

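public static <V> List<List<V>> partitions(List<V> list, int chunks) {
    final ArrayList<List<V>> lists = new ArrayList<List<V>>();
    final int size = Math.max(1, list.size() / chunks + 1);
    final int listSize = list.size();
    for (int i = 0; i ...

Creating thousands of Runnables (objects that implement Runnable) is no more expensive than creating ordinary objects.
Creating and running thousands of threads can be very heavy, but you can use an executor with a thread pool to solve that problem.
As for a different approach, you might be interested in Java 8 parallel streams.
@Aaron is right, you should consider Java 8 parallel streams.
You can also take a Consumer as a parameter: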
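<V> void processInParallel(List<V> list, int chunks, Consumer<V> processor) {
    // The pool size bounds how many threads work on the list at once
    ForkJoinPool forkJoinPool = new ForkJoinPool(chunks);
    forkJoinPool.submit(() -> {
        // Started from inside the pool, the parallel stream runs on its threads
        list.parallelStream().forEach(item -> processor.accept(item));
    });
}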

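Depending on your needs, note that ForkJoinPool#submit() returns an instance of ForkJoinTask, which is a Future, so you can use it to check the status of the task or to wait for it to finish.
You will most likely want to instantiate the ForkJoinPool only once (rather than on every method call) and then reuse it, so that calling the method many times does not choke the CPU.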
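To make the last two points concrete, here is a minimal sketch of a pool that is created once and reused, with the caller waiting on the returned ForkJoinTask. The class name SharedPoolExample, the pool size of 4 and the summing consumer are assumptions made for illustration. It also relies on the commonly observed behavior that a parallel stream started from inside a ForkJoinPool task runs on that pool's threads, which is an implementation detail rather than a documented guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class SharedPoolExample {
    // Hypothetical shared pool: created once, reused by every call (size 4 is arbitrary).
    private static final ForkJoinPool POOL = new ForkJoinPool(4);

    // Submits the work and hands back the task so the caller can decide whether to wait.
    static <V> ForkJoinTask<?> processInParallel(List<V> list, Consumer<V> processor) {
        return POOL.submit(() -> list.parallelStream().forEach(processor));
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            items.add(i);
        }
        AtomicInteger sum = new AtomicInteger();
        ForkJoinTask<?> task = processInParallel(items, sum::addAndGet);
        task.join(); // block until every item has been processed
        System.out.println(sum.get()); // 500500
    }
}
```

Returning the task instead of void keeps the method non-blocking while still letting callers call join() or isDone() when they need to.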
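Combining the various answers here:
Is creating thousands of Runnables really expensive and to be avoided?
No, not in and of itself. What may prove expensive is how you get them executed (spawning a few thousand threads certainly has its cost).
So you would not want to do this: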
List<Computation> computations = ...
List<Thread> threads = new ArrayList<>();
for (Computation computation : computations) {
    Thread thread = new Thread(new Computation(computation));
    threads.add(thread);
    thread.start();
}
// If you need to wait for completion:
for (Thread t : threads) {
    t.join();
}

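Submitting the computations to an ExecutorService backed by a fixed-size thread pool instead reuses a limited number of threads, which should yield a very nice improvement.
Is there a generic pattern/recommendation for solution II?
There is. First of all, your partitioning code (going from a List to a List of Lists) can be found in collection libraries such as Guava, with more generic and fail-safe implementations.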
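For readers who want to see the idea without adding a dependency, here is a minimal plain-Java sketch of such a partitioning helper. The class and method names are hypothetical, and unlike the question's version it takes a chunk size rather than a chunk count. Guava offers Lists.partition(list, size) for the same purpose.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionExample {
    // Hypothetical helper: splits a list into consecutive views of at most chunkSize elements.
    static <V> List<List<V>> partition(List<V> list, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<V>> chunks = new ArrayList<>();
        int from = 0;
        while (from < list.size()) {
            int to = from + Math.min(chunkSize, list.size() - from);
            chunks.add(list.subList(from, to)); // a view on the original list, not a copy
            from = to;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add(i);
        }
        System.out.println(partition(items, 4)); // [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
    }
}
```

Because subList returns views, the original list should not be structurally modified while the chunks are still in use.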
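But beyond that, two patterns come to mind for what you are trying to achieve:
Use a Fork/Join pool with Fork/Join tasks (that is, spawn one task with the whole list of items, and have each task spawn subtasks with half of its list until each task manages a small enough list of items). This is the divide-and-conquer approach.
If your computation were "add up the integers in a list", it could look like this (there might be a boundary bug, I did not really check):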
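import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;

public static class Adder extends RecursiveTask<Integer> {
    protected List<Integer> globalList;
    protected int start;
    protected int stop;

    public Adder(List<Integer> globalList, int start, int stop) {
        super();
        this.globalList = globalList;
        this.start = start;
        this.stop = stop;
        System.out.println("Creating for " + start + " => " + stop);
    }

    @Override
    protected Integer compute() {
        if (stop - start > 1000) {
            // Too many elements: split the range in half
            int middle = start + (stop - start) / 2;
            Adder subTask1 = new Adder(globalList, start, middle);
            Adder subTask2 = new Adder(globalList, middle, stop);
            subTask2.fork(); // run the second half asynchronously
            return subTask1.compute() + subTask2.join();
        } else {
            // Manageable number of elements: sum them in place
            int result = 0;
            for (int i = start; i < stop; i++) {
                result += globalList.get(i);
            }
            return result;
        }
    }
}

public void doWork() throws Exception {
    List<Integer> computation = new ArrayList<>();
    for (int i = 0; i < 10000; i++) {
        computation.add(i);
    }
    ForkJoinPool pool = new ForkJoinPool();

    RecursiveTask<Integer> masterTask = new Adder(computation, 0, computation.size());
    Future<Integer> future = pool.submit(masterTask);
    System.out.println(future.get());
}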

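Use Java 8 parallel streams to launch multiple parallel computations easily (under the hood, Java parallel streams can fall back on the Fork/Join pool).
Others have already shown what this might look like.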
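For comparison with the Adder task, the same "add up the integers in a list" computation can be sketched with a parallel stream. The class name ParallelSumExample is an assumption for illustration. Splitting the list and combining the partial sums is left to the stream implementation.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelSumExample {
    // Sums the list in parallel; the stream decides how to split the work.
    static int sum(List<Integer> values) {
        return values.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // Same input as the Adder example: the integers 0..9999
        List<Integer> values = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
        System.out.println(sum(values)); // 49995000
    }
}
```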
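Are you aware of a different approach?
For a different take on concurrent programming (without explicit task/thread handling), have a look at the actor pattern.
Akka comes to mind as a popular implementation of this pattern.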
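Is creating thousands of Runnables really expensive and to be avoided?
Not at all. The Runnable and Callable interfaces each have only one method to implement, and the amount of "extra" code in each task depends on the code you are running. That is certainly no fault of the Runnable/Callable interfaces.
Is there a generic pattern/recommendation for solution II?
Pattern 2 is more favorable than pattern 1, because pattern 1 assumes that every worker will finish at exactly the same time. If some workers finish before the others, they may simply sit idle, since each can only work on the queue of size y/x that you assigned to it. In pattern 2, however, you will never have idle worker threads (unless the end of the work queue is reached and numWorkItems ...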
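An easy way to use the preferred pattern, pattern 2, is ExecutorService's invokeAll(Collection<? extends Callable<T>> tasks).
I don't know much about parallel computation, but Runnable is an interface, just as Callable is an interface. Creating thousands of instances of one is no more or less expensive than creating thousands of instances of the other. What you do not want to do (though, judging by your question, I guess you already are not) is create and destroy thousands of threads. Creating a thread is expensive. You should use some approach (ExecutorService, fork/join, parallel streams) that reuses a small number of threads thousands of times.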
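To illustrate that last point, here is a minimal sketch in which thousands of small Callables run on a handful of pooled threads. The class name ThreadReuseExample, the task body and the pool size are assumptions for illustration. The method reports how many distinct threads actually executed the tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadReuseExample {
    // Runs taskCount trivial tasks on poolSize threads and returns how many distinct threads ran them.
    static int distinctThreads(int taskCount, int poolSize) throws Exception {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            final int n = i;
            tasks.add(() -> {
                threadNames.add(Thread.currentThread().getName());
                return n * n; // stand-in for a real computation
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // invokeAll blocks until every task has completed
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                future.get(); // rethrows any failure from the task
            }
        } finally {
            pool.shutdown();
        }
        return threadNames.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(distinctThreads(5000, 4) + " thread(s) ran 5000 tasks");
    }
}
```

The task objects themselves are cheap. The thread count stays bounded by the pool size no matter how many tasks are submitted.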
<V> void processInParallel(List<V> list, int chunks) {
    ForkJoinPool forkJoinPool = new ForkJoinPool(chunks);
    forkJoinPool.submit(() -> {
        list.parallelStream().forEach(item -> {
            // do something with each item
        });
    });
}

<V> void processInParallel(List<V> list, int chunks, Consumer<V> processor) {
    new ForkJoinPool(chunks).submit(() -> list.parallelStream().forEach(processor));
}

processInParallel(myList, 2, item -> {
    // do something with each item
});

List<Computation> computations = ...
// A fixed-size pool reuses the same someNumber threads for every computation
ExecutorService pool = Executors.newFixedThreadPool(someNumber);
List<Future<Result>> results = new ArrayList<>();
for (Computation computation : computations) {
    results.add(pool.submit(new ComputationCallable(computation)));
}
for (Future<Result> result : results) {
    doSomething(result.get()); // blocks until this computation is done
}
pool.shutdown();

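List<Callable<Object>> workList = ... // a single list of all of your work
// A bounded pool keeps the thread count small; a cached pool could start one thread per task
ExecutorService es = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
es.invokeAll(workList); // blocks until every task has completed
es.shutdown();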