Amazon S3: execute multiple downloads and wait for all of them to complete

amazon-s3 java-8

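I am currently developing an API service that lets one or more users download one or more items from an S3 bucket and returns the contents to the user. The downloads work fine, but downloading several files takes almost 100-150 ms * the number of files.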

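I have tried several approaches to speed this up: parallelStream() instead of stream() (which is risky given the number of simultaneous downloads), CompletableFutures, and even creating an ExecutorService, doing the downloads, and then shutting the pool down. Typically I only want a few concurrent tasks per request, e.g. 5, to try to keep the number of active threads down.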

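I have also tried integrating Spring @Cacheable to store the downloaded files in Redis (the files are read-only). This does reduce response times (a few milliseconds to retrieve a file instead of 100-150 ms), but the benefit only applies once a file has already been retrieved.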

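Given that I don't want (or don't think I can have) hundreds of threads opening HTTP connections and downloading at the same time, what is the best way to wait for multiple asynchronous tasks to complete and then get their results?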

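Your concern is about tying up the common fork/join pool that parallel streams use by default, since I believe it is also used for other things outside the Stream API, such as sort operations. Rather than saturating the common fork/join pool with an I/O-bound parallel stream, you can create your own fork/join pool for the stream. See the linked reference for how to create an ad hoc pool of the desired size and run a parallel stream in it.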
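As a rough illustration of the dedicated fork/join pool idea, here is a minimal sketch. It assumes a hypothetical `fakeDownload` method standing in for the real S3 call, and the class and method names are made up for this example. Note that a parallel stream running in the pool it was submitted from is long-standing JDK behavior rather than a documented guarantee.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

class CustomPoolDemo {
    // Hypothetical stand-in for an S3 download: sleeps briefly, then returns a fake local path.
    static String fakeDownload(String key) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "/tmp/" + key;
    }

    static List<String> downloadAll(List<String> keys, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // Submitting the whole pipeline as one task makes the stream's
            // subtasks run in this pool instead of the common pool.
            return pool.submit(() ->
                    keys.parallelStream()
                        .map(CustomPoolDemo::fakeDownload)
                        .collect(Collectors.toList())
            ).join(); // blocks until the whole stream has been processed
        } finally {
            pool.shutdown(); // release the pool's threads once the work is done
        }
    }

    public static void main(String[] args) {
        List<String> paths = downloadAll(Arrays.asList("a.txt", "b.txt", "c.txt"), 5);
        System.out.println(paths);
    }
}
```

The `parallelism` argument caps how many downloads run at once for this call, independently of whatever else is using the common pool.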

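You can also create an ExecutorService backed by a fixed-size thread pool. It is likewise independent of the common fork/join pool, and it throttles the requests because only the threads in the pool are used. It also lets you specify how many threads to dedicate: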

// needs java.util.List, java.nio.file.Path, java.util.stream.Collectors and java.util.concurrent.*
ExecutorService executor = Executors.newFixedThreadPool(MAX_THREADS_FOR_DOWNLOADS);
try {
    List<CompletableFuture<Path>> downloadTasks = s3Paths
            .stream()
            .map(s3Path -> CompletableFuture.supplyAsync(() -> mys3Downloader.downloadAndGetPath(s3Path), executor))
            .collect(Collectors.toList());

    // at this point, all requests are enqueued, and threads will be assigned as they become available

    executor.shutdown();    // stops accepting requests, does not interrupt threads,
                            // items in queue will still get threads when available

    // wait for all downloads to complete
    CompletableFuture.allOf(downloadTasks.toArray(new CompletableFuture[0])).join();

    // at this point, all downloads are finished,
    // so it's safe to shut down executor completely

} catch (CompletionException e) {
    // join() reports a failed download as an unchecked CompletionException
    e.printStackTrace();
} finally {
    executor.shutdownNow(); // important to call this when you're done with the executor.
}
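To see the throttling effect in isolation, the following self-contained sketch swaps the S3 call for a hypothetical `fakeDownload` that sleeps for 100 ms and records how many calls are in flight at once. All names here are illustrative assumptions. With 20 files and 5 threads, the total time should come out at roughly 4 x 100 ms rather than 20 x 100 ms, and the peak concurrency should never exceed the pool size.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class BoundedDownloadDemo {
    static final AtomicInteger active = new AtomicInteger();
    static final AtomicInteger peak = new AtomicInteger();

    // Hypothetical stand-in for a 100 ms S3 download; tracks how many run concurrently.
    static String fakeDownload(int id) {
        int now = active.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            active.decrementAndGet();
        }
        return "file-" + id;
    }

    static List<String> downloadAll(int fileCount, int maxThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(maxThreads);
        try {
            // Enqueue every download first; at most maxThreads of them run at a time.
            List<CompletableFuture<String>> tasks = IntStream.range(0, fileCount)
                    .mapToObj(id -> CompletableFuture.supplyAsync(() -> fakeDownload(id), executor))
                    .collect(Collectors.toList());
            // Join in submission order; this blocks until all downloads have finished.
            return tasks.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> files = downloadAll(20, 5);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(files.size() + " files in ~" + elapsedMs
                + " ms, peak concurrency " + peak.get());
    }
}
```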

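Following @Hank D's lead, you can encapsulate the creation of the executor service to make sure that ExecutorService::shutdownNow really is called once you are done with the executor: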

private static <VALUE> VALUE execute(
  final int nThreads,
  final Function<ExecutorService, VALUE> function
) {
  final ExecutorService executorService = Executors.newFixedThreadPool(nThreads);
  try {
    return function.apply(executorService);
  } finally {
    executorService.shutdownNow(); // important to call this when you're done with the executor service.
  }
}

public static void main(final String... arguments) {
  // define variables
  final List<Path> downloadedPaths = execute(
    MAX_THREADS_FOR_DOWNLOADS,
    executor -> s3Paths
      .stream()
      .map(s3Path -> CompletableFuture.supplyAsync(
        () -> mys3Downloader.downloadAndGetPath(s3Path),
        executor
      ))
      .collect(Collectors.toList()) // enqueue every download before waiting on any of them
      .stream()
      .map(CompletableFuture::join) // wait here: shutdownNow() would discard downloads still in the queue
      .collect(Collectors.toList())
  );
  // use downloadedPaths
}
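One thing to keep in mind with a helper like this: shutdownNow() interrupts running tasks and removes queued tasks without running them, so the results need to be awaited before the helper returns. The sketch below illustrates that behavior with a single-thread pool; the class name, method names, and task bodies are hypothetical and exist only for this demonstration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ShutdownNowDemo {
    // Waits on a latch, restoring the interrupt flag if interrupted.
    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns how many queued tasks shutdownNow() removed without running them.
    static int discardedWhenNotWaiting() {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch never = new CountDownLatch(1);

        // The first task occupies the only thread until it is interrupted.
        CompletableFuture.supplyAsync(() -> {
            started.countDown();
            awaitQuietly(never);
            return "first";
        }, executor);
        // These two can only wait in the queue behind it.
        CompletableFuture<String> second = CompletableFuture.supplyAsync(() -> "second", executor);
        CompletableFuture<String> third = CompletableFuture.supplyAsync(() -> "third", executor);

        awaitQuietly(started);                             // make sure the first task is running
        List<Runnable> neverRun = executor.shutdownNow();  // drains the queue, interrupts the worker
        // The futures behind the discarded tasks are left incomplete.
        System.out.println("second done? " + second.isDone() + ", third done? " + third.isDone());
        return neverRun.size();
    }

    public static void main(String[] args) {
        System.out.println("discarded tasks: " + discardedWhenNotWaiting());
    }
}
```

A caller that later joined one of those abandoned futures would block indefinitely, which is why waiting inside the helper (or using shutdown() followed by awaitTermination) is the safer design.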

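I think a combination of these two approaches is the best way to go. I also like @srborlogan's idea of encapsulating the shutdown in a helper function: it definitely makes the code look neater, and it is potentially reusable.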