Is a semaphore the best way to limit the number of prefetcher tasks in Java?

java multithreading concurrency

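Background:

I'm designing a service that stores files as a series of chunks (each file can be any number of bytes long and is split into chunks of roughly equal size). One requirement is that the API I'm building must be able to look up a file, reassemble its chunks in the correct order, and emit them to an output stream. To conserve resources while still providing some buffering for performance, I decided to implement this with a configurable number of prefetcher tasks that run in a cached thread pool.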

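What I've done so far:

To control the number of threads, I ended up using a semaphore initialized to the target concurrency level. That is, if I want to run 3 prefetcher tasks, I initialize the semaphore with an initial value of 3. There is an additional control mechanism that preserves chunk order when the chunks are written to the output stream, but that is outside the scope of my question.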
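For readers who want the pattern in isolation, here is a minimal sketch of semaphore-bounded submission to a cached pool. It is not the asker's code: the class name `SemaphoreLimitSketch`, the method `runLimited`, and the 5 ms sleep standing in for a blocking fetch are all assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class SemaphoreLimitSketch {

    /**
     * Runs nTasks on a cached pool with at most `limit` of them in flight,
     * and returns the highest concurrency that was observed.
     */
    static int runLimited(int limit, int nTasks) throws InterruptedException {
        // Hypothetical stand-in for a shared cached pool.
        ExecutorService pool = Executors.newCachedThreadPool();
        Semaphore permits = new Semaphore(limit);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            for (int i = 0; i < nTasks; i++) {
                permits.acquire(); // blocks while `limit` tasks are in flight
                try {
                    pool.submit(() -> {
                        try {
                            int now = running.incrementAndGet();
                            peak.accumulateAndGet(now, Math::max);
                            Thread.sleep(5); // placeholder for a blocking fetch
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        } finally {
                            running.decrementAndGet();
                            permits.release(); // always hand the permit back
                        }
                    });
                } catch (RuntimeException e) {
                    permits.release(); // submission failed, so no task will release it
                    throw e;
                }
            }
            permits.acquire(limit); // returns only once every task has released
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrency: " + runLimited(3, 10));
    }
}
```

Releasing the permit in a `finally` block matters here: a task that throws would otherwise leak its permit and eventually stall the submitting loop.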

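My question:

Is a semaphore the best way to manage the number of concurrent prefetchers? Java usually offers higher-level mechanisms that let you avoid low-level concurrency primitives, but a semaphore seems like a really good fit for this particular implementation. I just want to see whether there is a better option.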

Code:

Note 1: entityID + filename uniquely identifies a file and can be ignored.

Note 2: Part numbers (file chunk IDs) start at 0 and increase.

/**
 * Assembles parts of a file into the provided output stream. This implementation
 * utilizes a fixed number of worker threads to pre-fetch data, and then re-assembles
 * the data chunks into the appropriate order when flushing the data to the stream.
 * For example, if a file has 10 chunks, and {@link #mFileFetchConcurrencyLevel} is
 * set to 3, then the first three parts are initially fetched. As parts are written
 * in increasing order, subsequent parts will be fetched. That is, once part 1 is
 * done being written (part 2 and 3 may have finished fetching before or after this,
 * it does not matter) then part 4 will start to be fetched.
 *
 * At the end of the fetch cycle, the output stream will be closed.
 *
 * @param entityID The owning entity
 * @param filename The file requested
 * @param os The stream to which the data will be written
 */
private void assembleFile(String entityID, String filename, OutputStream os) {
    final int nParts = getNumFileParts(entityID, filename);
    final AtomicInteger nextPartToFetch = new AtomicInteger(0);
    final AtomicInteger nextPartToWrite = new AtomicInteger(0);
    final Semaphore permission = new Semaphore(mFileFetchConcurrencyLevel);

    /* trivial case: nothing to do */
    if (nParts == 0) {
        try { os.close(); } catch (Exception e){}
        return;
    }


    while (nextPartToFetch.get() < nParts) {
        /* wait for other threads to finish writing file sections */
        try {
            permission.acquire();
        } catch (InterruptedException e) {
            // (omitting error handling)
            continue;
        }

        /* once permission is granted, start another data pre-fetcher. There are
         * a few steps in this process:
         *  1. Fetch the data
         *  2. Wait until the appropriate time to write the data, since data fetches
         *      may complete out-of-order. Otherwise, the output stream will have
         *      its data written out of order and the file will be corrupted
         *  3. Write the data
         *  4. Close output stream data after all parts are written
         */
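        /* claim the part number in the submitting thread: if each task claimed its
         * own, the loop condition could be re-checked before a running task had
         * incremented the counter, and extra tasks would be submitted for parts
         * that do not exist */
        final int responsiblePart = nextPartToFetch.getAndIncrement();
        mThreadPool.submit(() -> {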
            /* 1. Fetch the data, this may take some time and will block */
            byte[] partData = getFilePart(entityID, filename, responsiblePart);

            synchronized (nextPartToWrite) {
                /* 2. Wait in line to write the data in the appropriate order */
                while (nextPartToWrite.get() != responsiblePart)
                    try { nextPartToWrite.wait(); } catch (InterruptedException e) {}

                /* 3. Write the data */
                try {
                    os.write(partData);
                    os.flush();
                } catch (IOException e) {
                    // (omitting error handling)
                }

                /* this lets other threads know its time to check if they should write data */
                nextPartToWrite.incrementAndGet();
                nextPartToWrite.notifyAll();

                /* we can now allow another prefetch thread to run */
                permission.release();
            }

            /* 4. Close the output stream. The thread with the last part is
             * responsible for closing the stream
             */
            if (responsiblePart == nParts - 1) {
                try {
                    os.close();
                } catch (Exception e) {
                    // (omitting error handling)
                }
            }
        });
    }
}

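Yes, this is a common approach. You can generalize it into a reusable structure, for example a delay queue ordered by priority, or decouple the pieces with message passing to get a distributed version. That said, there are only so many ways to implement the master-worker pattern, and this one is simple enough that it doesn't need a dedicated class in j.u.c.

Does limiting the thread pool size not work for you in this case?

This thread pool is a shared resource for many of the background tasks in the API I'm building, and I want to benefit from the cached threads created by other requests for each task, rather than creating a separate thread pool for every request.

You may need to be careful to make sure the main thread finishes only after all the worker threads have finished. I don't like blocking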
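One way to address the comments about the calling thread outliving its workers, while still using a shared pool, is to have the caller keep a bounded window of `Future`s and consume them oldest-first. The sketch below is an assumption-laden alternative, not something proposed in the thread: `OrderedPrefetchSketch`, `assemble`, and the `fetchPart` function are hypothetical names, and the caller does block while writing. The window size plays the role of the semaphore's permit count, and draining the queue in submission order replaces the `wait`/`notifyAll` handoff.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

final class OrderedPrefetchSketch {

    /**
     * Fetches parts 0..nParts-1 with at most `window` fetches outstanding and
     * writes them to `os` in order. Closes `os` when done or on failure.
     */
    static void assemble(int nParts, int window, IntFunction<byte[]> fetchPart,
                         ExecutorService pool, OutputStream os)
            throws IOException, InterruptedException, ExecutionException {
        if (window < 1) {
            throw new IllegalArgumentException("window must be at least 1");
        }
        Deque<Future<byte[]>> inFlight = new ArrayDeque<>();
        int nextToFetch = 0;
        try (OutputStream out = os) {
            while (nextToFetch < nParts || !inFlight.isEmpty()) {
                // Top up the window; part numbers are claimed by the caller.
                while (nextToFetch < nParts && inFlight.size() < window) {
                    final int part = nextToFetch++;
                    Callable<byte[]> task = () -> fetchPart.apply(part);
                    inFlight.add(pool.submit(task));
                }
                // The head of the queue is always the next part to write.
                out.write(inFlight.remove().get());
            }
            out.flush();
        } finally {
            // On failure, stop any fetches that are still outstanding.
            for (Future<byte[]> f : inFlight) {
                f.cancel(true);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            assemble(5, 3, i -> ("p" + i + ";").getBytes(StandardCharsets.UTF_8), pool, sink);
        } finally {
            pool.shutdown();
        }
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because the caller returns only after the last part is written, there is no question of who closes the stream, and a failed fetch surfaces as an `ExecutionException` instead of leaving later parts waiting forever.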