C# multithreaded file compression

Tags: c#, multithreading, .net-3.5

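I have just started working with threads, and I want to write a simple file compressor. It should create two background threads: one for reading and one for writing. The first should read the file in small chunks and put them into a Queue<KeyValuePair<int, byte[]>>, where the int is the chunkId. The second thread should dequeue the chunks and write them in order (using chunkId) to the output stream (a file that this thread creates at the start).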

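I did that. But I can't understand why, when I open the gzip file after my program finishes, my chunks are mixed up and the file is no longer in its original order.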

public static class Reader
{
    private static readonly object Locker = new object();

    private const int ChunkSize = 1024*1024;

    private static readonly int MaxThreads;
    private static readonly Queue<KeyValuePair<int, byte[]>> ChunksQueue;
    private static int _chunksComplete;

    static Reader()
    {
        MaxThreads = Environment.ProcessorCount;
        ChunksQueue = new Queue<KeyValuePair<int,byte[]>>(MaxThreads);
    }

    public static void Read(string filename)
    {
        _chunksComplete = 0;

        var tRead = new Thread(Reading) { IsBackground = true };
        var tWrite = new Thread(Writing) { IsBackground = true };

        tRead.Start(filename);
        tWrite.Start(filename);

        tRead.Join();
        tWrite.Join();

        Console.WriteLine("Finished");
    }

    private static void Writing(object threadContext)
    {
        var filename = (string) threadContext;

        using (var s = File.Create(filename + ".gz"))
        {
            while (true)
            {
                var dataPair = DequeueSafe();
                if (dataPair.Value == null)
                    return;

                while (dataPair.Key != _chunksComplete)
                {
                    Thread.Sleep(1);
                }

                Console.WriteLine("write chunk {0}", dataPair.Key);

                using (var gz = new GZipStream(s, CompressionMode.Compress, true))
                {
                    gz.Write(dataPair.Value, 0, dataPair.Value.Length);
                }

                _chunksComplete++;
            }
        }
    }

    private static void Reading(object threadContext)
    {
        var filename = (string) threadContext;

        using (var s = File.OpenRead(filename))
        {
            var counter = 0;
            var buffer = new byte[ChunkSize];
            while (s.Read(buffer, 0, buffer.Length) != 0)
            {
                while (ChunksQueue.Count == MaxThreads)
                {
                    Thread.Sleep(1);
                }

                Console.WriteLine("read chunk {0}", counter);

                var dataPair = new KeyValuePair<int, byte[]>(counter, buffer);

                EnqueueSafe(dataPair);

                counter++;
            }

            EnqueueSafe(new KeyValuePair<int, byte[]>(0, null));
        }
    }

    private static void EnqueueSafe(KeyValuePair<int, byte[]> dataPair)
    {
        lock (ChunksQueue)
        {
            ChunksQueue.Enqueue(dataPair);
        }
    }

    private static KeyValuePair<int, byte[]> DequeueSafe()
    {
        while (true)
        {
            lock (ChunksQueue)
            {
                if (ChunksQueue.Count > 0)
                {
                    return ChunksQueue.Dequeue();
                }
            }

            Thread.Sleep(1);
        }
    } 
}

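UPD: I can only use .NET 3.5.

Stream.Read() returns the actual number of bytes it consumed. Use it to limit the size of the chunk the writer writes. Also, since reading and writing happen concurrently, you will need more than one buffer. Try using 4096 as the chunk size.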

Reader:

var buffer = new byte[ChunkSize]; 
int bytesRead = s.Read(buffer, 0, buffer.Length);

while (bytesRead != 0)
{  
   ...
   var dataPair = new KeyValuePair<int, byte[]>(bytesRead, buffer); 
   buffer = new byte[ChunkSize];
   bytesRead = s.Read(buffer, 0, buffer.Length);
}

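PS: Performance can be improved by adding a pool of free data buffers instead of allocating a new buffer each time, and by using events (e.g. ManualResetEvent) to signal that the queue is empty or full instead of using Thread.Sleep().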
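To make the signalling idea concrete, here is a minimal sketch of a bounded queue that blocks instead of polling. Note that it swaps in Monitor.Wait and Monitor.PulseAll rather than the ManualResetEvent mentioned above, because a single monitor covers both the "empty" and the "full" condition; both are available in .NET 3.5. The class name BoundedQueue<T> and its capacity parameter are hypothetical and do not appear in the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper, not part of the question's code.
public sealed class BoundedQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly object _sync = new object();
    private readonly int _capacity;

    public BoundedQueue(int capacity)
    {
        if (capacity < 1)
            throw new ArgumentOutOfRangeException("capacity");
        _capacity = capacity;
    }

    // Blocks the caller while the queue is full.
    public void Enqueue(T item)
    {
        lock (_sync)
        {
            while (_items.Count >= _capacity)
                Monitor.Wait(_sync);   // releases the lock while waiting
            _items.Enqueue(item);
            Monitor.PulseAll(_sync);   // wake a consumer waiting for data
        }
    }

    // Blocks the caller while the queue is empty.
    public T Dequeue()
    {
        lock (_sync)
        {
            while (_items.Count == 0)
                Monitor.Wait(_sync);
            T item = _items.Dequeue();
            Monitor.PulseAll(_sync);   // wake a producer waiting for space
            return item;
        }
    }
}
```

With something like this, the reader thread would call Enqueue and the writer thread would call Dequeue, and neither needs Thread.Sleep(1). The wait conditions are re-checked in a while loop because a woken thread may find that another thread has already changed the queue.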

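While the previous answer does raise a very important point, namely that Stream.Read may fill the buffer with fewer bytes than you requested, your main problem is that you have only one byte[] that you reuse over and over.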

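When your read loop goes to read the second value, it overwrites the byte[] inside the dataPair that you already passed to the queue. You must have a buffer = new byte[ChunkSize]; inside the loop to fix this. You must also record how many bytes were read and write only that many bytes.

You don't need to keep the counter in the pair, because a Queue maintains order; use the int in the pair to store the number of bytes read, as in alexm's example.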
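Putting both points together, the following is a rough sketch of how the reader and writer could look on .NET 3.5. It is not the asker's final code: the class name ChunkedCompressor, the MaxQueued bound, and the choice to use a single GZipStream for the whole output file (instead of one per chunk, as in the question) are assumptions made for illustration. The queue still polls with Thread.Sleep(1), as in the question, to keep the focus on the buffer fix.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Threading;

public static class ChunkedCompressor
{
    private const int ChunkSize = 1024 * 1024;
    private const int MaxQueued = 4; // assumed bound on chunks held in memory

    public static void Compress(string inputPath, string outputPath)
    {
        // Key = number of valid bytes in the buffer, Value = the buffer itself.
        var queue = new Queue<KeyValuePair<int, byte[]>>();
        var reader = new Thread(delegate() { Reading(inputPath, queue); });
        var writer = new Thread(delegate() { Writing(outputPath, queue); });
        reader.IsBackground = true;
        writer.IsBackground = true;
        reader.Start();
        writer.Start();
        reader.Join();
        writer.Join();
    }

    private static void Reading(string path, Queue<KeyValuePair<int, byte[]>> queue)
    {
        using (var s = File.OpenRead(path))
        {
            while (true)
            {
                // A fresh buffer per chunk, so a queued array is never overwritten.
                var buffer = new byte[ChunkSize];
                int bytesRead = s.Read(buffer, 0, buffer.Length);
                if (bytesRead == 0)
                    break;
                Enqueue(queue, new KeyValuePair<int, byte[]>(bytesRead, buffer));
            }
        }
        // A null buffer marks the end of the input.
        Enqueue(queue, new KeyValuePair<int, byte[]>(0, null));
    }

    private static void Writing(string path, Queue<KeyValuePair<int, byte[]>> queue)
    {
        using (var file = File.Create(path))
        using (var gz = new GZipStream(file, CompressionMode.Compress))
        {
            while (true)
            {
                var pair = Dequeue(queue);
                if (pair.Value == null)
                    return;
                // Write only the bytes that Stream.Read actually produced.
                gz.Write(pair.Value, 0, pair.Key);
            }
        }
    }

    private static void Enqueue(Queue<KeyValuePair<int, byte[]>> queue,
                                KeyValuePair<int, byte[]> pair)
    {
        while (true)
        {
            lock (queue)
            {
                if (queue.Count < MaxQueued)
                {
                    queue.Enqueue(pair);
                    return;
                }
            }
            Thread.Sleep(1); // queue full: poll, as in the question's code
        }
    }

    private static KeyValuePair<int, byte[]> Dequeue(Queue<KeyValuePair<int, byte[]>> queue)
    {
        while (true)
        {
            lock (queue)
            {
                if (queue.Count > 0)
                    return queue.Dequeue();
            }
            Thread.Sleep(1); // queue empty: poll
        }
    }
}
```

Because there is one producer and one consumer on a FIFO queue, the chunks arrive at the writer in the order they were read, so no chunkId is needed. The capacity check sits inside the lock so that the count and the enqueue are observed together.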
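You should switch to BlockingCollection with ConcurrentQueue as the underlying collection. You will no longer need the locks or Thread.Sleep(1), because it will wait for data to arrive when none is available.

I forgot to say that I can only use .NET 3.5!

Why not simply read one buffer, then read the next one while compressing and writing the first? You only need one thread, you can use asynchronous I/O, and you don't risk filling the queue with buffers you have read if the compress+write thread takes longer to do its work than the reading thread.

This is just practice. The example has no connection to any real-life case. But thanks for the idea!

A nice trick I learned for Stream.Read is to merge the two Read calls by using for (int bytesRead = s.Read(buffer, 0, buffer.Length); bytesRead != 0; bytesRead = s.Read(buffer, 0, buffer.Length)) { ... } instead of the while loop.

@alexm So... my chunks got mixed up because I read and write at the same time? Am I right?

@matterai, they are kind of mixed because you overwrite the byte[] that you passed into the queue; if you didn't have concurrent reading and writing, you would probably get aw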
Writer:

gz.Write(dataPair.Value, 0, dataPair.Key);
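For readers who are not restricted to .NET 3.5, the BlockingCollection suggestion from the comments could look roughly like the sketch below. BlockingCollection<T> and ConcurrentQueue<T> require .NET 4.0 or later, so this does not help the asker directly. The class name BlockingCompressor and the capacity of 4 chunks are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Threading;

// Requires .NET 4.0 or later; names and capacity are illustrative assumptions.
public static class BlockingCompressor
{
    private const int ChunkSize = 1024 * 1024;

    public static void Compress(string inputPath, string outputPath)
    {
        // Bounded to 4 chunks: Add blocks when full, the consumer blocks when empty.
        using (var chunks = new BlockingCollection<KeyValuePair<int, byte[]>>(
            new ConcurrentQueue<KeyValuePair<int, byte[]>>(), 4))
        {
            var reader = new Thread(() =>
            {
                try
                {
                    using (var s = File.OpenRead(inputPath))
                    {
                        while (true)
                        {
                            var buffer = new byte[ChunkSize]; // fresh buffer per chunk
                            int bytesRead = s.Read(buffer, 0, buffer.Length);
                            if (bytesRead == 0)
                                break;
                            chunks.Add(new KeyValuePair<int, byte[]>(bytesRead, buffer));
                        }
                    }
                }
                finally
                {
                    // Tells the consumer that no more chunks will arrive.
                    chunks.CompleteAdding();
                }
            });
            reader.Start();

            using (var file = File.Create(outputPath))
            using (var gz = new GZipStream(file, CompressionMode.Compress))
            {
                // Ends once CompleteAdding was called and the queue is drained.
                foreach (var pair in chunks.GetConsumingEnumerable())
                    gz.Write(pair.Value, 0, pair.Key);
            }

            reader.Join();
        }
    }
}
```

Here CompleteAdding replaces the null end marker from the question, and the explicit locks and Thread.Sleep(1) calls disappear because the collection does the waiting.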