Partially download and serialize a large file in C#?

c# serialization download

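As part of an upcoming project at my university, I need to write a client that downloads a media file from a server and writes it to the local disk. Since these files can be very large, I need to implement partial downloading and serialization to avoid excessive memory use.

This is what I came up with: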

namespace PartialDownloadTester
{
    using System;
    using System.Diagnostics.Contracts;
    using System.IO;
    using System.Net;
    using System.Text;

    public class DownloadClient
    {
        public static void Main(string[] args)
        {
            var dlc = new DownloadClient(args[0], args[1], args[2]);
            dlc.DownloadAndSaveToDisk();
            Console.ReadLine();
        }

        private WebRequest request;

        // directory of file
        private string dir;

        // full file identifier
        private string filePath;

        public DownloadClient(string uri, string fileName, string fileType)
        {
            this.request = WebRequest.Create(uri);
            this.request.Method = "GET";
            var sb = new StringBuilder();
            sb.Append("C:\\testdata\\DownloadedData\\");
            this.dir = sb.ToString();
            sb.Append(fileName + "." + fileType);
            this.filePath = sb.ToString();
        }

        public void DownloadAndSaveToDisk()
        {
            // make sure directory exists
            this.CreateDir();

            var response = (HttpWebResponse)request.GetResponse();
            Console.WriteLine("Content length: " + response.ContentLength);
            var rStream = response.GetResponseStream();
            int bytesRead = -1;
            do
            {
                var buf = new byte[2048];
                bytesRead = rStream.Read(buf, 0, buf.Length);
                rStream.Flush();
                this.SerializeFileChunk(buf);
            }
            while (bytesRead != 0);
        }

        private void CreateDir()
        {
            if (!Directory.Exists(dir))
            {
                Directory.CreateDirectory(dir);
            }
        }

        private void SerializeFileChunk(byte[] bytes)
        {
            Contract.Requires(!Object.ReferenceEquals(bytes, null));
            FileStream fs = File.Open(filePath, FileMode.Append);
            fs.Write(bytes, 0, bytes.Length);
            fs.Flush();
            fs.Close();
        }
    }
}

For testing purposes, I used the following arguments:

"http://itu.dk/people/janv/mufc_abc.jpg" "mufc_abc" "jpg"

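However, the picture comes out incomplete (only roughly the first 10% looks right), even though the printed content length is 63780, which is the actual size of the image.

So my questions are:

Is this the right way to do partial downloading and serialization, or is there a better/simpler approach?

Is the entire content of the response stream held in client memory? If so, do I need to use HttpWebRequest.AddRange to download the data from the server in parts in order to save client memory?

Why does the serialization fail, leaving me with a corrupted image?

Does using FileMode.Append introduce a lot of overhead? (MSDN states that this option "seeks to the end of the file".)

Thanks in advance.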

Here is a solution provided by Microsoft:

Update 2021-03-16: the original article no longer seems to be available. Here is an archived copy:

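You could definitely simplify your code using the following:

Notice how I write only the number of bytes actually read from the socket to the output file, not the entire 2 KB buffer.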
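The snippet this answer refers to did not survive in this copy. As a minimal sketch of the point it makes (write only the bytes that were actually read), the loop might look roughly like the following. The class name StreamCopySketch and the method name CopyInChunks are hypothetical and not taken from the original answer; the helper works on any pair of streams, so it can be tried without a network connection.

```csharp
using System;
using System.IO;

public static class StreamCopySketch
{
    // Hypothetical helper: copies everything from source to destination
    // through a single reusable buffer and returns the total bytes copied.
    public static long CopyInChunks(Stream source, Stream destination, int chunkSize = 2048)
    {
        var buffer = new byte[chunkSize];
        long total = 0;
        int bytesRead;

        // Read returns 0 only at the end of the stream. Returning fewer bytes
        // than requested is normal for network streams and does not mean "done".
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes that were actually read, not the whole buffer.
            destination.Write(buffer, 0, bytesRead);
            total += bytesRead;
        }

        return total;
    }
}
```

In the question's setting, source would be the stream returned by response.GetResponseStream() and destination a FileStream opened once for the whole download. Only one buffer's worth of data is held in memory at a time. The built-in Stream.CopyTo method performs the same kind of chunked copy.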

I don't know whether this is the source of the problem, but I would change the loop like this:

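const int ChunkSize = 2048;
var buf = new byte[ChunkSize];   // allocate the buffer once, outside the loop
var rStream = response.GetResponseStream();
int bytesRead;                   // declared outside the loop so the condition can see it
do {
    bytesRead = rStream.Read(buf, 0, ChunkSize);
    if (bytesRead > 0) {
        // pass the number of bytes actually read, not the buffer size
        this.SerializeFileChunk(buf, bytesRead);
    }
} while (bytesRead > 0);         // Read returns 0 only at end of stream; short reads are normal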

The serialize method would get an additional argument:

private void SerializeFileChunk(byte[] bytes, int numBytes)

and would then write the right number of bytes:

fs.Write(bytes, 0, numBytes);

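Update:

I don't think it is necessary to close and reopen the file every time. I would also use the using statement, which closes the resources even if an exception occurs. At the end of its block, the using statement calls the resource's Dispose() method, which for file streams in turn calls Close(). The using statement can be applied to any type that implements IDisposable.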

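var buf = new byte[2048];
int bytesRead;
using (var rStream = response.GetResponseStream()) {
    // open the output file once instead of once per chunk
    using (FileStream fs = File.Open(filePath, FileMode.Append)) {
        do {
            bytesRead = rStream.Read(buf, 0, buf.Length);
            fs.Write(buf, 0, bytesRead);   // write only the bytes just read
        } while (...);                     // loop condition elided in the original
    }
}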

The using statement does something like this:

{
    var rStream = response.GetResponseStream();
    try
    {
        // do some work with rStream here.
    } finally {
        if (rStream != null) {
            rStream.Dispose();
        }
    }
}
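To see that a using block disposes its resource even when the body throws, a small self-contained experiment can help. This is only an illustrative sketch; TrackedResource and UsingSketch are hypothetical names introduced here and are not part of the original answer.

```csharp
using System;

// Hypothetical resource that only records whether Dispose was called.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingSketch
{
    // Reports whether the resource was disposed although the using body threw.
    public static bool DisposedDespiteException()
    {
        var resource = new TrackedResource();
        try
        {
            using (resource)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose: we only want to inspect the resource afterwards.
        }

        return resource.IsDisposed;
    }
}
```

The same guarantee is what makes using a good fit for the response stream and the file stream in the download loop: both are released even if a read or write fails partway through.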

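Why don't you simply use client.DownloadFile?

@L.B Probably the professor wants them to learn and understand stream handling.

Maybe the professor wants them to use the Range header (in the question: "I need to implement partial downloading").

Thanks, I had no knowledge of the WebClient class; it looks very handy. I think you are right that the corrupted image is caused by my writing the entire buffer.

Note: MSDN does not say how this is implemented. Does it really download and write the file in parts? (I don't have to use streams. I chose streams because that is how I have learned to do data transfer so far.)

You are right; I did need to pass readBytes as an argument to SerializeFileChunk, so that the fs.Write call handles readBytes rather than the length of the byte array. Thank you very much. (Edit: after introducing the second parameter to SerializeFileChunk, the loop does not necessarily need the change you suggested.)

I added some explanations and moved the creation of the byte buffer before the loop, since creating a new buffer on every iteration makes no sense.

Is there a solution with complete source code?

Page not found. @Elshan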