C#: Download a web page and save it as a UTF-8 text file


I download a web page as shown below, and I want to save it as UTF-8 text. How do I do that?

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    Encoding enc = Encoding.GetEncoding(resp.CharacterSet);
    Encoding utf8 = Encoding.UTF8;
    using (StreamWriter w = new StreamWriter(new FileStream(pathname, FileMode.Create), utf8))
    {
        using (StreamReader r = new StreamReader(resp.GetResponseStream()))
        {
            // This works, but it's bad because you read the whole response into memory:
            string s = r.ReadToEnd();
            w.Write(s);

            // This doesn't work :(
            char[] buffer = new char[1024];
            int n;
            while (!r.EndOfStream)
            {
                n = r.ReadBlock(buffer, 0, 1024);
                w.Write(utf8.GetChars(Encoding.Convert(enc, utf8, enc.GetBytes(buffer))));
            }

            // This means that r.ReadToEnd() is doing the transcoding to UTF-8 differently.
            // But how?!
        }
    }
    return resp.StatusCode;
}


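You can simply use the WebClient class. It supports encodings and is easier to use:

WebClient webClient = new WebClient();
webClient.Encoding = System.Text.Encoding.UTF8;
webClient.DownloadFile(url, "file.txt");

Note that the Encoding property applies to the string methods such as DownloadString. DownloadFile writes the response bytes to disk unchanged, so the file is UTF-8 only if the server already sent UTF-8.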

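Hmm... what exactly goes wrong with your code? In which cases does it misbehave? Do you need to fiddle with encodings at all? Did you have a problem before you added your own encoding handling? I ask because very few people really need to do much work with encodings; most of the time the framework handles them fine by itself.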
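On what is wrong with the loop in the question, three things stand out. The StreamReader is never given enc, so it decodes the response as UTF-8 by default. enc.GetBytes(buffer) encodes the whole 1024-char buffer and ignores n, so the final pass writes stale characters. And the Encoding.Convert round trip is redundant, because the reader already yields decoded chars and the StreamWriter encodes them as UTF-8 on its own. Below is a minimal sketch of a streaming version under those assumptions. PageSaver, TranscodeToUtf8 and Download are hypothetical names, and falling back to UTF-8 when the response reports no charset is my assumption, not something from the question.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageSaver
{
    // Hypothetical helper: decodes input with sourceEncoding and writes it
    // to output as UTF-8, one buffer at a time.
    public static void TranscodeToUtf8(Stream input, Encoding sourceEncoding, Stream output)
    {
        // UTF8Encoding(false) writes no byte order mark.
        // Disposing the reader and writer also closes the underlying streams.
        using (var reader = new StreamReader(input, sourceEncoding))
        using (var writer = new StreamWriter(output, new UTF8Encoding(false)))
        {
            char[] buffer = new char[4096];
            int n;
            // Read returns the number of chars actually read, 0 at end of stream.
            while ((n = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Write only the n chars that were filled in this pass.
                writer.Write(buffer, 0, n);
            }
        }
    }

    // Hypothetical wrapper around the request code from the question.
    public static HttpStatusCode Download(string url, string pathname)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        {
            // Assumption: fall back to UTF-8 when no charset is reported.
            // Encoding.GetEncoding throws ArgumentException for unknown names.
            Encoding enc = string.IsNullOrEmpty(resp.CharacterSet)
                ? Encoding.UTF8
                : Encoding.GetEncoding(resp.CharacterSet);

            using (Stream body = resp.GetResponseStream())
            using (FileStream file = new FileStream(pathname, FileMode.Create))
            {
                TranscodeToUtf8(body, enc, file);
            }
            return resp.StatusCode;
        }
    }
}
```

Calling PageSaver.Download(url, pathname) would then replace the body of the method in the question. Only one buffer of text is held in memory at a time, which was the concern with ReadToEnd.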