How can I tell the loading status of an XML file with HTML Agility Pack in C#?

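I am using HTML Agility Pack in C# to extract email addresses, article titles, and author names from an XML file. I can scrape the data successfully from smaller files (around 100 MB), but with a larger file (500 MB) the extraction fails.

I don't know what I am doing wrong.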

1. How can I tell whether the file has finished loading before I start parsing it?

2. How can I find out the status of the parsing?

This is the code I am using:

    // Load reads and parses the entire file into memory before it returns.
    HtmlAgilityPack.HtmlDocument someDoc = new HtmlAgilityPack.HtmlDocument();
    someDoc.Load(@"C:\Users\Raj\Desktop\pmc_result.xml");

    // SelectNodes returns null, not an empty collection, when nothing matches.
    HtmlNodeCollection articles = someDoc.DocumentNode.SelectNodes("//article/front/article-meta");
    if (articles != null)
    {
        foreach (HtmlNode node in articles)
        {
            HtmlNode emailNode = node.SelectSingleNode("./contrib-group/contrib/address/email");
            HtmlNode titleNode = node.SelectSingleNode("./title-group/article-title");
            HtmlNode surnameNode = node.SelectSingleNode("./contrib-group/contrib/name/surname");
            HtmlNode givenNode = node.SelectSingleNode("./contrib-group/contrib/name/given-names");

            // Skip records that lack any field. The original code reached the same
            // result by swallowing the NullReferenceException in an empty catch block.
            if (emailNode == null || titleNode == null || surnameNode == null || givenNode == null)
            {
                continue;
            }

            string email = emailNode.InnerText;
            string title = titleNode.InnerText;
            string name = surnameNode.InnerText + " " + givenNode.InnerText;

            // Add one row per article: email, then name and title as sub-items.
            ListViewItem item = listView1.Items.Add(email);
            item.SubItems.Add(name);
            item.SubItems.Add(title);
        }
    }

Could someone please help me?

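What do you mean by "extraction fails"? Do you get an exception? Do you get wrong results?

I get an exception, System.OutOfMemoryException: "Exception of type 'System.OutOfMemoryException' was thrown." at someDoc.Load(@"C:\Users\Raj\Desktop\pmc.xml");. However, it works with XmlDocument xDoc = new XmlDocument(); xDoc.Load(@"C:\Users\Raj\Desktop\pmc.xml");. I would still like to know why HAP runs out of memory. Thanks for the reply.

If your application does not have enough memory available, it will naturally fail when you try to load more into memory. I suggest searching the internet for how to increase the available memory or how to investigate a potential memory leak.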
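
One way to avoid holding the whole tree in memory, and to get a rough progress figure at the same time, might be to stream the file with System.Xml.XmlReader instead of HAP. The following is only a minimal sketch and not something proposed in the thread: ArticleMetaStreamer, Read, FirstValue and onProgress are hypothetical names, and it assumes the file is well-formed XML, that the elements are not in an XML namespace, and that matching every article-meta element is acceptable (the original XPath additionally required article/front as parents).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of HTML Agility Pack or the original post.
public static class ArticleMetaStreamer
{
    // Yields one record per <article-meta> element. Only the current element
    // is materialized, so memory use does not grow with the file size.
    // onProgress (optional) receives the approximate fraction of the stream read so far.
    public static IEnumerable<(string Email, string Name, string Title)> Read(
        Stream input, Action<double> onProgress = null)
    {
        // Assumption: a DOCTYPE may be present; ignore it instead of rejecting the file.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "article-meta")
                {
                    // ReadFrom consumes the whole element and leaves the reader on the
                    // node that follows it, so Read() must not be called again here.
                    var meta = (XElement)XNode.ReadFrom(reader);

                    string email = FirstValue(meta, "contrib-group", "contrib", "address", "email");
                    string title = FirstValue(meta, "title-group", "article-title");
                    string surname = FirstValue(meta, "contrib-group", "contrib", "name", "surname");
                    string given = FirstValue(meta, "contrib-group", "contrib", "name", "given-names");

                    // Missing fields come back as null; the caller decides whether to skip them.
                    yield return (email, surname + " " + given, title);

                    if (onProgress != null && input.CanSeek && input.Length > 0)
                    {
                        onProgress((double)input.Position / input.Length);
                    }
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Follows a chain of child element names and returns the text of the first
    // match in document order, or null when there is no match.
    private static string FirstValue(XElement root, params string[] path)
    {
        IEnumerable<XElement> current = new[] { root };
        foreach (string name in path)
        {
            current = current.Elements(name);
        }
        return current.FirstOrDefault()?.Value;
    }
}
```

It could be called inside using (FileStream fs = File.OpenRead(path)) with a foreach over ArticleMetaStreamer.Read(fs, p => ...), adding each record to listView1 as before. The progress value is approximate because XmlReader pulls data from the stream in buffered chunks, so the stream position runs slightly ahead of the element being processed.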