C#: Extracting data from an HTML page source

c# html regex

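I need to extract some data from a website.

I watched this YouTube video and have a rough idea of how to write the code.

Basically, what I want to do is extract the radio button texts ("Very Easy!", "Easy", "Not so easy") from the page source and store them in a list,

then print out the elements of the list.

Here is the C# code I wrote based on the YouTube video: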

using System.Net;
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace ExtractDataFromWebsite
{
    class Program
    {
        static void Main(string[] args)
        {
            List<string> radioOptions = new List<string>();
            WebClient web = new WebClient();

            // download html from certain website
            string html = web.DownloadString("https://docs.google.com/forms/d/1Mout_ImbF9N16EuCiYOxCrL6MbkUVkIEzijO1PAUQ68/viewform?key=pqbhTz7PIHum_4qKEdbUWVg");

            // Verbatim string: a literal double quote is written as "".
            MatchCollection m1 = Regex.Matches(html,
                @"<input\stype=""radio""\sname=""entry\.2362106""\svalue=""(.+?)""\sid=""group_2362106_",
                RegexOptions.Singleline);
            foreach (Match m in m1)
            {
                // Group 1 holds the text captured from the value attribute.
                string radioOption = m.Groups[1].Value;
                radioOptions.Add(radioOption);
            }
            for (int i = 0; i < radioOptions.Count; i++)
                Console.WriteLine(radioOptions[i]);

            Console.ReadKey();
        }
    }
}

Take a look at HtmlAgilityPack. You can load the page source from the WebClient response into a new HtmlDocument and traverse it very easily from there.
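
A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is installed. The sample markup, the `RadioOptionParser` class, and the `ExtractRadioValues` helper are hypothetical names used only for illustration; the real form may be laid out differently, so the XPath query may need adjusting.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is referenced

static class RadioOptionParser
{
    // Hypothetical helper: returns the value attribute of every
    // <input type="radio"> element found in the given markup.
    public static List<string> ExtractRadioValues(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var values = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//input[@type='radio']");
        if (nodes == null)
            return values;

        foreach (HtmlNode node in nodes)
            values.Add(node.GetAttributeValue("value", string.Empty));

        return values;
    }
}

static class Program
{
    static void Main()
    {
        // Hypothetical markup standing in for the string returned by
        // WebClient.DownloadString in the question.
        string html =
            "<div>" +
            "<input type=\"radio\" name=\"entry.2362106\" value=\"Very Easy!\" id=\"group_2362106_1\">" +
            "<input type=\"radio\" name=\"entry.2362106\" value=\"Easy\" id=\"group_2362106_2\">" +
            "<input type=\"radio\" name=\"entry.2362106\" value=\"Not so easy\" id=\"group_2362106_3\">" +
            "</div>";

        foreach (string value in RadioOptionParser.ExtractRadioValues(html))
            Console.WriteLine(value);
    }
}
```

Because the parser works on the element tree rather than the raw text, this does not depend on attribute order or on the exact whitespace inside the tag.
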
Try using this regex as the value extractor:
MatchCollection m1 = Regex.Matches(html, "<input type=\"radio\".+?value=\"(.+?)\".+?\">"
            , RegexOptions.Singleline);
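
To see what that pattern captures, here is a small self-contained sketch that runs it against a hypothetical snippet of markup instead of the live page. The sample HTML, the `RegexRadioDemo` class, and the `ExtractRadioValues` helper are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexRadioDemo
{
    // Hypothetical helper: applies the pattern above and collects capture group 1.
    public static List<string> ExtractRadioValues(string html)
    {
        var values = new List<string>();
        MatchCollection matches = Regex.Matches(
            html,
            "<input type=\"radio\".+?value=\"(.+?)\".+?\">",
            RegexOptions.Singleline);

        foreach (Match m in matches)
            values.Add(m.Groups[1].Value); // text between value=" and the next "

        return values;
    }

    static void Main()
    {
        // Hypothetical markup; the real page source may differ.
        string html =
            "<input type=\"radio\" name=\"entry.2362106\" value=\"Very Easy!\" id=\"group_2362106_1\">\n" +
            "<input type=\"radio\" name=\"entry.2362106\" value=\"Easy\" id=\"group_2362106_2\">\n" +
            "<input type=\"radio\" name=\"entry.2362106\" value=\"Not so easy\" id=\"group_2362106_3\">";

        foreach (string value in ExtractRadioValues(html))
            Console.WriteLine(value);
    }
}
```

Note that the pattern assumes `type="radio"` directly follows `<input ` with a single space, that `value` appears after it, and that the tag ends with a quoted attribute followed by `>`. If the page orders its attributes differently, the pattern will not match.
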

I would suggest reading this article.