How to extract a URL from a string in C#


I have this string:

 "<figure><img
 src='http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg'
 href='JavaScript:void(0);' onclick='return takeImg(this)'
 tabindex='1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>"

All of my strings have this same format, so I need to get the substring between src= and href, but I don't know how to do it. Thanks.

You can use a regular expression:

var src = Regex.Match("the string", "<img.+?src=[\"'](.+?)[\"'].*?>", RegexOptions.IgnoreCase).Groups[1].Value;
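
A slightly more defensive take on the regex above, as a sketch only: it checks `Match.Success` before reading the group, and it adds `RegexOptions.Singleline` so the pattern can still match when the tag is wrapped across several lines. The names `ImgSrcRegex` and `TryExtractSrc` are made up for illustration, and `Singleline` is an assumption about the input rather than part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImgSrcRegex
{
    // Hypothetical helper: tries to read the src value of the first <img> tag.
    // Returns false (and an empty string) when the pattern does not match.
    public static bool TryExtractSrc(string html, out string src)
    {
        Match m = Regex.Match(
            html,
            "<img.+?src=[\"'](.+?)[\"'].*?>",
            // Singleline lets '.' match newlines (assumption: the tag may span lines).
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        src = m.Success ? m.Groups[1].Value : string.Empty;
        return m.Success;
    }
}
```

Usage would look like `if (ImgSrcRegex.TryExtractSrc(html, out string src)) { ... }`, which avoids silently working with an empty string when nothing matched.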


If your string is always in the same format, you can do this:

string input = "<figure><img src='http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg' href='JavaScript:void(0);' onclick='return takeImg(this)' tabindex='1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>";
// The link sits between the first two ' characters, so find both and take what is between them:
int start = input.IndexOf('\'') + 1;      // first character after the opening quote
int end = input.IndexOf('\'', start);     // position of the closing quote
input = input.Substring(start, end - start);
// input is now: http://myphotos.net/image.ashx?type=2&image=Images\2\9\11\12\3\8\4\7\685621455625.jpg

return input;


In general you should use an HTML/XML parser when extracting values from HTML, but for a limited string like this one a regular expression is fine:

string url = Regex.Match(htmlString, @"src='(.*?)'").Groups[1].Value;


If you are parsing HTML, don't use string methods; use a real HTML parser such as HtmlAgilityPack:

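Here q is your string. I look up the index of the attribute you want (src = '), remove the first few characters (7, including the spaces), and then find the end of the text by searching for the next '.

Instead of hard-coding the number of characters to remove, you could use .IndexOf to work out how many to drop.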

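        // Sample input; note that this variant has spaces around '=' in src and href.
        string q =
            "<figure><img src = 'http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg' href = 'JavaScript:void(0);' onclick = 'return takeImg(this)' " +
            "tabindex = '1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>";
        string z = q.Substring(q.IndexOf("src = '")); // cut everything before the attribute
        z = z.Substring(7);                            // drop the 7 characters of "src = '"
        z = z.Substring(0, z.IndexOf("'"));            // keep everything up to the closing quote
        MessageBox.Show(z);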


This is certainly not the most elegant way (have a look at the other answers :))
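
The remark about not hard-coding the offset could be sketched roughly as follows. This is only an illustration: `AttributeSlicer` and `TryGetValueAfter` are hypothetical names, and the sketch assumes the value is delimited by single quotes as in the sample string.

```csharp
using System;

public static class AttributeSlicer
{
    // Hypothetical helper: returns the text between `marker` and the next
    // single quote. The offset comes from marker.Length, not a literal 7.
    public static bool TryGetValueAfter(string text, string marker, out string value)
    {
        value = string.Empty;

        int markerAt = text.IndexOf(marker, StringComparison.Ordinal);
        if (markerAt < 0) return false;           // marker not present

        int start = markerAt + marker.Length;     // first character of the value
        int end = text.IndexOf('\'', start);      // closing quote
        if (end < 0) return false;                // value is not terminated

        value = text.Substring(start, end - start);
        return true;
    }
}
```

Called as `AttributeSlicer.TryGetValueAfter(q, "src = '", out string url)`, it follows the same steps as the answer but reports failure instead of throwing when the marker or closing quote is missing.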

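Comments:

You can use HtmlAgilityPack, which parses HTML well. It is usually more stable than matching with a regex.

Possible duplicate of

char quote = '\''; string url = (sourceString + quote).Split(quote)[1];

Using this method requires removing the line breaks.

@m.rogalski What do you mean? I don't understand.

I mean that if your string contains line breaks, as in the question, this method will be useless. You can check the link I posted.

This should be specified in the answer.

Does this extract all URLs from an HTML page?

It extracts the URLs from the links and images of the given HTML.
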
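
The quote-splitting one-liner from the comments can be wrapped up roughly like this. It is a sketch under the assumption that the URL is the first single-quoted value in the string; `QuoteSplit` and `FirstQuoted` are hypothetical names.

```csharp
using System;

public static class QuoteSplit
{
    // Hypothetical helper: returns the text between the first and second
    // single quote. Appending a quote before splitting guarantees that
    // index 1 exists, so input without any quote yields an empty string.
    public static string FirstQuoted(string text)
    {
        const char quote = '\'';
        string[] parts = (text + quote).Split(quote);
        return parts[1];
    }
}
```

This relies entirely on the position of the first quote, so it only suits input where src is known to be the first quoted attribute.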
string str = "<figure><img src='http://myphotos.net/image.ashx?type=2&image=Images\\2\\9\\11\\12\\3\\8\\4\\7\\685621455625.jpg' href='JavaScript:void(0);' onclick='return takeImg(this)' tabindex='1' class='myclass' width='55' height='66' alt=\"myalt\"></figure>";

// Start just after the opening quote of src; end at the quote that precedes href.
int pFrom = str.IndexOf("src='") + "src='".Length;
int pTo = str.LastIndexOf("' href");

string url = str.Substring(pFrom, pTo - pFrom);

var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);  // html is your string
var linksAndImages = doc.DocumentNode.SelectNodes("//a/@href | //img/@src");
var allSrcList = linksAndImages
    .Select(node => node.GetAttributeValue("src", "[src not found]"))
    .ToList();
