C#: How can I escape the double quote in "<a href="?
Tags: c#, .net, winforms, html-agility-pack
I need to get a part of this line:
<a href="/setprefs?suggon=2&prev=https://www.test.com/search?q%3D%2Band%2B%26espv%3D2%26biw%3D960%26bih%3D489%26source%3Dlnms%26tbm%3Disch%26sa%3DX%26ei%3DYrxxVb-hJqac7gba0YOgDQ%26ved%3D0CAYQ_AUoAQ&sig=0_seDQVVTDQQx1hvN3BRktZNFc9Ew%3D" style="left:-1000em;position:absolute">Screen-reader users, click here to turn off ggg Instant.</a>
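I need the part in the middle, i.e. the value of the href attribute:
/setprefs?suggon=2&prev=https://www.test.com/search?q%3D%2Band%2B%26espv%3D2%26biw%3D960%26bih%3D489%26source%3Dlnms%26tbm%3Disch%26sa%3DX%26ei%3DYrxxVb-hJqac7gba0YOgDQ%26ved%3D0CAYQ_AUoAQ&sig=0_seDQVVTDQQx1hvN3BRktZNFc9Ew%3D
I also tried using HtmlAgilityPack, but that gave me only one link.
When I browse to the page and view its source, then search and filter for the word image or images, I get more than 350 results.
I also tried this solution: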
HtmlAgilityPack.HtmlWeb hw = new HtmlAgilityPack.HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load("https://www.test.com");
foreach (HtmlAgilityPack.HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    string hrefValue = link.GetAttributeValue("href", string.Empty);
    if (!newHtmls.Contains(hrefValue) && hrefValue.Contains("images"))
        newHtmls.Add(hrefValue);
}
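But it did not give me the results I need.
I forgot to mention: I copied the page's view-source content into richTextBox1, and then I read the text in richTextBox1 line by line, so maybe that is why I cannot get the results I need.
Another HtmlAgilityPack attempt: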
var document = new HtmlWeb().Load(url);
var urls = document.DocumentNode.Descendants("img")
.Select(e => e.GetAttributeValue("src", null))
.Where(s => !String.IsNullOrEmpty(s));
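Based on your input, EndsWith does not help, because the input actually ends with </a> rather than with a quote. Your next best option is to store the position of href=" and then look for the next occurrence of " starting from that stored position.
With the double quotes escaped as \", the line-by-line check looks like this: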
for (int i = 0; i < richTextBox1.Lines.Length; i++)
{
    if (richTextBox1.Lines[i].StartsWith("<a href=\"") &&
        richTextBox1.Lines[i].EndsWith("\""))
    {
        listBox1.Items.Add(richTextBox1.Lines[i]);
    }
}
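To make the point about EndsWith concrete, here is a minimal sketch that runs the same two checks on a single line. The `line` value is a shortened, made-up stand-in for the real line from the page source, so only the shape of the input is taken from the question.

```csharp
using System;

// Shortened, hypothetical stand-in for the anchor line from the page source.
string line = "<a href=\"/setprefs?suggon=2\" style=\"left:-1000em\">click here</a>";

// The escaped quote \" is how a literal " is written inside a regular string.
bool startsWithAnchor = line.StartsWith("<a href=\"", StringComparison.Ordinal);
bool endsWithQuote = line.EndsWith("\"", StringComparison.Ordinal);
bool endsWithCloseTag = line.EndsWith("</a>", StringComparison.Ordinal);

Console.WriteLine($"StartsWith(<a href=\"): {startsWithAnchor}");
Console.WriteLine($"EndsWith(\"): {endsWithQuote}");
Console.WriteLine($"EndsWith(</a>): {endsWithCloseTag}");
```

Because the line ends with the closing tag and not with a quote, the combined `&&` condition is never true for input of this shape, which would explain why nothing is added to the list box.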
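Better than searching with IndexOf would be to use an actual HTML parser (may I recommend HtmlAgilityPack?).
Comments:
It should be like this... Please show an input value. Doesn't it end with ">"?
As Daniel said, an input value is needed. BTW, for this kind of thing you really should look at HtmlAgilityPack.
If you need both expressions to be true, it should also be && rather than ||.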
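The HTML-parser suggestion could be sketched roughly as follows, parsing text that is already in memory instead of downloading the page again. It assumes the HtmlAgilityPack NuGet package is referenced. `GetLinkTargets` and `pageSource` are hypothetical names introduced for illustration, and the sample markup is made up.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper: returns the href value of every <a> element in the markup.
List<string> GetLinkTargets(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html); // parse a string; no network access involved

    var targets = new List<string>();
    // SelectNodes can return null (rather than an empty collection) when nothing matches.
    var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
    if (anchors == null) return targets;

    foreach (HtmlNode anchor in anchors)
    {
        string href = anchor.GetAttributeValue("href", string.Empty);
        if (href.Length > 0) targets.Add(href);
    }
    return targets;
}

// Made-up sample; in the WinForms scenario this could be richTextBox1.Text.
string pageSource = "<div><a href=\"/images?q=1\" style=\"left:-1000em\">one</a> <a name=\"x\">no href</a></div>";
foreach (string target in GetLinkTargets(pageSource))
    Console.WriteLine(target);
```

In a WinForms project the short name `HtmlDocument` can clash with `System.Windows.Forms.HtmlDocument`, so fully qualifying it as `HtmlAgilityPack.HtmlDocument`, as the question's code does, may be necessary.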
The IndexOf approach, applied to the sample input:
var input = @"<a href=""/setprefs?suggon=2&prev=https://www.test.com/search?q%3D%2Band%2B%26espv%3D2%26biw%3D960%26bih%3D489%26source%3Dlnms%26tbm%3Disch%26sa%3DX%26ei%3DYrxxVb-hJqac7gba0YOgDQ%26ved%3D0CAYQ_AUoAQ&sig=0_seDQVVTDQQx1hvN3BRktZNFc9Ew%3D"" style=""left:-1000em;position:absolute"">Screen-reader users, click here to turn off ggg Instant.</a>";
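var needle = @"href=""";
// Locate the start of the href attribute.
var start = input.IndexOf(needle);
if (start != -1)
{
    // Skip past href=" so that start points at the first character of the value.
    start += needle.Length;
    // Find the closing quote, searching from the start of the value.
    var end = input.IndexOf('"', start);
    if (end != -1)
    {
        // final result (Dump() exists only in LINQPad; Console.WriteLine works anywhere):
        var href = input.Substring(start, end - start);
        Console.WriteLine(href);
    }
}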
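Since the page source reportedly contains hundreds of matches, the same IndexOf search can be repeated over the whole text instead of being applied line by line. The following is only a sketch under that assumption: `ExtractHrefs` is a hypothetical helper and `sample` is made-up text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects the value of every href="..." found in the text.
List<string> ExtractHrefs(string text)
{
    const string needle = "href=\"";
    var results = new List<string>();
    int position = 0;
    while (true)
    {
        // Find the next href=" at or after the current position.
        int start = text.IndexOf(needle, position, StringComparison.OrdinalIgnoreCase);
        if (start == -1) break;
        start += needle.Length;

        // The value runs up to the next double quote.
        int end = text.IndexOf('"', start);
        if (end == -1) break; // unterminated attribute: stop searching

        results.Add(text.Substring(start, end - start));
        position = end + 1; // continue after the closing quote
    }
    return results;
}

// Made-up sample; the second anchor does not start its line.
string sample = "<a href=\"/images?q=1\">one</a>\n<p>text</p> <a href=\"/search?q=2\">two</a>";
foreach (string href in ExtractHrefs(sample))
    Console.WriteLine(href);
```

Searching the full text also finds anchors that do not begin at the start of a line, which a per-line StartsWith check would miss. The values are returned exactly as they appear in the markup, without HTML decoding, and any href attribute is matched, not only those on <a> elements.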