C#: How do I get all the text from a string in a loop, and not just once?
I have the following code:
private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
{
    BackgroundWorker worker = sender as BackgroundWorker;

    // Request the page ('url' is declared elsewhere in the class).
    WebRequest request = WebRequest.Create(url);
    request.Method = "GET";
    WebResponse response = request.GetResponse();

    // Read the whole response body into a single string.
    Stream stream = response.GetResponseStream();
    StreamReader reader = new StreamReader(stream);
    string content = reader.ReadToEnd();
    reader.Close();
    response.Close();
}
Now I have two functions:
// Note: 'index', 'profileName' and 'profileNameText' are fields declared elsewhere in the class.
private string GetProfileNames(string text)
{
    string startTag = "<a href='/profile/";
    string endTag = "'>";
    int startTagWidth = startTag.Length;
    int endTagWidth = endTag.Length;
    index = 0;
    while (true)
    {
        index = text.IndexOf(startTag, index);
        if (index == -1)
        {
            break;
        }
        // else more to do - index is now positioned at the first character of startTag
        int start = index + startTagWidth;
        index = text.IndexOf(endTag, start + 1);
        if (index == -1)
        {
            break;
        }
        // found the endTag; store the text between the two tags
        profileName = text.Substring(start, index - start);
    }
    return profileName;
}

private string GetTextFromProfile(string text)
{
    string str = "<span class=\"message-text\">";
    string startTag = str;
    string endTag = "<";
    int startTagWidth = startTag.Length;
    int endTagWidth = endTag.Length;
    index = 0;
    while (true)
    {
        index = text.IndexOf(startTag, index);
        if (index == -1)
        {
            break;
        }
        // else more to do - index is now positioned at the first character of startTag
        int start = index + startTagWidth;
        index = text.IndexOf(endTag, start + 1);
        if (index == -1)
        {
            break;
        }
        // found the endTag; store the text between the two tags
        profileNameText = text.Substring(start, index - start);
    }
    return profileNameText;
}
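Both loops above assign to a single variable on every pass, so by the time they return only the last match is left. A minimal sketch of one way around that, assuming it is acceptable to return a list instead of one string: append each match to a List<string> inside the loop. The class name TagScanner and the method name ExtractAll are made up for this illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class TagScanner
{
    // Returns every substring of 'text' that sits between startTag and endTag,
    // in the order the matches appear. (Hypothetical helper for illustration.)
    public static List<string> ExtractAll(string text, string startTag, string endTag)
    {
        var results = new List<string>();
        int index = 0;
        while (true)
        {
            // Find the next opening tag at or after the current position.
            index = text.IndexOf(startTag, index, StringComparison.Ordinal);
            if (index == -1)
            {
                break;
            }

            int start = index + startTag.Length;
            int end = text.IndexOf(endTag, start, StringComparison.Ordinal);
            if (end == -1)
            {
                break;
            }

            // Keep this match instead of overwriting the previous one.
            results.Add(text.Substring(start, end - start));

            // Continue searching just past the closing tag.
            index = end + endTag.Length;
        }
        return results;
    }
}
```

With the tags from the question, the calls would look like TagScanner.ExtractAll(content, "<a href='/profile/", "'>") for the names and TagScanner.ExtractAll(content, "<span class=\"message-text\">", "<") for the texts.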
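So I need these two functions to loop over the content, and on each iteration I should get a string such as string t = "LipazD hello world".
The next iteration would give: "Daniel how are you?"
The functions work: the first gets the profile name and the second gets the text. But I don't know how to make the iteration loop so that it all works together.
Then, once it has finished looping over the content and has collected every profile name along with the text for each one, I need to discard the content and download fresh content, then run the functions again on it, and so on, over and over.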
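A rough sketch of what that could look like with plain string searching, assuming each profile link in the page is followed by its message-text span: scan the content once, read a name, then read the next message after it, and repeat until either search fails. The names ChatParser, TryRead, GetMessages and Poll are hypothetical, and the download step is passed in as a delegate so the sketch does not depend on how the page is fetched.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ChatParser
{
    // Tag strings taken from the question's code.
    const string NameStart = "<a href='/profile/";
    const string NameEnd = "'>";
    const string TextStart = "<span class=\"message-text\">";
    const string TextEnd = "<";

    // Reads the substring between startTag and endTag, searching from 'from'.
    // On success, 'next' is the position just past endTag.
    static bool TryRead(string text, string startTag, string endTag, int from,
                        out string value, out int next)
    {
        value = string.Empty;
        next = from;
        int open = text.IndexOf(startTag, from, StringComparison.Ordinal);
        if (open == -1) return false;
        int start = open + startTag.Length;
        int end = text.IndexOf(endTag, start, StringComparison.Ordinal);
        if (end == -1) return false;
        value = text.Substring(start, end - start);
        next = end + endTag.Length;
        return true;
    }

    // Returns one "name message" string per chat entry, in page order.
    public static List<string> GetMessages(string content)
    {
        var lines = new List<string>();
        int pos = 0;
        // Read a name, then the first message text that follows it.
        while (TryRead(content, NameStart, NameEnd, pos, out string name, out pos)
            && TryRead(content, TextStart, TextEnd, pos, out string message, out pos))
        {
            lines.Add(name + " " + message);
        }
        return lines;
    }

    // Downloads and parses fresh content 'rounds' times, pausing between rounds.
    public static void Poll(Func<string> download, int rounds, TimeSpan delay,
                            Action<List<string>> handle)
    {
        for (int i = 0; i < rounds; i++)
        {
            string content = download();   // the old content is simply replaced
            handle(GetMessages(content));
            if (i < rounds - 1) Thread.Sleep(delay);
        }
    }
}
```

From the DoWork handler this might be called as ChatParser.Poll(() => new WebClient().DownloadString(url), 10, TimeSpan.FromSeconds(5), lines => { /* report progress */ }). The round count and the five second delay are arbitrary choices for the example.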
HtmlDocument doc = new HtmlDocument();
WebClient wc = new WebClient();
// LoadHtml parses an HTML string; Load expects a file path or a stream.
doc.LoadHtml(wc.DownloadString("http://yourUri.com"));
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//span[@class='message-profile-name']"))
{
    // etc.
}
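But I think the message profile name and the message text are wrapped in a parent element. I'd suggest looping over that element and then getting the child profile-name and message-text span contents.

I don't understand where your problem comes from, but can't you look at ways of parsing HTML?

As @CodeCaster said: I'd also recommend using HTML Agility Pack and combining it with the WebClient.DownloadString method.

I agree with @CodeCaster. Parsing HTML/XML by hand is almost never a good idea. There are plenty of powerful and efficient libraries for this.

I don't know how to use HTML Agility Pack to read repeatedly, and I don't know how to combine it with my code to parse the content and rebuild it. I don't understand all the symbols, nodes and so on in HTML Agility Pack.

I'd think using HTML Agility Pack should be easy: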
var wc = new WebClient();
wc.DownloadStringCompleted += (s, e) =>
{
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(e.Result);
    var link = doc.DocumentNode
        .SelectSingleNode("//span[@class='message-profile-name']")
        .Element("a")
        .Attributes["href"].Value;
};
wc.DownloadStringAsync(new Uri("http://chatroll.com/rotternet"));
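SelectSingleNode returns only the first matching span, which is the same "just once" problem as in the question. A hedged sketch of looping over every message with SelectNodes instead, assuming (as suggested in the comments) that each profile-name span and its message-text span share a parent element of their own. The class name ChatScraper and the method GetMessages are made up for this example, and the real page markup may differ.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ChatScraper
{
    // Returns one "name message" string per chat entry found in the HTML.
    public static List<string> GetMessages(string html)
    {
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);

        var lines = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var nameSpans = doc.DocumentNode.SelectNodes("//span[@class='message-profile-name']");
        if (nameSpans == null)
        {
            return lines;
        }

        foreach (HtmlNode nameSpan in nameSpans)
        {
            HtmlNode anchor = nameSpan.Element("a");

            // Assumption: the message text lives under the same parent as the name span.
            HtmlNode textSpan = nameSpan.ParentNode
                .SelectSingleNode(".//span[@class='message-text']");
            if (anchor == null || textSpan == null)
            {
                continue;
            }

            // The question takes the name from the href, e.g. /profile/LipazD.
            string href = anchor.GetAttributeValue("href", "");
            string name = href.StartsWith("/profile/", StringComparison.Ordinal)
                ? href.Substring("/profile/".Length)
                : href;

            string message = HtmlEntity.DeEntitize(textSpan.InnerText).Trim();
            lines.Add(name + " " + message);
        }
        return lines;
    }
}
```

Inside the DownloadStringCompleted handler this would be called as ChatScraper.GetMessages(e.Result). If the two spans turn out not to share a dedicated parent, the relative XPath would match the first message for every name, so the selector would need adjusting to the actual markup.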