C#: How can I find website links with a regex?

Hi,

I'm looking for a regex that can find website links like the following:

www.yahoo.com
yahoo.com
http://www.yahoo.com
http://yahoo.com
yahoo.jp ( or any domain)
http://yahoo.fr

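Is there any way to find them using a regular expression?

This regex from daringfireball.net should cover most of your needs. I'm not sure about a bare domain.tld, though, because that is very ambiguous.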

(?xi)
\b
(                           # Capture 1: entire matched URL
  (?:
    [a-z][\w-]+:                # URL protocol and colon
    (?:
      /{1,3}                        # 1-3 slashes
      |                             #   or
      [a-z0-9%]                     # Single letter or digit or '%'
                                    # (Trying not to match e.g. "URI::Escape")
    )
    |                           #   or
    www\d{0,3}[.]               # "www.", "www1.", "www2." … "www999."
    |                           #   or
    [a-z0-9.\-]+[.][a-z]{2,4}/  # looks like domain name followed by a slash
  )
  (?:                           # One or more:
    [^\s()<>]+                      # Run of non-space, non-()<>
    |                               #   or
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
  )+
  (?:                           # End with:
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
    |                                   #   or
    [^\s`!()\[\]{};:'".,<>?«»“”‘’]        # not a space or one of these punct chars
  )
)

For more details on it, see the original write-up on daringfireball.net.
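To make the pattern concrete, here is a minimal sketch of how it might be run from C#. The class and method names (`UrlFinder`, `FindUrls`) are made up for illustration, and the pattern is the one above with its comments trimmed. Because the pattern starts with `(?xi)`, no extra `RegexOptions` should be needed; inside a verbatim string the double quote has to be written as `""`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlFinder
{
    // Same pattern as above, comments trimmed. (?xi) enables free-spacing
    // and case-insensitive matching inline.
    private const string Pattern = @"(?xi)
\b
(
  (?:
    [a-z][\w-]+:(?:/{1,3}|[a-z0-9%])
    |
    www\d{0,3}[.]
    |
    [a-z0-9.\-]+[.][a-z]{2,4}/
  )
  (?:
    [^\s()<>]+
    |
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)
  )+
  (?:
    \(([^\s()<>]+|(\([^\s()<>]+\)))*\)
    |
    [^\s`!()\[\]{};:'"".,<>?«»“”‘’]
  )
)";

    private static readonly Regex UrlRegex = new Regex(Pattern);

    // Returns the text of every URL-like match, in order of appearance.
    public static List<string> FindUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in UrlRegex.Matches(text))
        {
            urls.Add(m.Groups[1].Value); // capture 1 is the entire matched URL
        }
        return urls;
    }
}
```

With this sketch, `UrlFinder.FindUrls("See http://yahoo.fr, or www.yahoo.com.")` should yield `http://yahoo.fr` and `www.yahoo.com`, with the trailing comma and period left out. A bare `yahoo.com` is not matched, because the pattern needs a protocol, a `www.` prefix, or a slash after the domain.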

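I'll throw an alternative out here rather than a regex. Take a look at an HTML parser such as Html Agility Pack; your case would look like this: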

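using HtmlAgilityPack;

var doc = new HtmlDocument();
doc.Load("file.htm");
// SelectNodes returns null when nothing matches, so guard against that
var links = doc.DocumentNode.SelectNodes("//a[contains(@href, 'yahoo')]");
if (links != null)
{
  foreach (HtmlNode link in links)
  {
    var href = link.GetAttributeValue("href", "");
    // href is a URL that contains the word `yahoo`, do something with it
  }
}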

It doesn't really answer the question as you wrote it; I'm just keeping your options open, as it were.

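I used it and it works fine. But there is one small problem: how can I get the text that was matched? I used MatchCollection mc18 = Regex.Matches(text, regexOption, RegexOptions.IgnoreCase); what do I need to do to get the text?

Are you trying to replace these occurrences, or just find them?

One more question: if the link is between {}, like {www.yahooo.com} or {www.yahooo.com}, how can I find it?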
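As a rough sketch of what the comments above ask about: each `Match` in the `MatchCollection` exposes the matched text through `Value` and its position through `Index`, and `Regex.Replace` covers the replace case. The pattern below is a simplified stand-in, an assumption for illustration rather than the full pattern from the answer; it only handles links that start with `http://`, `https://`, or `www.`. The class and method names are hypothetical as well.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchTextDemo
{
    // Simplified stand-in pattern (an assumption, not the full pattern from the answer):
    // a protocol or "www." prefix, a run without whitespace, brackets, or braces,
    // and a final character that is not trailing punctuation.
    public const string RegexOption =
        @"\b(?:https?://|www\d{0,3}\.)[^\s()<>{}]+[^\s`!()\[\]{};:'"".,<>?]";

    // Returns the matched text of every link found in the input.
    public static string[] GetMatchedText(string text)
    {
        MatchCollection mc = Regex.Matches(text, RegexOption, RegexOptions.IgnoreCase);
        var found = new string[mc.Count];
        for (int i = 0; i < mc.Count; i++)
        {
            Match m = mc[i];
            found[i] = m.Value; // the matched text; m.Index is where it starts
        }
        return found;
    }

    // Replaces every link with a fixed placeholder instead of just finding it.
    public static string HideLinks(string text)
    {
        return Regex.Replace(text, RegexOption, "[link]", RegexOptions.IgnoreCase);
    }
}
```

Since braces are excluded both from the run and from the final character, a link wrapped in `{}` should come back without the braces: `GetMatchedText("{www.yahooo.com} and {http://yahoo.fr}")` is expected to return `www.yahooo.com` and `http://yahoo.fr`.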