Java: get the text inside an HTML tag, given the tag name and its attributes

java regex

I have a string like this:

  <h3 class="media__title"> 
  <a class="media__link" href="/news/world-europe41644527" rev="video|headline">
  The equestrian champion with no legs                                                         
  </a> </h3>

But I still haven't made any progress. Can anyone tell me what I should change in this regex pattern?

Try this:

String regex = "<h3 (.*)>((.|\\s)+?)<\\/h3>";

The main problem with your approach is that the . character does not match line terminators by default.
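As a minimal sketch of that point, the snippet below checks the same pattern against a two-line string with and without Pattern.DOTALL. The class name DotAllDemo and the helper matches are illustrative names, not part of the original question or answer.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Returns true if the whole input matches the regex under the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String twoLines = "first\nsecond";
        // Default mode: '.' stops at the line terminator, so there is no full match.
        System.out.println(matches(".+", 0, twoLines));              // prints false
        // DOTALL mode: '.' also matches '\n'.
        System.out.println(matches(".+", Pattern.DOTALL, twoLines)); // prints true
    }
}
```

This is why a pattern that relies on . alone fails once the tag content spans several lines, as it does in the string from the question.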

Explanation:

<h3 (.*)> matches an opening h3 tag together with all of its attributes (you could also use a different pattern if you are interested in the attributes themselves).

((.|\s)+?) matches everything inside the h3 tag. (.|\s) is meant to match any character: . covers everything except line terminators, and \s covers whitespace, including the \n and \r line terminators that . skips.

<\/h3> matches the closing h3 tag. The \/ escape is harmless but not required in Java, which has no regex delimiters.

Keep in mind that the group you are looking for is now the second group, not the first.
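To make the group numbering concrete, here is a sketch that applies the pattern from this answer to the string from the question and reads group 2. The class and method names are illustrative. The final tag-stripping step is an added assumption, not part of the answer: group 2 holds everything between the h3 tags, including the nested a element, so the sketch removes anything that looks like a tag to leave only the text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class H3TextExtractor {
    // The answer's pattern written as a Java string literal (backslashes doubled).
    static final Pattern H3 = Pattern.compile("<h3 (.*)>((.|\\s)+?)<\\/h3>");

    // Returns the text inside the first <h3 ...>...</h3>, or null if none is found.
    static String textOf(String html) {
        Matcher m = H3.matcher(html);
        if (!m.find()) {
            return null;
        }
        String inner = m.group(2); // group 1 holds the attributes, group 2 the content
        // Assumption for illustration: drop nested tags such as <a ...> and </a>.
        return inner.replaceAll("<[^>]+>", "").trim();
    }

    public static void main(String[] args) {
        String html = "<h3 class=\"media__title\">\n"
                + "<a class=\"media__link\" href=\"/news/world-europe41644527\" rev=\"video|headline\">\n"
                + "The equestrian champion with no legs\n"
                + "</a> </h3>";
        System.out.println(textOf(html)); // prints The equestrian champion with no legs
    }
}
```

On long inputs, a repeated alternation such as (.|\s)+? can be slow or even overflow the stack in Java, so .+? compiled with Pattern.DOTALL is a common alternative.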

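How can I supply an HTML attribute and its value to this regex pattern? I want to match h3 tags that have the attribute

class="media__title"

Thanks.

Something like this? ((.|\s)+?) Or, if you want to match all h3 tags that have class="media__title" and/or other attributes, try the following: (.|\s)+?)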

// Compile with Pattern.DOTALL when the content spans several lines.
String regex = "<h3 class=\"media__title\">(.+?)</h3>";
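One possible way to match h3 tags by class while tolerating other attributes is sketched below. It assumes the class attribute is written exactly as class="value" with double quotes and a single class name. The names H3ByClass, forClass and innerContent are hypothetical, and the [^>]* parts are an addition to the pattern suggested here, not something from the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class H3ByClass {
    // Builds a pattern for <h3 ...> tags whose class attribute equals cssClass.
    // [^>]* allows other attributes before and after class without leaving the tag.
    static Pattern forClass(String cssClass) {
        String regex = "<h3\\b[^>]*\\bclass=\"" + Pattern.quote(cssClass) + "\"[^>]*>(.+?)</h3>";
        // DOTALL lets '.' match line terminators, so multi-line content is captured.
        return Pattern.compile(regex, Pattern.DOTALL);
    }

    // Returns the trimmed content of the first matching h3, or null if none matches.
    static String innerContent(String html, String cssClass) {
        Matcher m = forClass(cssClass).matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String html = "<h3 class=\"other\">skip</h3>\n<h3 class=\"media__title\">\nHello\n</h3>";
        System.out.println(innerContent(html, "media__title")); // prints Hello
    }
}
```

Pattern.quote keeps any regex metacharacters in the class name from being interpreted, and the content is in group 1 here because the attributes are no longer captured.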
