Java: extracting data with a regular expression


I have a solution that mostly works, but the regex splits the string into an empty string "" plus the two pieces I actually need.

String Result = "<ahref=https://blabla.com/Securities_regulation_in_the_United_States>Securities regulation in the United States</a> - Securities regulation in the United States is the field of U.S. law that covers transactions and other dealings with securities.";

// Split on every tag-like run such as <...> or </a>.
String[] Arr = Result.split("<[^>]*>");
for (String elem : Arr) {
    // println instead of printf: the text is data, not a format string.
    System.out.println(elem);
}

Arr[1] and Arr[2] are split correctly; I just can't get rid of Arr[0].

You can take the opposite approach and capture the content you want, using a regex like this:

(?s)(?:^|>)(.*?)(?:<|$)

If you only use split, you cannot avoid that empty string, especially since the regex match is not zero-length.

You can try removing the first match, which sits at the start of the input, and then splitting on the remaining matches, like this:

String[] Arr = Result.replaceFirst("^<[^>]+>", "").split("<[^>]+>");
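To make the difference concrete, here is a minimal sketch that runs the plain split and the replaceFirst-then-split variant side by side. The class name, method names, and the shortened sample input are assumptions made for illustration; only the two regex calls come from the thread.

```java
import java.util.Arrays;

class SplitWithoutLeadingEmpty {
    // Pattern from the answer: matches a tag-like run such as <...> or </a>.
    static final String TAG = "<[^>]+>";

    // Plain split: a non-zero-length match at index 0 leaves "" as element 0.
    static String[] plainSplit(String input) {
        return input.split(TAG);
    }

    // Strip one leading tag first, then split on the remaining tags.
    static String[] splitAfterStrippingLeadingTag(String input) {
        return input.replaceFirst("^" + TAG, "").split(TAG);
    }

    public static void main(String[] args) {
        // Hypothetical sample input, shorter than the one in the question.
        String input = "<a href=https://example.com/page>Link text</a> - trailing description";
        System.out.println(Arrays.toString(plainSplit(input)));
        System.out.println(Arrays.toString(splitAfterStrippingLeadingTag(input)));
    }
}
```

The first call returns three elements with an empty string in front; the second returns only the two text pieces.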

But in general you should ... For example, ...

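I had already tried that and it didn't work; I tried it again and got the same result.
@MattClark Yes, you are right. I changed the script to match the string instead of splitting it, but for some reason it still doesn't work: Arr[0]=""
@gb051 How about this regex: (?s)>(*)"(?:
Try parsing foobarbaz. With your current solution, foo will be ignored.
How can I remove the "-" in the second sentence?
You can post-process the results and remove the - at the start of each one. You can also add this - to the split delimiter, like split("<[^>]+>(\\s*-\\s*)?").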
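As a rough sketch of that last suggestion, the delimiter below is a tag optionally followed by a dash with surrounding whitespace, so the dash is dropped during the split itself. The class name, method name, and sample input are hypothetical; the delimiter pattern is the one quoted in the comment.

```java
import java.util.Arrays;

class SplitDroppingDash {
    // Delimiter: a tag, optionally followed by a dash with whitespace around it.
    static final String TAG_THEN_OPTIONAL_DASH = "<[^>]+>(\\s*-\\s*)?";

    static String[] parts(String input) {
        // Remove the leading tag first so the result has no empty first element.
        return input.replaceFirst("^<[^>]+>", "").split(TAG_THEN_OPTIONAL_DASH);
    }

    public static void main(String[] args) {
        // Hypothetical sample input.
        String input = "<a href=https://example.com/page>Link text</a> - trailing description";
        System.out.println(Arrays.toString(parts(input)));
    }
}
```

Because the dash is only consumed when it directly follows a tag, hyphens inside the text itself are left alone.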
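import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "<ahref=https://blabla.com/Securities_regulation_in_the_United_States>Securities regulation in the United States</a> - Securities regulation in the United States is the field of U.S. law that covers transactions and other dealings with securities.";

// (?s) lets . match newlines; group 1 is the text between the end of a tag
// (or the start of input) and the start of the next tag (or the end of input).
Pattern pattern = Pattern.compile("(?s)(?:^|>)(.*?)(?:<|$)");
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
    // When the input starts with a tag, the first group is the empty string.
    System.out.println("group 1: " + matcher.group(1));
}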
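Since the capturing regex still yields an empty first group when the input starts with a tag, one possible follow-up is to collect the groups into a list and skip the empty ones. This is only a sketch under that assumption; the class name, method name, and sample inputs are hypothetical, while the pattern is the one from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TextBetweenTags {
    // Pattern from the answer: text after a tag end (or input start)
    // up to the next tag start (or input end).
    private static final Pattern TEXT = Pattern.compile("(?s)(?:^|>)(.*?)(?:<|$)");

    static List<String> extract(String input) {
        List<String> parts = new ArrayList<>();
        Matcher matcher = TEXT.matcher(input);
        while (matcher.find()) {
            String text = matcher.group(1);
            // Skip the empty capture produced when a tag starts the input
            // or when two tags are adjacent.
            if (!text.isEmpty()) {
                parts.add(text);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs.
        System.out.println(extract("<a href=https://example.com/page>Link text</a> - trailing description"));
        System.out.println(extract("foo<b>bar</b>baz"));
    }
}
```

Text before the first tag is kept as well, because the pattern also anchors at the start of the input.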