Split a string into sentences with a Java regex while keeping the trailing punctuation (Java, Regex, String)


So, I have text like this:

String s = "The if-then-else statement provides a secondary path of execution when an "if" clause evaluates to false. You could use an if-then-else statement in the applyBrakes method to take some action if the brakes are applied when the bicycle is not in motion. In this case, the action is to simply print an error message stating that the bicycle has already stopped."
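I need to split this string into sentences while keeping the punctuation mark at the end of each sentence, so I can't simply use something like this: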

s.split("[\\.|!|\\?|:] ");
because if I use it, I get:

The if-then statement is the most basic of all the control flow statements
It tells your program to execute a certain section of code only if a particular test evaluates to true
For example, the Bicycle class could allow the brakes to decrease the bicycle's speed only if the bicycle is already in motion
One possible implementation of the applyBrakes method could be as follows:

I lose the punctuation at the end of each sentence, so how can I do this?

You can simply alternate whitespace with the end of input in your pattern:

//                                          | your original punctuation class, 
//                                          | no need for "|" between items
//                                          | (that would include "|" 
//                                          |  as a delimiter)
//                                          | nor escapes, now that I think of it
//                                          |         | look ahead for:
//                                          |         | either whitespace
//                                          |         |     | or end
System.out.println(Arrays.toString(s.split("[.!?:](?=\\s|$)")));
This will include the last chunk, and print (newlines added for clarity):

[The if-then-else statement provides a secondary path of execution when an "if" clause evaluates to false,  
You could use an if-then-else statement in the applyBrakes method to take some action if the brakes are applied when the bicycle is not in motion,  
In this case, the action is to simply print an error message stating that the bicycle has already stopped]
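
To make the behaviour of this pattern easier to check, here is a minimal, self-contained sketch that runs it on a short sample. The class name `LookaheadSplitDemo`, the helper `splitOnPunctuation`, and the sample sentence are illustrative assumptions, not part of the original answer. As in the output above, the punctuation character is the delimiter itself, so it is consumed and each later chunk keeps its leading space.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are chosen only for illustration.
class LookaheadSplitDemo {

    // Splits on '.', '!', '?' or ':' when followed by whitespace or end of input.
    // The lookahead (?=\s|$) is not consumed, but the punctuation itself is.
    static String[] splitOnPunctuation(String text) {
        return text.split("[.!?:](?=\\s|$)");
    }

    public static void main(String[] args) {
        String sample = "It stopped. Is it moving? No!";
        // prints: [It stopped,  Is it moving,  No]
        System.out.println(Arrays.toString(splitOnPunctuation(sample)));
    }
}
```

The trailing empty string produced by the match at the end of input is dropped by `String.split`, which is why the array has three elements rather than four.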

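First of all, your regex [\\.|!|\\?|:] means . or | or ! or | or ? or | or : because you used a character class [...]. You probably wanted \\.|!|\\?|: or, probably better, [.!?:] (I am not sure why you want : here, but that is your choice).

Next, if you want to split on whitespace while making sure that a ., !, ? or : character comes before it, but without consuming that preceding character, use the look-behind mechanism, like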

split("(?<=[.!?:])\\s")
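
As a minimal sketch of the look-behind split above, the following runnable class applies the same pattern to a short sample. The class name `LookbehindSplitDemo`, the helper `splitSentences`, and the sample text are assumptions made for illustration only.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are chosen only for illustration.
class LookbehindSplitDemo {

    // Splits on a single whitespace character that is preceded by '.', '!', '?' or ':'.
    // The look-behind is zero-width, so the punctuation stays attached to its sentence.
    static String[] splitSentences(String text) {
        return text.split("(?<=[.!?:])\\s");
    }

    public static void main(String[] args) {
        String sample = "It stopped. Is it moving? No!";
        // prints: [It stopped., Is it moving?, No!]
        System.out.println(Arrays.toString(splitSentences(sample)));
    }
}
```

Because the rule is purely textual, any period followed by whitespace is treated as a sentence end, so abbreviations such as "e.g. " are split as well.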
But the best approach is to use the proper tool for splitting sentences, which is BreakIterator; a usage example is given in a related question.
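
For reference, a minimal sketch of the BreakIterator approach might look like the following. It uses `java.text.BreakIterator.getSentenceInstance` from the standard library; the class name `BreakIteratorDemo`, the helper `splitSentences`, the choice of `Locale.US`, and the trimming of each sentence are assumptions for illustration, not taken from the linked example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; the names are chosen only for illustration.
class BreakIteratorDemo {

    // Walks the sentence boundaries reported by BreakIterator and collects
    // the text between consecutive boundaries, punctuation included.
    static List<String> splitSentences(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getSentenceInstance(locale);
        iterator.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            // Each segment usually carries its trailing whitespace; trim it (an assumption of this sketch).
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String sample = "It stopped. Is it moving? No!";
        for (String sentence : splitSentences(sample, Locale.US)) {
            System.out.println(sentence);
        }
    }
}
```

The locale is passed in explicitly because sentence-boundary rules are locale-dependent; `Locale.US` is only a placeholder here.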


The text above and the text below differ by more than a bad regex could explain. Did you paste the wrong part of your file? Also, I don't think the top expression should compile: the embedded quotes make the expression String s = x if y, which I don't think is valid Java.