Regex sentence parsing in Java with bulleted lists


Currently, I use the following regular expression to parse sentences in a document:

Pattern.compile("(?<=\\w[\\w\\)\\]](?<!Mrs?|Dr|Rev|Mr|Ms|vs|abd|ABD|Abd|resp|St|wt)[\\.\\?\\!\\:\\@]\\s)");
However, I would like text containing a numbered list to be split like this:

Mary had a little lamb (i.e. lamby pie).
Here are its properties: 
1. It has four feet  
2. It has fleece 
3. It is a mammal. 
It had white fleese. 
Her father, Mr. Lamb, lives on Mulbery St. in a little white house.
Is it possible to achieve this by modifying my existing regular expression?

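Right now, to accomplish this, I first do an initial split and then check each piece for bullets. The following code works, but I wonder whether there is a more elegant solution: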

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void doHomeMadeSentenceParser(String temp) {
    // First pass: split after sentence-ending punctuation, skipping known abbreviations.
    Pattern p = Pattern
            .compile("(?<=\\w[\\w\\)\\]](?<!Mrs?|Dr|Rev|Mr|Ms|vs|abd|ABD|Abd|resp|St|wt)[\\.\\?\\!\\:\\@]\\s)");
    String[] sentences = p.split(temp);
    List<String> psentences = new ArrayList<>();
    // Second pass: a bullet is a number followed by '.' or ')' and whitespace.
    Pattern p1 = Pattern.compile("\\b\\d+[.)]\\s");
    for (String sentence : sentences) {
        Matcher matcher = p1.matcher(sentence);
        int bstart = 0;
        boolean bulletfound = false;
        while (matcher.find()) {
            bulletfound = true;
            // Emit the text between the previous bullet (or the start) and this bullet.
            String bullet = sentence.substring(bstart, matcher.start());
            if (bullet.length() > 0) {
                psentences.add(bullet);
            }
            bstart = matcher.start();
        }
        if (bulletfound)
            psentences.add(sentence.substring(bstart));
        else
            psentences.add(sentence);
    }
    for (String s : psentences) {
        System.out.println(s.trim());
    }
}
I assume you are using the regular expression to find the positions at which to split the text. I don't know the exact regex for this, but could you look ahead for a number followed by a period (.)?
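The look-ahead idea above could be sketched roughly as follows. This is only one possible reading of the suggestion, not a tested drop-in replacement: the class name `BulletAwareSplitter`, the method `splitSentences`, and the extra `BEFORE_BULLET` alternative are assumptions made for illustration, while `SENTENCE_END` is the pattern from the question, unchanged. The second alternative matches the empty position that follows whitespace and precedes a number plus `.` or `)` plus whitespace, so a single `split` call also breaks before each bullet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class BulletAwareSplitter {
    // The sentence-boundary lookbehind from the question, unchanged.
    private static final String SENTENCE_END =
        "(?<=\\w[\\w\\)\\]](?<!Mrs?|Dr|Rev|Mr|Ms|vs|abd|ABD|Abd|resp|St|wt)[\\.\\?\\!\\:\\@]\\s)";

    // Assumed addition: a zero-width position after whitespace and
    // before a bullet such as "1. " or "2) ".
    private static final String BEFORE_BULLET = "(?<=\\s)(?=\\d+[.)]\\s)";

    private static final Pattern SPLITTER =
        Pattern.compile(SENTENCE_END + "|" + BEFORE_BULLET);

    // Splits the text at either kind of boundary and trims each piece.
    static List<String> splitSentences(String text) {
        List<String> result = new ArrayList<>();
        for (String piece : SPLITTER.split(text)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Mary had a little lamb (i.e. lamby pie). "
            + "Here are its properties: 1. It has four feet 2. It has fleece "
            + "3. It is a mammal. It had white fleese. "
            + "Her father, Mr. Lamb, lives on Mulbery St. in a little white house.";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence);
        }
    }
}
```

One caveat with this sketch: the bullet alternative also fires before any ordinary number that happens to end a sentence, for example "in 2008. Then", so it may need tightening (for instance limiting the digit count) depending on the documents being parsed.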

I think your regular expression got mangled when it was posted. Could you post it again?