Java: how do I get the words that contain a single dot (period) from a file?

Tags: java, regex, java.util.scanner

I want to find the table names and column names in the file named "query", shown below:

var query = "  SELECT accounts.name, SUM((COALESCE((jan_val_c),0)+  ";
query += "  COALESCE((feb_val_c),0)+ COALESCE((march_val_c),0)+ COALESCE((apr_val_c),0)+ ";
query += "  COALESCE((may_val_c),0)+ COALESCE((june_val_c),0)+ COALESCE((july_val_c),0)+   ";
query += "  COALESCE((aug_val_c),0)+ COALESCE((sept_val_c),0)+ COALESCE((oct_val_c),0)+   ";
query += "  COALESCE((nov_val_c),0)+ COALESCE((dec_val_c),0))) AS sales_plan,SUM((COALESCE((jan_actual_val_c),0)+   ";
query += "  COALESCE( (feb_actual_val_c),0)+ COALESCE( (march_actual_val_c),0)+ COALESCE( (apr_actual_val_c),0)+   ";
query += "  COALESCE( (may_actual_val_c),0)+ COALESCE( (june_actual_val_c),0)+ COALESCE( (july_actual_val_c),0)+   ";
query += "  COALESCE( (aug_actual_val_c),0)+ COALESCE( (sept_actual_val_c),0)+ COALESCE( (oct_actual_val_c),0)+   ";
query += "  COALESCE( (nov_actual_val_c),0)+ COALESCE( (dec_actual_val_c),0))) AS Actual_plan ,month_name_c,  ";
query += "   cl_sales_planning_month.year_c, cl_products.volume,cl_brands.name AS brand ,cl_therapies.name   ";
query += "   AS therapy,cl_products.name AS product, accounts.created_by,accounts.assigned_user_id ,   ";
query += "   DATE_FORMAT(STR_TO_DATE(CONCAT_WS('-',cl_sales_planning_month.month_name_c,  ";
query += "   cl_sales_planning_month.year_c),'%M-%Y'),'%b-%y' ) AS monthyear FROM cl_sales_planning_month   ";
query += "   LEFT JOIN accounts ON cl_sales_planning_month.account_id_c =accounts.id LEFT JOIN cl_products   ";
query += "   ON cl_sales_planning_month.cl_products_id_c = cl_products.id LEFT JOIN cl_brands ON   ";
query += "   cl_products.cl_brands_id_c=cl_brands.id LEFT JOIN cl_therapies ON   ";
query += "   cl_products.cl_therapies_id_c=cl_therapies.id WHERE   ";
 query += "            cl_sales_planning_month.month_name_c = MONTHNAME(CURRENT_DATE - INTERVAL 2 MONTH) AND  ";
      query += "            cl_sales_planning_month.year_c = YEAR(CURRENT_DATE - INTERVAL 2 MONTH)  AND";

query += "   cl_sales_planning_month.user_id_c IN ("+ params["childs"].value +") ";
query += "   GROUP BY therapy,monthyear   ";
query += "   ORDER BY STR_TO_DATE(cl_sales_planning_month.year_c,'%Y') ASC,   ";
query += "  STR_TO_DATE(cl_sales_planning_month.month_name_c,'%M') ASC, Actual_plan DESC   "; 

To do this, I wrote a Java program:

package com.waprau;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.regex.Pattern;

public class SeparateTableNamesColumnNames {
    public static void main(String[] args) {
        File file = new File("/home/waprau/Desktop/query");
        //Pattern = new Pattern("([^\\s]+(\\.(?i))$)");

        try {
            Scanner scanner = new Scanner(file);
            scanner.useDelimiter("\\s|=|,|\\)|\\(|this.|\\].");

            while(scanner.hasNext()){
                if(scanner.next().matches("(?<!\\.)\\b[a-zA-Z]\\w*\\.[a-zA-Z]\\w*\\b(?!\\.)"))
                 System.out.println(scanner.next());;
               }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}
To match a word that contains a dot, you can use:
"\\w+\\.\\w+"

\w matches letters, digits, and underscores.

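However, this also matches words that contain more than one period. You can refine it with lookarounds, which ensure there is no other period immediately before or after the matched word: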

"(?<!\\.)\\b\\w+\\.\\w+\\b(?!\\.)"

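To make the difference between the two patterns concrete, here is a minimal sketch that runs both against a short sample string. The class name DottedWordDemo, the helper findAll, and the sample text are illustrative assumptions, not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DottedWordDemo {
    // Plain pattern: word characters, a literal dot, word characters.
    static final Pattern SIMPLE = Pattern.compile("\\w+\\.\\w+");

    // Same pattern guarded by lookarounds so that no dot may precede or follow the match.
    static final Pattern GUARDED = Pattern.compile("(?<!\\.)\\b\\w+\\.\\w+\\b(?!\\.)");

    // Collects every substring of text that the pattern finds.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample text; "a.b.c" contains two dots.
        String sample = "accounts.name, a.b.c, cl_brands.id";
        System.out.println(findAll(SIMPLE, sample));  // [accounts.name, a.b, cl_brands.id]
        System.out.println(findAll(GUARDED, sample)); // [accounts.name, cl_brands.id]
    }
}
```

The plain pattern picks up the fragment a.b out of a.b.c, while the guarded pattern skips a.b.c entirely.
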
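The period must be escaped because an unescaped period means "any character" in a regular expression. Since this is not a normal string escape (like \n), a Java string literal needs two backslashes: \\.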
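
As a small illustration of why the escape matters, the sketch below compares an unescaped and an escaped dot using String.matches. The class name DotEscapeDemo, the two helper methods, and the test strings are assumptions made for this example.

```java
public class DotEscapeDemo {
    // Unescaped dot: "." stands for any single character.
    static boolean matchesUnescaped(String s) {
        return s.matches("\\w+.\\w+");
    }

    // Escaped dot: "\\." in the Java literal is the regex \. and matches only a period.
    static boolean matchesEscaped(String s) {
        return s.matches("\\w+\\.\\w+");
    }

    public static void main(String[] args) {
        System.out.println(matchesUnescaped("accounts-name")); // true
        System.out.println(matchesEscaped("accounts-name"));   // false
        System.out.println(matchesEscaped("accounts.name"));   // true
    }
}
```
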

Also, \\s

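Regardless of the regular expression (dan1111's answer seems to cover that), your Java code has a bug: scanner.next() fetches the next token, and because you call it twice, you never print the token that matched. Instead, you print the token that follows each match.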
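
The skipping behaviour can be reproduced in isolation. In this sketch the class name DoubleNextDemo and the sample text are hypothetical, the default whitespace delimiter is used, and contains(".") stands in for the regex purely to keep the example short. The extra hasNext() check in the buggy variant is also an addition: without it, a match on the last token would make the second next() call throw NoSuchElementException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DoubleNextDemo {
    // Mirrors the double call: test one token, then record a different one.
    static List<String> buggy(String text) {
        List<String> printed = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                // First next() consumes the token that is tested...
                if (scanner.next().contains(".") && scanner.hasNext()) {
                    // ...second next() consumes and records the token after it.
                    printed.add(scanner.next());
                }
            }
        }
        return printed;
    }

    // Calls next() once per iteration and reuses the stored token.
    static List<String> fixed(String text) {
        List<String> printed = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                String token = scanner.next();
                if (token.contains(".")) {
                    printed.add(token);
                }
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        String sample = "accounts.id AND cl_brands.name AS brand";
        System.out.println(buggy(sample)); // [AND, AS]
        System.out.println(fixed(sample)); // [accounts.id, cl_brands.name]
    }
}
```
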

If you change the loop as follows, it appears to print what you want:

String tmp;
while (scanner.hasNext()) {
    // Store next item so we can match AND print it.
    tmp = scanner.next();
    if (tmp.matches("(?<!\\.)\\b[a-zA-Z]\\w*\\.[a-zA-Z]\\w*\\b(?!\\.)"))
        System.out.println(tmp);
}
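
For completeness, here is one way the corrected loop might sit in a self-contained program. This is only a sketch under stated assumptions: the class name QualifiedNameScanner and the method extract are made up, the input is a String rather than the file from the question, and the delimiter is a simplified character class (whitespace, '=', ',', '(' and ')') rather than the one in the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class QualifiedNameScanner {
    // Same expression as in the loop above, compiled once and reused.
    private static final Pattern QUALIFIED =
            Pattern.compile("(?<!\\.)\\b[a-zA-Z]\\w*\\.[a-zA-Z]\\w*\\b(?!\\.)");

    // Returns every token of the form word.word found in text.
    static List<String> extract(String text) {
        List<String> names = new ArrayList<>();
        // try-with-resources closes the Scanner when the block exits.
        try (Scanner scanner = new Scanner(text)) {
            // Assumed, simplified delimiter: runs of whitespace, '=', ',', '(' or ')'.
            scanner.useDelimiter("[\\s=,()]+");
            while (scanner.hasNext()) {
                // Read each token exactly once, then both test and keep it.
                String token = scanner.next();
                if (QUALIFIED.matcher(token).matches()) {
                    names.add(token);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String sample = "LEFT JOIN accounts ON cl_sales_planning_month.account_id_c =accounts.id";
        System.out.println(extract(sample));
        // [cl_sales_planning_month.account_id_c, accounts.id]
    }
}
```

To read from a file instead, the Scanner could be constructed from a java.io.File as in the question, with the FileNotFoundException handled by the caller.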
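
@waprau, you may also need to double-escape each backslash. See Joop Eggen's answer. Sorry, I know regex but not Java.

Please look at my question; I have edited it to use double escapes. But still no luck. :-(

@waprau, please explain what "doesn't work" means. No matches? Wrong matches? An error message?

@dan1111. I have edited my question to show the result I get. Thanks.

@waprau, the problem is not the regex but the way you use the Scanner. Every time you call scanner.next() it advances to the next token. So the scanner.next() you are printing is always the token after the one the regex matched, not the match itself. Store the result of scanner.next() in a variable, then use that variable both to test the regex and to print the match.

Bingo! Thanks rvalvik. It seems to be working. Let me try it with more files, and if there are any problems I will post them. Thanks again.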