Java: Retrieving Prometheus metric names from Grafana expressions with regex


java regex grafana prometheus


I have tried many different regex patterns to extract the metric name, but without much success.

The expressions follow this pattern:

<method_name(> metric_name <{filter_condition}> <[time_duration]> <)> <by (some members)>
            ^------------------------------------------------------^
                          method_name(...) can be multiple

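Below is a demo of what I tried in Java. I cannot seem to find a pattern that extracts exactly the metric name from expressions of the form described above.

If possible, please share some ideas.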

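// Imports needed by the demo; the two methods below are meant to live inside a class.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String... args) {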
    String[] exprs = {"sum(log_query_task_cache_hit_rate_bucket)by(le)",
            "sum(log_search_by_service_total {service_name!~\"\"}) by (service_name, operator)",
            "log_request_total",
            " sum(delta(log_request_total[5m])) by (args, user_id)",
            "log_request_total{methodName=~\"getAppDynamicsGraphMetrics|getAppDynamicsMetrics\"}",
            "sum(delta(log_request_total{className=~\".*ProductDashboardController\",methodName=~\"getDashboardConfig|updateMaintainers|addQuickLink|deleteQuickLink|addDependentMiddleware|addDependentService|updateErrorThreshold\"}[5m])) by (user_id)",
            "sum(log_request_total{methodName=\"getInstanceNames\"}) by (user_id)",
            "sum(log_request_total{methodName=\"getVpcCardInfo\",user_id!~\"${user}\"}) by (envName)",
            "count_scalar(sum(log_query_request_total) by (user_id))",
            "avg(log_waiting_time_average) by (exported_tenant, exported_landscape)",
            "avg(task_processing_time_average{app=\"athena\"})",
            "avg(log_queue_time_average) by (log_type)",
            "sum(delta(product_dashboard_service_sum[2m]))",
            "ceil(delta(product_dashboard_service_count[5m]))]"
    };
    String[] expected = {
            "log_query_task_cache_hit_rate_bucket",
            "log_search_by_service_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_request_total",
            "log_query_request_total",
            "log_waiting_time_average",
            "task_processing_time_average",
            "log_queue_time_average",
            "product_dashboard_service_sum",
            "product_dashboard_service_count"
    };
    Pattern pattern = Pattern.compile(".*?\\(?([\\w|_]+)\\{?\\[?.*");
    testPattern(exprs, expected, pattern);
    pattern = Pattern.compile(".*\\(?([\\w|_]+)\\{?\\[?.*");
    testPattern(exprs, expected, pattern);
    pattern = Pattern.compile(".*?\\(?([\\w|_]+)\\{?\\[?.*");
    testPattern(exprs, expected, pattern);
}

private static void testPattern(String[] exprs, String[] expected, Pattern pattern) {
    System.out.println("\n********** Pattern Match Test *********\n");
    for (int i = 0; i < exprs.length; ++i) {
        String expr = exprs[i];
        Matcher matcher = pattern.matcher(expr);
        if (matcher.find()) {
            System.out.println("\nThe Original Expr: " + expr);
            System.out.println(String.format("Expected:\t %-40s Matched:\t %-40s", expected[i], matcher.group(1)));
        } else {
            System.out.println("expected: " + expected[i] + " not matched");
        }
    }
}

However, the following, even more complex combination made me switch to a more reliable approach:

Case # 8
input: sum(hue_mail_sent_attachment_bytes_total) by (app)  / sum(hue_mail_sent_mails_with_attachment_total) by (app)
Expected: [hue_mail_sent_attachment_bytes_total, hue_mail_sent_mails_with_attachment_total]

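This is now far more complex, even unpredictable, because the expr that users type in cannot be controlled.

So I reached the same goal with a simpler, more reliable solution:

1. Store the distinct metric names in the database first.
2. When an expr arrives, check it in memory with contains(String s).

One problem may remain: over-matching, when one metric name contains another.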
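The lookup described above, together with one possible way around the over-matching problem, might look like the sketch below. It swaps plain contains for a regex with lookarounds, so a known name only counts when it is not embedded in a longer identifier. The class name KnownMetricLookup, the method findMetrics, and the shorter name hue_mail_sent are hypothetical, and the list of known names stands in for whatever is loaded from the database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class KnownMetricLookup {

    // Returns the known metric names that occur in expr as whole identifiers.
    static List<String> findMetrics(String expr, List<String> knownNames) {
        List<String> found = new ArrayList<>();
        for (String name : knownNames) {
            // The lookarounds reject a hit that is preceded or followed by
            // another identifier character (letter, digit, underscore, colon).
            Pattern p = Pattern.compile(
                    "(?<![\\w:])" + Pattern.quote(name) + "(?![\\w:])");
            if (p.matcher(expr).find()) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumption: in the real setup these names come from the database.
        List<String> known = Arrays.asList(
                "hue_mail_sent_attachment_bytes_total",
                "hue_mail_sent_mails_with_attachment_total",
                "hue_mail_sent"); // hypothetical name that is a prefix of the others
        String expr = "sum(hue_mail_sent_attachment_bytes_total) by (app)  / "
                + "sum(hue_mail_sent_mails_with_attachment_total) by (app)";
        System.out.println(findMetrics(expr, known));
    }
}
```

This sketch would still report a label name or a by (...) member that happens to equal a known metric name, so it reduces over-matching rather than ruling it out.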
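This regex captures the target in group 1:
^(?:\w+\()*(\w+)

In Java, to get the target:
String metricName = input.replaceAll("^(?:\\w+\\()*(\\w+).*", "$1");

The trailing .* consumes the rest of the expression, so only group 1 remains after the replacement.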


This is too complex for a regex. For something this complex you probably need a lexer and an AST generator.

Wow, thank you very much, @Bohemian. It is so concise and clean. ^(?:\w+\()* matches all the method_name( repetitions (zero or more), and right after them comes the metric name I am searching for.

Nice solution ;) I still needed a small modification to tolerate extra whitespace so that it always works. Thanks for your help, @Bohemian: ^(?:\s*\w+\s*\(\s*)*\s*(\w+)
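A minimal sketch of how the whitespace-tolerant pattern from the last comment could be applied with a Matcher rather than replaceAll, so that a non-matching input yields null instead of the unchanged string. The class name MetricNameExtractor and the method extract are hypothetical. The sample expressions are taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetricNameExtractor {

    // Zero or more "function(" prefixes (with optional whitespace),
    // followed by the metric name captured in group 1.
    private static final Pattern METRIC =
            Pattern.compile("^(?:\\s*\\w+\\s*\\(\\s*)*\\s*(\\w+)");

    // Returns the first metric name, or null if the expression does not match.
    static String extract(String expr) {
        Matcher m = METRIC.matcher(expr);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] exprs = {
                "sum(log_query_task_cache_hit_rate_bucket)by(le)",
                "sum(log_search_by_service_total {service_name!~\"\"}) by (service_name, operator)",
                "log_request_total",
                " sum(delta(log_request_total[5m])) by (args, user_id)",
                "count_scalar(sum(log_query_request_total) by (user_id))"
        };
        for (String expr : exprs) {
            System.out.println(extract(expr) + "  <-  " + expr);
        }
    }
}
```

Because the pattern is anchored at the start, it only returns the first metric of an expression. A binary expression such as the one in Case # 8 would still need a different approach, for example the lookup against known metric names.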