Grafana dashboard setup for Kubernetes Prometheus federation
I'm using Prometheus federation to pull metrics from multiple k8s clusters. It works fine, and now I want to create some dashboards in Grafana that can be filtered by tenant (cluster). I tried using variables, but what I don't understand is this: kube_pod_container_status_restarts_total carries the labels I specified under static_configs, while kube_node_spec_unschedulable does not.

So where does this difference come from, and what should I do about it? Also, what is the best-practice way to set up dashboards that can be filtered across multiple cluster names? Should I use relabeling?

Here is an example of a metric that does carry the tenant label:
kube_pod_container_status_restarts_total{app="kube-state-metrics",container="backup",....,tenant="022"}
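For the dashboard-filtering part of the question, a common approach (a sketch, not from the original post) is to define a Grafana query variable populated from the tenant label via label_values(), and then reference that variable in every panel query:

```
# Grafana dashboard variable (type: Query, data source: Prometheus)
#   Name:  tenant
#   Query: label_values(kube_pod_container_status_restarts_total, tenant)

# Panel query referencing the variable:
sum by (namespace) (kube_pod_container_status_restarts_total{tenant="$tenant"})
```

Note that this only works once every federated series reliably carries the tenant label, which is exactly the inconsistency described above.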
Prometheus server (central cluster)
prometheus.yml:
rule_files:
  - /etc/config/rules
  - /etc/config/alerts

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets:
          - localhost:9090

  - job_name: federation_012
    scrape_interval: 5m
    scrape_timeout: 1m
    honor_labels: true
    honor_timestamps: true
    metrics_path: /prometheus/federate
    params:
      'match[]':
        - '{job!=""}'
    scheme: https
    static_configs:
      - targets:
          - host
        labels:
          tenant: "012"
    tls_config:
      insecure_skip_verify: true

  - job_name: federation_022
    scrape_interval: 5m
    scrape_timeout: 1m
    honor_labels: true
    honor_timestamps: true
    metrics_path: /prometheus/federate
    params:
      'match[]':
        - '{job!=""}'
    scheme: https
    static_configs:
      - targets:
          - host
        labels:
          tenant: "022"
    tls_config:
      insecure_skip_verify: true
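To see which federated series actually picked up the tenant label, a quick check in the Prometheus expression browser can help narrow down where the label is lost (a debugging sketch, not part of the original post):

```
# Count series of each metric grouped by tenant; series missing the
# label show up under an empty tenant value.
count by (tenant) (kube_pod_container_status_restarts_total)
count by (tenant) (kube_node_spec_unschedulable)

# Show only the series that lack the tenant label entirely:
kube_node_spec_unschedulable{tenant=""}
```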
Answer

Central Prometheus server:
scrape_configs:
  - job_name: federate
    scrape_interval: 5m
    scrape_timeout: 1m
    honor_labels: true
    honor_timestamps: true
    metrics_path: /prometheus/federate
    params:
      'match[]':
        - '{job!=""}'
    scheme: https
    static_configs:
      - targets:
          - source_host_012
          - source_host_022
    tls_config:
      insecure_skip_verify: true
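Federating with {job!=""} pulls every series from every source server. If that becomes too heavy, the match[] parameter accepts multiple, narrower selectors; the selectors below are purely illustrative:

```
params:
  'match[]':
    - '{job="kubernetes-pods"}'
    - '{__name__=~"kube_.*"}'
```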
Source Prometheus (tenant 012)

Source Prometheus (tenant 022)
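The source-server configs did not survive in this copy. Judging from the comments below ("first try to configure the source Prometheus servers as in the example above"), the intended setup was presumably to stamp the tenant on each source server itself via external_labels, which Prometheus attaches to federated output by default. This block is a reconstruction, not the original answer's text:

```
# prometheus.yml on the tenant-012 source server (hypothetical reconstruction)
global:
  external_labels:
    tenant: "012"
```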
If you still don't get the desired labels, try adding relabel_configs to your federate job and distinguish the metrics by source job name:
relabel_configs:
  - source_labels: [job]
    target_label: tenant
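Copying the job label verbatim would set tenant to values like federation_012. Since the job names in this question embed the tenant id, a regex capture can keep only that part when applied inside the federation_012 / federation_022 jobs shown earlier, where the job label still names the tenant at target-relabel time (a sketch based on the job names above):

```
relabel_configs:
  - source_labels: [job]
    regex: 'federation_(.+)'
    target_label: tenant
    replacement: '$1'
```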
Or derive it from the target's __address__ label:
relabel_configs:
  - source_labels: [__address__]
    target_label: tenant_host
PS: keep in mind that labels starting with __ are removed from the label set after target relabeling is completed.
Comments:

dmkvl: You may want to try honor_labels: true. If it doesn't affect the labels, try relabeling.

OP: @dmkvl I already honor labels.. where and how should I add this relabel_configs? I don't see it in your scrape_configs.

dmkvl: Can you share your full/updated scrape_configs?

OP: @dmkvl, I added the full config to the question.. Do I need to add relabel_configs on every Prometheus server, or on the federation server? And how do I add it so it distinguishes tenant names?

dmkvl: I posted an answer with possible federate and source/tenant scrape job configs. I think it should help.

OP: I'll try it and let you know the result, but as far as I understand, I need to add relabel_configs for the federation job, not the source Prometheus?

dmkvl: First try to configure the source Prometheus servers as in the example above; it seems to work correctly even without relabeling.

OP: I did as you said, but it doesn't work.. Now kube_pod_container_status_restarts_total doesn't have the tenant label either.

dmkvl: Can you give an example of your kube_pod_container_status_restarts_total metric with its labels?

OP: kube_pod_container_status_restarts_total{app="kube-state-metrics",container="backup",job="kubernetes-pods",kubernetes_io_hostname="022-kube-master01",kubernetes_namespace="kube-system",kubernetes_pod_name="kube-state-metrics-7d54b595f-r6m9k",namespace="database",node="022-kube-master01",pod="postgresql-postgresql-helm-backup-15876866400-7mczw",pod_template_hash="7d54b595f",ready="true"}