Kubernetes: How to get a pod's labels in Prometheus when scraping metrics from kube-state-metrics
I have a Prometheus pod running alongside my kube-state-metrics (KSM) pod. KSM collects metrics for all pods in all namespaces of the cluster. Prometheus simply scrapes KSM, so it does not need to scrape the individual pods.
When pods are deployed, their Deployment carries certain pod-related labels, as shown below. Two of them matter here, APP and TEAM:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    APP: AppABC
    TEAM: TeamABC
...
In Prometheus, my scrape configuration looks like this:
scrape_configs:
  - job_name: 'pod monitoring'
    honor_labels: true
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
...
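The problem is that when Prometheus scrapes kube-state-metrics, it overwrites the app label with kube-state-metrics. For example, the metric below actually belongs to an application named "AppABC", but Prometheus has rewritten its app label to kube-state-metrics: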
kube_pod_container_status_restarts_total{
  app="kube-state-metrics",
  container="appabccontainer",
  job="pod monitoring",
  namespace="test-namespace",
  pod="appabc-766cbcb68d-29smr"
}
Is there a way to scrape metrics from kube-state-metrics while keeping the APP and TEAM labels instead of having them overwritten?
Edit: I figured it out.
My problem: my Deployments and pods define certain labels (APP, TEAM). kube-state-metrics gets these from the Kubernetes API, but when Prometheus scrapes kube-state-metrics, the scraped series do not carry those labels.
My goal: expose these labels to Prometheus.
My solution: PromQL lets you join series with a grouping modifier such as group_right. So in my prometheus-rules.yaml, I changed this:
  expr: kube_pod_status_phase{phase="Failed"} > 0
to this:
  expr: kube_pod_status_phase{phase="Failed"} * on (pod,namespace) group_right kube_pod_labels > 0
So my new alert rule looks like this:
- name: Pod_Failed
  rules:
    - alert: pod_failed
      expr: kube_pod_status_phase{phase="Failed"} * on (pod,namespace) group_right kube_pod_labels > 0
      labels:
        appname: '{{ $labels.label_APP }}' # This is what I wanted to capture
        teamname: '{{ $labels.label_TEAM }}' # This is what I wanted to capture
      annotations:
        summary: 'Pod: {{ $labels.pod }} is down'
        description: 'Pod: {{ $labels.pod }} is down in {{ $labels.namespace }} namespace.'
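A possible variation on this rule, offered as a sketch and not as the author's method: with group_right and no label list, the result carries the labels of the kube_pod_labels series, so the phase label is dropped. Using group_left with an explicit list should instead keep the labels of kube_pod_status_phase and copy over only the labels named. The label names label_APP and label_TEAM are taken from the rule above; the group name and alert name below are hypothetical, and the join assumes exactly one kube_pod_labels series per (pod, namespace) pair.

```yaml
# Sketch only: the group and alert names are hypothetical.
groups:
  - name: Pod_Failed_explicit_labels
    rules:
      - alert: pod_failed_explicit_labels
        # group_left (...) keeps the labels of kube_pod_status_phase and copies
        # only the listed labels from the matching kube_pod_labels series.
        # kube_pod_labels has the value 1, so the product keeps the phase value.
        expr: |
          kube_pod_status_phase{phase="Failed"}
            * on (pod, namespace) group_left (label_APP, label_TEAM)
          kube_pod_labels > 0
        labels:
          appname: '{{ $labels.label_APP }}'
          teamname: '{{ $labels.label_TEAM }}'
        annotations:
          summary: 'Pod: {{ $labels.pod }} is down'
```

Listing the labels explicitly keeps the alert's label set small and predictable, at the cost of editing the rule whenever another pod label is needed.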
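Can you post the metrics exposed by kube-state-metrics? They should be available somewhere like kube-state-metrics:8080/metrics.
I think I figured it out. I posted my solution under the edit.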