K8S HPA: unable to get metrics from the external metrics API

Tags: kubernetes, prometheus, k3s

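I am trying to get Kafka topic lag into Prometheus, and from there to the API server, so that I can drive an HPA for my application from an external metric.

I am getting the error "no metrics returned from external metrics API":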

70m         Warning   FailedGetExternalMetric        horizontalpodautoscaler/kafkademo-hpa   unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API
66m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/kafkademo-hpa   invalid metrics (1 invalid out of 1), first error is: failed to get external metric kafka_lag_metric_sm0ke: unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API

This happens even though I can see the following output when querying the external metrics API:

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "kafka_lag_metric_sm0ke",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}

Here is the setup:

  • Kafka: v2.7.0
  • Prometheus: v2.26.0
  • Prometheus Adapter: v0.8.3

Prometheus Adapter values:

rules:
  external:
  - seriesQuery: 'kafka_consumergroup_group_lag{topic="prices"}'
    resources:
      template: <<.Resource>>
    name:
      as: "kafka_lag_metric_sm0ke"
    metricsQuery: 'avg by (topic) (round(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[1m])))'

HPA description:

kubectl describe hpa kafkademo-hpa 
Name:                                       kafkademo-hpa
Namespace:                                  default
Labels:                                     <none>
Annotations:                                <none>
CreationTimestamp:                          Sat, 17 Apr 2021 20:01:29 +0300
Reference:                                  Deployment/kafkademo
Metrics:                                    ( current / target )
  "kafka_lag_metric_sm0ke" (target value):  <unknown> / 5
Min replicas:                               3
Max replicas:                               12
Deployment pods:                            3 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetExternalMetric  the HPA was unable to compute the replica count: unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API
Events:
  Type     Reason                        Age                     From                       Message
  ----     ------                        ----                    ----                       -------
  Warning  FailedComputeMetricsReplicas  70m (x335 over 155m)    horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get external metric kafka_lag_metric_sm0ke: unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API
  Warning  FailedGetExternalMetric       2m30s (x366 over 155m)  horizontal-pod-autoscaler  unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API

When I query the metric directly, I can see that the "items" field is empty. What does this mean?

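What I don't seem to understand is the chain of events happening behind the scenes.

As far as I understand, this is how it goes. Is this correct?

  • prometheus-adapter queries Prometheus, executes the seriesQuery, evaluates the metricsQuery, and creates "kafka_lag_metric_sm0ke"
  • It registers an endpoint with the API server for external metrics
  • The API server periodically updates its stats based on that endpoint
  • The HPA checks "kafka_lag_metric_sm0ke" from the API server and scales according to the provided value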
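
The last step in the list, scaling on the returned value, can be sketched roughly as follows. This is a simplified illustration of the documented HPA ratio formula for a "Value" target; it ignores the controller's tolerance band and stabilization windows, and `desired_replicas` is a hypothetical helper name, not part of any Kubernetes client.

```python
import math


def desired_replicas(current_replicas, metric_value, target_value,
                     min_replicas, max_replicas):
    """Simplified HPA calculation for an external metric with a Value target.

    Assumption: no tolerance band and no stabilization window, which the
    real controller applies on top of this ratio.
    """
    raw = math.ceil(current_replicas * metric_value / target_value)
    # Clamp to the bounds configured on the HPA (Min replicas / Max replicas).
    return max(min_replicas, min(max_replicas, raw))


# Values taken from the HPA description above: target 5, min 3, max 12.
print(desired_replicas(3, 20, 5, 3, 12))  # lag of 20 with 3 replicas
print(desired_replicas(3, 0, 5, 3, 12))   # lag of 0 stays at the minimum
```

With the HPA shown above, a lag of 20 would push the deployment to the maximum of 12 replicas, while a lag of 0 keeps it at the minimum of 3. None of this can happen while the metric is reported as `<unknown>`.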

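I also don't seem to understand the significance of the namespace in all of this. I can see that the metric is namespaced. Does that mean there will be one metric per namespace? What is the point of that?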
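
As a small illustration of where the namespace shows up: in the external metrics API it is part of the request path, and the label selector from the HPA is passed as a query parameter. The sketch below only builds that path; `external_metric_path` is a hypothetical helper, and the result can be passed to `kubectl get --raw`.

```python
from urllib.parse import quote, urlencode


def external_metric_path(namespace, metric, selector=None):
    """Build the external.metrics.k8s.io path for one metric in one namespace."""
    path = (
        "/apis/external.metrics.k8s.io/v1beta1"
        f"/namespaces/{quote(namespace)}/{quote(metric)}"
    )
    if selector:
        # Label selectors are comma-separated key=value pairs.
        label_selector = ",".join(f"{k}={v}" for k, v in sorted(selector.items()))
        path += "?" + urlencode({"labelSelector": label_selector})
    return path


print(external_metric_path("default", "kafka_lag_metric_sm0ke", {"topic": "prices"}))
```

The HPA in this question lives in `default`, which is why the error messages refer to `default/kafka_lag_metric_sm0ke`.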

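As has long been my habit, I am answering my own question after asking it. Here is what was wrong with the above configuration.

The error was in the Prometheus Adapter YAML: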

rules:
  external:
    - seriesQuery: 'kafka_consumergroup_group_lag{topic="prices"}'
      resources:
        template: <<.Resource>>
      name:
        as: "kafka_lag_metric_sm0ke"
      metricsQuery: 'avg by (topic) (round(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[1m])))'
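
I'm still not sure why it works. I know that in this case a template placeholder gets replaced with something that does not produce a valid query, but I don't know what it is.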
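
To make the substitution easier to reason about, here is a rough simulation of how the `<<.Series>>` and `<<.LabelMatchers>>` placeholders in the metricsQuery are filled in. This is plain string replacement, not the adapter's actual code, and `expand_metrics_query` is a hypothetical name. The second call rests on an assumption that the post does not confirm: that for a namespaced external metric the adapter also adds a `namespace` matcher, which would select nothing if the Kafka exporter's series carry no such label.

```python
def expand_metrics_query(template, series, matchers):
    """Simulate the adapter's template expansion with plain string replacement."""
    label_matchers = ",".join(f'{key}="{value}"' for key, value in matchers.items())
    return (
        template
        .replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", label_matchers)
    )


TEMPLATE = "avg by (topic) (round(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[1m])))"
SERIES = "kafka_consumergroup_group_lag"

# Only the selector from the HPA.
print(expand_metrics_query(TEMPLATE, SERIES, {"topic": "prices"}))

# Assumption: a namespace matcher is injected as well.
print(expand_metrics_query(TEMPLATE, SERIES, {"topic": "prices", "namespace": "default"}))
```

Running each printed query directly in the Prometheus UI is a quick way to see which one returns data. The adapter's own logs at a higher verbosity should show the query it really sends.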

Before the fix, querying the metric directly returned an empty list:

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/default/kafka_lag_metric_sm0ke | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": []
}
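
An empty "items" list like this one means the metric name is known to the API but the query behind it returned no series, which is exactly what the HPA reports as "no metrics returned". A minimal sketch for checking such a response programmatically, assuming the JSON has been captured from `kubectl get --raw` (the function name `external_metric_values` is hypothetical):

```python
import json


def external_metric_values(raw):
    """Return (labels, value) pairs from an ExternalMetricValueList JSON string."""
    doc = json.loads(raw)
    if doc.get("kind") != "ExternalMetricValueList":
        raise ValueError(f"unexpected kind: {doc.get('kind')!r}")
    return [
        (item.get("metricLabels", {}), item["value"])
        for item in doc.get("items", [])
    ]


# The response shown above, as a single JSON string.
EMPTY_RESPONSE = (
    '{"kind": "ExternalMetricValueList", '
    '"apiVersion": "external.metrics.k8s.io/v1beta1", '
    '"metadata": {}, "items": []}'
)

values = external_metric_values(EMPTY_RESPONSE)
print("no data for this metric" if not values else values)
```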

After the fix, the same query returns a value:

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/default/kafka_lag_metric_sm0ke | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "kafka_lag_metric_sm0ke",
      "metricLabels": {
        "topic": "prices"
      },
      "timestamp": "2021-04-21T16:55:18Z",
      "value": "0"
    }
  ]
}