Kubernetes HPA doesn't scale up

Tags: kubernetes, autoscaling, amazon-eks, hpa

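This is very strange. I use an AWS EKS cluster, and my HPA worked fine yesterday and this morning. Since this afternoon, with nothing changed, my HPA has suddenly stopped working.

Here is my HPA: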

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my_hpa_name
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my_deployment_name
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: my_metrics # MUST match the metrics on custom_metrics API
        target:
          type: AverageValue
          averageValue: 5
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30 # window to consider waiting while scaling Up. default is 0s if empty.
    scaleDown:
      stabilizationWindowSeconds: 300 # window to consider waiting while scaling down. default is 300s if empty.
When I ran my tests, I tried many times, but the HPA never scaled up:

NAME                        REFERENCE                                   TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
xxxx-hpa   Deployment/xxxx-deployment   <unknown>/5   1         10        0          5s
xxxx-hpa   Deployment/xxxx-deployment   0/5           1         10        1          16s
xxxx-hpa   Deployment/xxxx-deployment   10/5          1         10        1          3m4s
xxxx-hpa   Deployment/xxxx-deployment   9/5           1         10        1          7m38s
xxxx-hpa   Deployment/xxxx-deployment   10/5          1         10        1          8m9s
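But now it no longer works. I tried adding a new HPA for a different metric, and that one works. Only this one fails. Strange.


New edit: this may be caused by the EKS cluster, because this is what I see: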

kubectl get nodes
NAME                                           STATUS                     ROLES    AGE   VERSION
ip-172-27-177-146.us-west-2.compute.internal   Ready                      <none>   14h   v1.18.9-eks-d1db3c
ip-172-27-183-31.us-west-2.compute.internal    Ready,SchedulingDisabled   <none>   15h   v1.18.9-eks-d1db3c

Does SchedulingDisabled mean the cluster does not have enough capacity for new pods?
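On the SchedulingDisabled question: that status means the node has been cordoned (marked unschedulable), so the scheduler will not place new pods on it, while pods already running there stay. It is not in itself a capacity reading, although a cordoned node does reduce the room available for new replicas. `kubectl uncordon <node-name>` re-enables scheduling on such a node. The sketch below is only an illustration: `cordoned_nodes` is a hypothetical helper name, and it assumes the default column layout of `kubectl get nodes` as pasted above.

```python
from __future__ import annotations


def cordoned_nodes(kubectl_output: str) -> list[str]:
    """Return node names whose STATUS column includes SchedulingDisabled.

    Hypothetical helper: assumes the default `kubectl get nodes` layout,
    where NAME is the first column and STATUS is the second.
    """
    cordoned = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        name, status = fields[0], fields[1]
        # STATUS is a comma-separated list such as "Ready,SchedulingDisabled".
        if "SchedulingDisabled" in status.split(","):
            cordoned.append(name)
    return cordoned


# The node listing from the question, used as sample input.
SAMPLE = """\
NAME                                           STATUS                     ROLES    AGE   VERSION
ip-172-27-177-146.us-west-2.compute.internal   Ready                      <none>   14h   v1.18.9-eks-d1db3c
ip-172-27-183-31.us-west-2.compute.internal    Ready,SchedulingDisabled   <none>   15h   v1.18.9-eks-d1db3c
"""

if __name__ == "__main__":
    for node in cordoned_nodes(SAMPLE):
        print(node)
```

With the listing from the question, only one of the two nodes accepts new pods, so new replicas can stay Pending if that node is full.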

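One thing that comes to mind is that your metrics server may not be running properly. Without data from it, the Horizontal Pod Autoscaler cannot work.

I solved the problem. It was an EKS cluster issue: the cluster was limited to at most 2 on-demand nodes and at most 2 spot nodes, so the node count had to be increased.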

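I think that if the metrics server were not working, the OP would not have resource statistics either.
Agreed. I am using custom metrics, and I have set up Prometheus and the adapter.
According to the docs, the queue size must be at least 10 to scale from 1 pod to 2 pods. Does it stay above that level long enough for the HPA to notice?
If it is 6/5, it will still scale up. See the same doc above: the skip only applies within a globally configurable tolerance, set by the --horizontal-pod-autoscaler-tolerance flag (default 0.1).
You are right, it is ceil[…] rather than floor[…]. So 6/5 is 1.2, which is above the tolerance threshold, and that rounds up to 2. I don't know why it is not scaling.
Is ip-172-27-183-31.us-west-2.compute.internal a master node? Can you check the logs of this node and describe it? If it is not a master, can you enable scheduling on it?
I checked with DevOps: we have two autoscaling groups for the nodes. group1 is the on-demand group, with a minimum of 1 node and a maximum of 2. group2 is the spot group, with a minimum of 1 node and a maximum of 2. If there is no activity, there should be only one node.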
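The ceil and tolerance reasoning in these comments can be checked with a few lines of Python. This is a simplified sketch of the documented replica calculation, desired = ceil(currentReplicas * currentValue / targetValue), not the controller's actual code: it ignores pod readiness, missing metrics, and the stabilization windows, and the function and parameter names are made up for illustration.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA replica calculation (illustrative names and defaults).

    The defaults mirror minReplicas/maxReplicas from the question and the
    default value of --horizontal-pod-autoscaler-tolerance.
    """
    ratio = current_value / target_value
    # Inside the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise round up (ceil, not floor) and clamp to the configured bounds.
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(1, 6, 5))    # ratio 1.2, above the tolerance
    print(desired_replicas(1, 10, 5))   # ratio 2.0
    print(desired_replicas(1, 5.4, 5))  # ratio 1.08, inside the tolerance
```

Under these assumptions an average of 6 against a target of 5 is already enough to go from 1 pod to 2, so a sustained 10/5 with the replica count stuck at 1 points to something outside the formula, such as pods that cannot be scheduled.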
HPA output showing the replica count increasing:
NAME           REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-hpa   Deployment/my-deployment   0/5       1         10        1          26s
my-hpa   Deployment/my-deployment   0/5       1         10        1          46s
my-hpa   Deployment/my-deployment   8/5       1         10        1          6m21s
my-hpa   Deployment/my-deployment   8/5       1         10        2          6m36s
my-hpa   Deployment/my-deployment   8/5       1         10        2          6m52s
my-hpa   Deployment/my-deployment   8/5       1         10        4          7m7s
my-hpa   Deployment/my-deployment   7/5       1         10        4          7m38s
my-hpa   Deployment/my-deployment   6750m/5   1         10        6          7m55s
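The `6750m/5` entry in the last row uses the Kubernetes quantity notation, where the `m` suffix means thousandths, so the current average is 6.75 against a target of 5. A minimal sketch for reading such TARGETS cells follows; `parse_quantity` and `parse_targets` are hypothetical helper names, and only plain numbers and the `m` suffix are handled, not the other quantity suffixes.

```python
def parse_quantity(text: str) -> float:
    """Convert a plain number or a milli-unit quantity such as '6750m' to a float.

    Illustrative only: other Kubernetes quantity suffixes are not handled.
    """
    if text.endswith("m"):
        return int(text[:-1]) / 1000
    return float(text)


def parse_targets(cell: str) -> tuple:
    """Split a TARGETS cell such as '6750m/5' into (current, target)."""
    current, target = cell.split("/")
    return parse_quantity(current), parse_quantity(target)


if __name__ == "__main__":
    current, target = parse_targets("6750m/5")
    print(current, target, current / target)
```

The resulting ratio of 1.35 is above the default 0.1 tolerance, so the HPA still considers the deployment above its target at that point.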