Kubernetes HPA doesn't scale up
This is very strange. I am using an AWS EKS cluster, and my HPA worked well yesterday and this morning. Since this afternoon, with nothing changed, my HPA has suddenly stopped working. Here is my HPA:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my_hpa_name
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my_deployment_name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: my_metrics # MUST match the metrics on custom_metrics API
      target:
        type: AverageValue
        averageValue: 5
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30 # window to consider waiting while scaling Up. default is 0s if empty.
    scaleDown:
      stabilizationWindowSeconds: 300 # window to consider waiting while scaling down. default is 300s if empty.
When I started testing, I made many attempts, but they all failed:
NAME       REFERENCE                    TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
xxxx-hpa   Deployment/xxxx-deployment   <unknown>/5   1         10        0          5s
xxxx-hpa   Deployment/xxxx-deployment   0/5           1         10        1          16s
xxxx-hpa   Deployment/xxxx-deployment   10/5          1         10        1          3m4s
xxxx-hpa   Deployment/xxxx-deployment   9/5           1         10        1          7m38s
xxxx-hpa   Deployment/xxxx-deployment   10/5          1         10        1          8m9s
But now it does not work. I tried adding a new HPA for a different metric, and that one works. Only this one fails. Weird.
New edit: This may be caused by the EKS cluster, because I see this:
kubectl get nodes
NAME                                           STATUS                     ROLES    AGE   VERSION
ip-172-27-177-146.us-west-2.compute.internal   Ready                      <none>   14h   v1.18.9-eks-d1db3c
ip-172-27-183-31.us-west-2.compute.internal    Ready,SchedulingDisabled   <none>   15h   v1.18.9-eks-d1db3c
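Does SchedulingDisabled mean the cluster does not have enough capacity for new pods?

One thing that comes to mind is that your metrics server might not be running correctly. Without data from the metrics server, the Horizontal Pod Autoscaler will not work.

Solved. It was an EKS cluster issue. I had a resource limit of at most 2 on-demand nodes and at most 2 spot nodes. The cluster needed more nodes.

I think that if the metrics server were not working, OP would not have resource stats either.

Agreed. I am using custom metrics, and I have Prometheus and the adapter set up.

According to the docs, the queue size must be at least 10 to scale from 1 pod to 2 pods. Does it stay above that level long enough for the HPA to notice?

If it is 6/5, it will still scale up. See the same doc above: within a globally-configurable tolerance, from the --horizontal-pod-autoscaler-tolerance flag, which defaults to 0.1.

You are right, it is ceil[…] rather than floor[…]. So 6/5 is 1.2, which is above the tolerance threshold, and that rounds up to 2. I don't know why it does not scale.

Is ip-172-27-183-31.us-west-2.compute.internal a master node? Can you check this node's logs and describe it? If it is not a master, can you enable scheduling on it?

I checked with DevOps: we have two auto scaling groups for nodes. Group 1 is the on-demand group, with a minimum of 1 node and a maximum of 2 nodes. Group 2 is the spot group, with a minimum of 1 node and a maximum of 2 nodes. If there is no activity, there should be only one node there.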
HPA output from a run where scaling worked (replicas grow from 1 to 6):
NAME     REFERENCE                  TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-hpa   Deployment/my-deployment   0/5       1         10        1          26s
my-hpa   Deployment/my-deployment   0/5       1         10        1          46s
my-hpa   Deployment/my-deployment   8/5       1         10        1          6m21s
my-hpa   Deployment/my-deployment   8/5       1         10        2          6m36s
my-hpa   Deployment/my-deployment   8/5       1         10        2          6m52s
my-hpa   Deployment/my-deployment   8/5       1         10        4          7m7s
my-hpa   Deployment/my-deployment   7/5       1         10        4          7m38s
my-hpa   Deployment/my-deployment   6750m/5   1         10        6          7m55s
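
The ceil-versus-floor discussion in the comments can be checked against the HPA output above. The sketch below is a simplified model, not the controller's actual code: it assumes the documented rule desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue) together with the 0.1 default tolerance, and it ignores stabilization windows, scaling policies, and pod readiness. The function name desired_replicas and its parameters are made up for illustration; the min/max defaults are taken from the HPA spec in the question.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     tolerance=0.1, min_replicas=1, max_replicas=10):
    """Simplified HPA replica calculation (illustrative assumption, see lead-in).

    current_value is the per-pod average of the metric, target_value is the
    averageValue from the HPA spec.
    """
    ratio = current_value / target_value
    # Within the tolerance band around 1.0 the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Round up, not down: any ratio above the tolerance adds at least one pod.
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds of the HPA.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Steps taken from the working run: 8/5 with 1 replica, 8/5 with 2, 7/5 with 4.
    print(desired_replicas(1, 8, 5))   # 2
    print(desired_replicas(2, 8, 5))   # 4
    print(desired_replicas(4, 7, 5))   # 6
    # The 6/5 case from the comments: ratio 1.2 is outside the tolerance.
    print(desired_replicas(1, 6, 5))   # 2
    # The failing run: 10/5 with 1 replica should also have asked for 2 pods.
    print(desired_replicas(1, 10, 5))  # 2
```

Under this model the failing run (10/5 at 1 replica) should have requested a second pod, which is consistent with the accepted explanation that the HPA was asking for more replicas but the cluster had no schedulable node capacity for them.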