Azure Container Registry image pulls are very slow for an image of about 150 MB
When deploying images to an AKS instance, image pulls from the ACR (Premium SKU) are very slow, even for "small" images of about 150 MB. Both the AKS resource and the ACR resource are in the Canada East region. Here is an example:
root@076fff2831b2:/tmp# kubectl describe pod application-service-59bcf96874-pvrmb
Name:           application-service-59bcf96874-pvrmb
Namespace:      default
Priority:       0
Node:           aks-41067869-1/10.255.13.163
Start Time:     Tue, 11 Feb 2020 18:15:53 -0500
Labels:         app.kubernetes.io/instance=application-service
                app.kubernetes.io/name=application-service
                pod-template-hash=59bcf96874
Annotations:    <none>
Status:         Running
IP:             10.255.13.175
IPs:            <none>
Controlled By:  ReplicaSet/application-service-59bcf96874
Containers:
  application-service:
    Container ID:   docker://0e86526a293d9055d482a09f043f0be68c594244fe4216f8fb190bc2caf6b65b
    Image:          myacr01.azurecr.io/microservices/application-service:0.0.6
    Image ID:       docker-pullable://myacr01.azurecr.io/microservices/application-service@sha256:cfbb3ffa7adc52da9cc0b8d7f78376076ea712025b59df8e406c559d369f4085
    Port:           3000/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 11 Feb 2020 18:35:00 -0500
      Finished:     Tue, 11 Feb 2020 18:35:00 -0500
    Ready:          False
    Restart Count:  5
    Liveness:       http-get https://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get https://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      PORT:       3000
      undefined:  undefined
    Mounts:
      /kvmnt from application-service-kv-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from application-service-token-9jk8j (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  application-service-kv-volume:
    Type:       FlexVolume (a generic volume resource that is provisioned/attached using an exec based plugin)
    Driver:     azure/kv
    FSType:
    SecretRef:  &LocalObjectReference{Name:kvcreds,}
    ReadOnly:   false
    Options:    map[keyvaultname:testIt2 keyvaultobjectnames:APPLICATION-SVC-SQLDB-CS;INGESTION-CONSUMER-EHB-CS;INGESTION-PRODUCER-EHB-CS keyvaultobjecttypes:secret;secret;secret tenantid:REMOVED usepodidentity:false usevmmanagedidentity:false]
  application-service-token-9jk8j:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  application-service-token-9jk8j
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                     Message
  ----     ------     ----                   ----                     -------
  Normal   Scheduled  20m                    default-scheduler        Successfully assigned default/application-service-59bcf96874-pvrmb to aks-41067869-1
  Normal   Pulling    20m                    kubelet, aks-41067869-1  Pulling image "myacr01.azurecr.io/microservices/application-service:0.0.6"
  Normal   Pulled     4m39s                  kubelet, aks-41067869-1  Successfully pulled image "myacr01.azurecr.io/microservices/application-service:0.0.6"
  Normal   Started    3m36s (x4 over 4m33s)  kubelet, aks-41067869-1  Started container application-service
  Warning  BackOff    3m4s (x11 over 4m30s)  kubelet, aks-41067869-1  Back-off restarting failed container
  Normal   Pulled     2m52s (x4 over 4m32s)  kubelet, aks-41067869-1  Container image "myacr01.azurecr.io/microservices/application-service:0.0.6" already present on machine
  Normal   Created    2m51s (x5 over 4m33s)  kubelet, aks-41067869-1  Created container application-service
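Some details have been modified or removed for privacy reasons.
Note, however, that the image from the ACR took about 15 minutes to go from the "Pulling" state to the "Pulled" state.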
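The 15-minute figure can be checked against the Events table, which lists "Pulling" at an age of 20m and the successful "Pulled" at 4m39s. Below is a minimal Python sketch of that arithmetic. The helper name `age_to_seconds` is made up for illustration, and the 150 MB size is the approximate figure from the question. Because kubectl rounds "20m", the result is only good to within about a minute.

```python
import re

def age_to_seconds(age: str) -> int:
    """Convert a kubectl-style age such as '4m39s' or '20m' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([dhms])", age)
    if not parts:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(number) * units[unit] for number, unit in parts)

pulling_age = "20m"    # Age column of the "Pulling" event
pulled_age = "4m39s"   # Age column of the "Successfully pulled" event
image_mb = 150         # approximate image size stated in the question

# An older event has a larger age, so the pull duration is the difference.
pull_seconds = age_to_seconds(pulling_age) - age_to_seconds(pulled_age)
throughput_mb_s = image_mb / pull_seconds

print(f"pull took about {pull_seconds // 60}m{pull_seconds % 60}s")
print(f"effective throughput about {throughput_mb_s:.2f} MB/s")
```

Under these assumptions the pull took roughly 15m21s, an effective rate of about 0.16 MB/s, which is very low for a registry and cluster in the same region.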
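This issue happens daily. The Azure Insights blade for the AKS instance shows that over the last 7 days the maximum node CPU utilization was 26% and node memory utilization was 14.32%.
How can we troubleshoot this further to determine the possible cause of the delay?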
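One possible first step is to measure pull times across all pods, to see whether the delay is tied to particular nodes or times of day. The sketch below is an illustration, not an official tool. It assumes the events were exported with `kubectl get events -o json` and that each event carries `reason`, `message`, `firstTimestamp` and `involvedObject.name`. The function name `pull_durations` and the sample timestamps are hypothetical.

```python
import json
from datetime import datetime

def pull_durations(events_json: str) -> dict:
    """Map pod name -> seconds between its first 'Pulling' event and
    its first 'Successfully pulled' event."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    started, finished = {}, {}
    for item in json.loads(events_json).get("items", []):
        stamp = item.get("firstTimestamp")
        if not stamp:
            continue  # skip events without a usable timestamp
        pod = item["involvedObject"]["name"]
        when = datetime.strptime(stamp, fmt)
        reason = item.get("reason")
        if reason == "Pulling":
            started[pod] = min(when, started.get(pod, when))
        elif reason == "Pulled" and item.get("message", "").startswith("Successfully pulled"):
            # "already present on machine" events are ignored: no pull happened
            finished[pod] = min(when, finished.get(pod, when))
    return {
        pod: (finished[pod] - started[pod]).total_seconds()
        for pod in started
        if pod in finished
    }

# Hypothetical sample mimicking the shape of `kubectl get events -o json`.
sample = json.dumps({"items": [
    {"reason": "Pulling", "firstTimestamp": "2020-02-11T23:15:54Z",
     "message": "Pulling image \"myacr01.azurecr.io/microservices/application-service:0.0.6\"",
     "involvedObject": {"name": "application-service-59bcf96874-pvrmb"}},
    {"reason": "Pulled", "firstTimestamp": "2020-02-11T23:31:15Z",
     "message": "Successfully pulled image \"myacr01.azurecr.io/microservices/application-service:0.0.6\"",
     "involvedObject": {"name": "application-service-59bcf96874-pvrmb"}},
]})

print(pull_durations(sample))
```

Against a real cluster, the output of `kubectl get events -o json` could be passed in place of `sample`. Events are only retained for a limited time, so the export has to happen soon after the slow pull.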
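Any help is greatly appreciated.
Thanks

I have heard a lot of people complain about insufficient capacity in Canada East. This may or may not be related to your case.

I'm afraid contacting Azure support is your only option: the problem doesn't seem to be on your side...

Thanks CSharpRocks and morgwai. I'll try Azure support.

Did you ever hear anything on this? Nine months later, I'm finding that pulling an image based on mcr.microsoft.com/dotnet/framework takes more than 15 minutes. This makes cluster scale-out very slow to respond to node exhaustion: the nodes don't seem to share a container cache, so each one has to pull the image before it can start the container.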