Google Cloud Platform: timeouts behind Cloud NAT


I seem to be hitting a problem similar to an existing question. When building a Docker image with GitLab Runner on GCP, I get the following timeout:

Put https://registry.gitlab.com/v2/[redacted-repo]: dial tcp 35.227.35.254:443: i/o timeout

At the same moment, my Google Cloud NAT gateway produces the following log entry:

{
   "insertId": "rh7b2jfleq0wx",
   "jsonPayload": {
     "allocation_status": "DROPPED",
     "endpoint": {
       "project_id": "gitlab-autoscale-runners",
       "vm_name": "runner-5dblbjek-auto-scale-runner-1589446683-0b220f90",
       "region": "europe-west4",
       "zone": "europe-west4-b"
     },
     "connection": {
       "protocol": 6,
       "src_port": 42446,
       "src_ip": "some-ip",
       "dest_ip": "some-ip",
       "dest_port": 443
     },
     "vpc": {
       "vpc_name": "default",
       "subnetwork_name": "default",
       "project_id": "gitlab-autoscale-runners"
     },
     "gateway_identifiers": {
       "gateway_name": "gitlab-runner-gateway",
       "router_name": "gitlab-runner-router",
       "region": "europe-west4"
     }
   },
   "resource": {
     "type": "nat_gateway",
     "labels": {
       "region": "europe-west4",
       "router_id": "7964886332834186727",
       "gateway_name": "gitlab-runner-gateway",
       "project_id": "gitlab-autoscale-runners"
     }
   },
   "timestamp": "2020-05-14T10:17:55.195614735Z",
   "labels": {
     "nat.googleapis.com/nat_ip": "",
     "nat.googleapis.com/instance_name": "runner-5dblbjek-auto-scale-runner-1589446683-0b220f90",
     "nat.googleapis.com/network_name": "default",
     "nat.googleapis.com/subnetwork_name": "default",
     "nat.googleapis.com/router_name": "gitlab-runner-router",
     "nat.googleapis.com/instance_zone": "europe-west4-b"
   },
   "logName": "projects/gitlab-autoscale-runners/logs/compute.googleapis.com%2Fnat_flows",
   "receiveTimestamp": "2020-05-14T10:18:00.422135520Z"
 }

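The question mentioned above suggests that the cause is NAT port exhaustion. I have checked with the gcloud CLI and this does not appear to be the problem in our case; see below: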

$ gcloud compute routers get-nat-mapping-info gitlab-runner-router
---
instanceName: runner-5dblbjek-auto-scale-runner-1589446683-0b220f90
interfaceNatMappings:
- natIpPortRanges:
  - some-id:1024-1055
  numTotalDrainNatPorts: 0
  numTotalNatPorts: 32
  sourceAliasIpRange: ''
  sourceVirtualIp: some-ip
- natIpPortRanges:
  - some-ip:32768-32799
  numTotalDrainNatPorts: 0
  numTotalNatPorts: 32
  sourceAliasIpRange: ''
  sourceVirtualIp: some-ip

I appear to be using only 64 ports.

The Cloud Router reports the following status:

kind: compute#routerStatusResponse
result:
  natStatus:
  - autoAllocatedNatIps:
    - some-ip
    minExtraNatIpsNeeded: 0
    name: gitlab-runner-gateway
    numVmEndpointsWithNatMappings: 3
  network: https://www.googleapis.com/compute/v1/projects/gitlab-autoscale-runners/global/networks/default

When the build runs locally or on a shared GitLab runner, i.e. not behind NAT, the same Docker image builds successfully.

How can I prevent these timeouts when building this Docker image behind Google Cloud NAT?

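Looking at the Cloud NAT output, it shows an allocation status of DROPPED. The recommended action is to raise the minimum ports per VM instance to a comfortably large value (4096 ports) and let it run for a few days. I suggest this number to reach a point where the drops stop. If it reduces the drops without eliminating them, you may need to keep increasing by a factor of 2 until there are no drops. If you see no DROPPED status at 4k ports, you can lower the value until you find a middle ground where you no longer receive DROPPED statuses but also do not keep an excessive number of NAT ports open. Right now you are running into the limit of 64 connections. Port usage represents the number of connections from a VM to a single unique destination (destination IP:port, protocol).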
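To judge whether a new port setting is helping while it "runs for a few days", it can be useful to count DROPPED entries per VM. The following is a minimal sketch, assuming the NAT log entries have already been exported and parsed into Python dictionaries with the same shape as the log entry in the question. The function name and the sample VM names are hypothetical.

```python
from collections import Counter

def count_dropped_by_vm(entries):
    """Count log entries whose allocation_status is DROPPED, keyed by VM name."""
    counts = Counter()
    for entry in entries:
        payload = entry.get("jsonPayload", {})
        if payload.get("allocation_status") == "DROPPED":
            vm_name = payload.get("endpoint", {}).get("vm_name", "unknown")
            counts[vm_name] += 1
    return counts

# Hand-written sample entries (not real logs), reduced to the fields used here.
sample_entries = [
    {"jsonPayload": {"allocation_status": "DROPPED", "endpoint": {"vm_name": "runner-a"}}},
    {"jsonPayload": {"allocation_status": "OK", "endpoint": {"vm_name": "runner-a"}}},
    {"jsonPayload": {"allocation_status": "DROPPED", "endpoint": {"vm_name": "runner-b"}}},
    {"jsonPayload": {"allocation_status": "DROPPED", "endpoint": {"vm_name": "runner-b"}}},
]

for vm, dropped in sorted(count_dropped_by_vm(sample_entries).items()):
    print(vm, dropped)
```

If the count stays at zero for every VM over several days, the current minimum is probably sufficient and can be lowered step by step as described above.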

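Looking at your configuration, each VM is currently allocated 64 ports, as you stated in your description. This means every VM instance in the VPC gets 64 NAT IP:port combinations for outbound connections, so it can hold at most 64 simultaneous connections to any one unique destination (destination IP address, destination port, and protocol). During your tests it looks like you are reaching that limit.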

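Each NAT IP assigned to a Cloud NAT gateway has 64,512 ports. With the default of 64 ports per VM, the gateway assigns a block of 64 NAT IP:port pairs to every VM in the subnets selected in the NAT gateway configuration. This means you can run 1,008 VMs with this configuration (64,512 divided by 64), but each VM can hold only 64 simultaneous connections to a unique destination. Depending on your application or use case, if you need more simultaneous connections, you need to increase the minimum ports per VM.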
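The capacity arithmetic can be sketched in a few lines. This is only an illustration of the division described above, assuming ports are handed out in whole per-VM blocks from each NAT IP; the function name is hypothetical.

```python
PORTS_PER_NAT_IP = 64512  # ports available per NAT IP, as stated above

def max_vms(min_ports_per_vm, nat_ips=1):
    """VMs that fit when each VM reserves min_ports_per_vm ports (whole blocks per IP)."""
    return (PORTS_PER_NAT_IP // min_ports_per_vm) * nat_ips

print(max_vms(64))           # 1008 VMs with the default of 64 ports per VM
print(max_vms(1024))         # 63 VMs with 1024 ports per VM
print(max_vms(1024, 2))      # 126 VMs once a second NAT IP is added
```

The trade-off is visible directly: every doubling of the per-VM minimum halves the number of VMs a single NAT IP can serve.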

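For example, with 1 NAT IP and a minimum of 1024 ports per VM, you can run 63 VMs, and each VM can hold 1024 simultaneous connections to a unique destination. If you need to run more VMs, you need to allocate more NAT IPs; adding a second IP doubles the NAT capacity. Because you chose automatic allocation of NAT IPs, NAT IPs are created and assigned automatically as you create more VMs in the subnet. In that case you only need to adjust the minimum ports per VM to match your traffic needs.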
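As a sketch of how that adjustment could be scripted, the helper below only assembles the `gcloud compute routers nats update` command line and prints it; it does not execute anything. The gateway, router, and region values are taken from the log output in the question, the value 4096 is the one suggested earlier, and the helper name is hypothetical.

```python
import shlex

def nat_update_command(gateway, router, region, min_ports_per_vm):
    """Build the gcloud argv that sets the minimum ports per VM on a NAT gateway."""
    return [
        "gcloud", "compute", "routers", "nats", "update", gateway,
        f"--router={router}",
        f"--region={region}",
        f"--min-ports-per-vm={min_ports_per_vm}",
    ]

cmd = nat_update_command(
    gateway="gitlab-runner-gateway",
    router="gitlab-runner-router",
    region="europe-west4",
    min_ports_per_vm=4096,
)
print(shlex.join(cmd))  # review the command before running it
```

To apply the change, the printed command can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` on a machine where gcloud is installed and authenticated.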

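Note that once a connection terminates, the NAT gateway applies a 2-minute timer before the NAT IP:port can be reused [1], so keep the port configuration slightly above your peak traffic.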
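The effect of the 2-minute timer on sizing can be approximated as follows. This is a rough steady-state estimate and not a formula from the Cloud NAT documentation: it assumes connections to one destination are opened at a constant rate and that each port stays occupied for the connection's lifetime plus the 120-second reuse delay. The function name and the example numbers are hypothetical.

```python
import math

NAT_REUSE_DELAY_S = 120  # the 2-minute timer described above

def estimate_ports_needed(new_conns_per_sec, avg_conn_duration_s):
    """Approximate ports held at steady state for one destination (rate x hold time)."""
    hold_time_s = avg_conn_duration_s + NAT_REUSE_DELAY_S
    return math.ceil(new_conns_per_sec * hold_time_s)

# Example: 2 new connections per second to the same registry endpoint, 10 s each.
print(estimate_ports_needed(2, 10))
```

Under these assumptions even a modest connection rate exceeds 64 ports, because short-lived connections keep their ports reserved for the extra two minutes. Adding some headroom on top of the estimate follows the advice to stay slightly above peak traffic.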

For more details on the port calculation, see [2].

[1]

[2]


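Thanks for the thorough answer. Initial results seem to indicate success; if it keeps working over the next few days I will accept your answer.

Although your answer seemed to help a lot for the first two days, the problem appears to have returned since then. It looks as though something is not being released properly and keeps consuming ports.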