Kubernetes RabbitMQ nodes cannot discover each other and join the cluster

I am new to RabbitMQ and am trying to set up highly available queues using a StatefulSet. The tutorial I followed is:

After deploying the StatefulSet and the Service to Kubernetes, the nodes fail to discover each other and the pods go into the CrashLoopBackOff state. Peer discovery does not seem to work as expected, and the nodes cannot join the cluster.

My cluster nodes are rabbit@rabbitmq-0, rabbit@rabbitmq-1 and rabbit@rabbitmq-2.

$ kubectl exec -it rabbitmq-0 /bin/sh

/ # rabbitmqctl status
Status of node 'rabbit@rabbitmq-0'
Error: unable to connect to node 'rabbit@rabbitmq-0': nodedown

DIAGNOSTICS
===========

attempted to contact: ['rabbit@rabbitmq-0']

rabbit@rabbitmq-0:
  * connected to epmd (port 4369) on rabbitmq-0
  * epmd reports: node 'rabbit' not running at all
                  no other nodes on rabbitmq-0
  * suggestion: start the node

current node details:
- node name: 'rabbitmq-cli-22@rabbitmq-0'
- home dir: /var/lib/rabbitmq
- cookie hash: 5X3n5Gy+r4FL+M53FHwv3w==
rabbitmq.conf

[
  { rabbit, [
    { loopback_users, [ ] },
    { tcp_listeners, [ 5672 ] },
    { ssl_listeners, [ ] },
    { hipe_compile, false },
    { cluster_nodes, { [ rabbit@rabbitmq-0, rabbit@rabbitmq-1, rabbit@rabbitmq-2 ], disc } },
    { ssl_listeners, [ 5671 ] },
    { ssl_options, [ {cacertfile,"/etc/rabbitmq/ca_certificate.pem"},
                     {certfile,"/etc/rabbitmq/server_certificate.pem"},
                     {keyfile,"/etc/rabbitmq/server_key.pem"},
                     {verify,verify_peer},
                     {versions, ['tlsv1.2', 'tlsv1.1']},
                     {fail_if_no_peer_cert,false} ] }
  ] },
  { rabbitmq_management, [ { listener, [
    { port, 15672 },
    { ssl, false }
  ] } ] }
].
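The cluster_nodes entry above refers to the peers by their short node names (rabbit@rabbitmq-0 and so on), so each of those hostnames has to resolve from inside every pod. A check along these lines shows whether that is the case (a debugging sketch, not part of my setup: it assumes the pod names and the development namespace used below, and that getent is available in the image):

# Does the short hostname resolve from inside another pod?
$ kubectl exec -it rabbitmq-1 -n development -- getent hosts rabbitmq-0
# Does the headless-service FQDN resolve?
$ kubectl exec -it rabbitmq-1 -n development -- getent hosts rabbitmq-0.rabbitmq.development.svc.cluster.local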
$ kubectl get statefulset rabbitmq -o yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: rabbitmq
  name: rabbitmq
  namespace: development
  resourceVersion: "119265565"
  selfLink: /apis/apps/v1/namespaces/development/statefulsets/rabbitmq
  uid: 10c2fabc-cbb3-11e7-8821-00505695519e
spec:
  podManagementPolicy: OrderedReady
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: rabbitmq
  serviceName: rabbitmq
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: rabbitmq
    spec:
      containers:
      - env:
        - name: RABBITMQ_ERLANG_COOKIE
          valueFrom:
            secretKeyRef:
              key: rabbitmq-erlang-cookie
              name: rabbitmq-erlang-cookie
        image: rabbitmq:1.0
        imagePullPolicy: IfNotPresent
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - |
                if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
                  sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/" /etc/resolv.conf > /etc/resolv.conf.new;
                  cat /etc/resolv.conf.new > /etc/resolv.conf;
                  rm /etc/resolv.conf.new;
                fi; until rabbitmqctl node_health_check; do sleep 1; done; if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
                  rabbitmqctl stop_app;
                  rabbitmqctl join_cluster rabbit@rabbitmq-0;
                  rabbitmqctl start_app;
                fi; rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
        name: rabbitmq
        ports:
        - containerPort: 5672
          protocol: TCP
        - containerPort: 5671
          protocol: TCP
        - containerPort: 15672
          protocol: TCP
        - containerPort: 25672
          protocol: TCP
        - containerPort: 4369
          protocol: TCP
        resources:
          limits:
            cpu: 400m
            memory: 2Gi
          requests:
            cpu: 200m
            memory: 1Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/rabbitmq
          name: rabbitmq-persistent-data-storage
        - mountPath: /etc/rabbitmq
          name: rabbitmq-config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 10
      volumes:
      - name: rabbitmq-config
        secret:
          defaultMode: 420
          secretName: rabbitmq-config
  updateStrategy:
    type: OnDelete
  volumeClaimTemplates:
  - metadata:
      creationTimestamp: null
      name: rabbitmq-persistent-data-storage
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
    status:
      phase: Pending
status:
  currentReplicas: 1
  currentRevision: rabbitmq-4234207235
  observedGeneration: 1
  replicas: 1
  updateRevision: rabbitmq-4234207235
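The postStart hook in this StatefulSet is what joins every pod other than rabbitmq-0 to rabbit@rabbitmq-0. When clustering fails, the same steps can be run by hand from one of the other pods to see exactly where they break (a sketch using the pod names and namespace from the manifest above):

$ kubectl exec -it rabbitmq-1 -n development -- rabbitmqctl stop_app
$ kubectl exec -it rabbitmq-1 -n development -- rabbitmqctl join_cluster rabbit@rabbitmq-0
$ kubectl exec -it rabbitmq-1 -n development -- rabbitmqctl start_app
# Confirm that all three nodes are listed
$ kubectl exec -it rabbitmq-1 -n development -- rabbitmqctl cluster_status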
$ kubectl get service rabbitmq -o yaml

apiVersion: v1
kind: Service
metadata:
  labels:
    app: rabbitmq
  name: rabbitmq
  namespace: develop
  resourceVersion: "59968950"
  selfLink: /api/v1/namespaces/develop/services/rabbitmq
  uid: ced85a60-cbae-11e7-8821-00505695519e
spec:
  clusterIP: None
  ports:
  - name: tls-amqp
    port: 5671
    protocol: TCP
    targetPort: 5671
  - name: management
    port: 15672
    protocol: TCP
    targetPort: 15672
  selector:
    app: rabbitmq
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}    
$ kubectl describe pod rabbitmq-0

Name:           rabbitmq-0
Namespace:      development
Node:           node9/170.XX.X.Xx
Labels:         app=rabbitmq
                controller-revision-hash=rabbitmq-4234207235
Status:         Running
IP:             10.25.128.XX
Controlled By:  StatefulSet/rabbitmq
Containers:
  rabbitmq:
    Container ID:   docker://f60b06283d3974382a068ded54782b24de4b6da3203c05772a77c65d76aa2e2f
    Image:          rabbitmq:1.0
    Image ID:       rabbitmq@sha256:6245a81a1fc0fb
    Ports:          5672/TCP, 5671/TCP, 15672/TCP, 25672/TCP, 4369/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
    Ready:          False
    Restart Count:  104
    Limits:
      cpu:     400m
      memory:  2Gi
    Requests:
      cpu:     200m
      memory:  1Gi
    Environment:
      RABBITMQ_ERLANG_COOKIE:  <set to the key 'rabbitmq-erlang-cookie' in secret 'rabbitmq-erlang-cookie'>  Optional: false
    Mounts:
      /etc/rabbitmq from rabbitmq-config (rw)
      /var/lib/rabbitmq from rabbitmq-persistent-data-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lqbp6 (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  rabbitmq-persistent-data-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  rabbitmq-persistent-data-storage-rabbitmq-0
    ReadOnly:   false
  rabbitmq-config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  rabbitmq-config
    Optional:    false
  default-token-lqbp6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-lqbp6
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     <none>
Events:          <none>

This issue is caused by DNS resolution failing inside the pods. Without valid DNS records, the pods cannot reach each other.

To resolve it, create an additional service, or edit the existing one, so that it handles DNS resolution for these pods.

An additional headless service for DNS lookups can be created as follows:

kind: Service
apiVersion: v1
metadata:
  namespace: default
  name: rabbitmq
  labels:
    app: rabbitmq
    type: Service
spec:
  ports:
  - name: http
    protocol: TCP
    port: 15672
    targetPort: 15672
  - name: amqp
    protocol: TCP
    port: 5672
    targetPort: 5672
  selector:
    app: rabbitmq
  type: ClusterIP
  clusterIP: None

Here the Service spec declares type ClusterIP with clusterIP: None, i.e. a headless service. This is what lets the pods resolve each other over DNS.
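Once such a headless Service exists in the same namespace as the pods, you can verify from inside any pod that the peer hostnames resolve and that the nodes have formed a cluster (a sketch; the service name is the one used above, and the namespace should match wherever your StatefulSet actually runs):

# A records of the form <pod>.<service>.<namespace>.svc.cluster.local should now exist
$ kubectl exec -it rabbitmq-0 -- getent hosts rabbitmq-1.rabbitmq
# All three nodes should show up as running
$ kubectl exec -it rabbitmq-0 -- rabbitmqctl cluster_status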

Cheers,


Rishabh

Did you set the Erlang cookie?
@RomanRabinovich Yes, I set the Erlang cookie.
Why not try the link from the blog post you followed? RabbitMQ has an official Kubernetes plugin for peer discovery, as described in the article. I think your problem may be due to the RBAC policy needed to look up the other peers.
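Regarding the peer discovery plugin mentioned in that comment: on RabbitMQ 3.7 or later, the hand-rolled postStart clustering can be replaced by the official Kubernetes peer discovery plugin. Enabling it looks roughly like this (a sketch; it assumes a 3.7+ image and a ServiceAccount whose RBAC role may read the endpoints of the rabbitmq service, neither of which is part of the setup shown in the question):

# Enable the plugin (e.g. in the image build or via an enabled_plugins file)
rabbitmq-plugins enable rabbitmq_peer_discovery_k8s
# rabbitmq.conf then selects the backend:
#   cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
#   cluster_formation.k8s.host = kubernetes.default.svc.cluster.local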