Kubernetes - processing an unlimited number of work items


I need to fetch a work item from a work queue and then run a sequence of containers, one after another, to process it. This can be done using initContainers.

What is the recommended way to restart the flow to fetch the next work item?

  • Jobs seem ideal, but they don't appear to support an infinite/indefinite number of completions
  • Using a single Pod doesn't work, since initContainers are not restarted
  • I'd prefer to avoid the maintenance/learning overhead of systems such as Argo or similar batch frameworks

Thanks

The easiest way in this case is to use a CronJob. A CronJob runs Jobs on a schedule; for more information, see the CronJob documentation.

Below is an example (adapted from an existing example):
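The example itself did not survive in this copy of the answer; the following is a minimal sketch of what such a CronJob could look like (the schedule, names, and images are illustrative, not from the original):

```yaml
apiVersion: batch/v1beta1   # use batch/v1 on Kubernetes 1.21+
kind: CronJob
metadata:
  name: process-one-item
spec:
  schedule: "*/1 * * * *"     # at most once per minute
  concurrencyPolicy: Forbid   # do not start a new run while one is still active
  jobTemplate:
    spec:
      template:
        spec:
          volumes: [{name: shared, emptyDir: {}}]
          initContainers:
          - name: pop-from-queue
            image: redis
            command: ["sh","-c","redis-cli -h redis lpop job >/shared/item.txt"]
            volumeMounts: [{name: shared, mountPath: /shared}]
          containers:
          - name: process
            image: busybox
            command: ["sh","-c","echo working on `cat /shared/item.txt`"]
            volumeMounts: [{name: shared, mountPath: /shared}]
          restartPolicy: Never
```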

However, this solution has some limitations:

  • It cannot run more often than once per minute
  • If work items need to be processed one at a time, the item has to be popped in an InitContainer
  • CronJobs are only available in Kubernetes 1.8 and above
A Job should be used for processing a work queue. When using a work queue you should not set .spec.completions (or set it to null). In that case Pods will keep being created until one of them exits successfully. It is a little awkward to exit with a failure status from the (main) container on purpose, but this is the spec. Regardless of this setting, you can set .spec.parallelism to your liking; I've set it to 1 as it seems you don't need any parallelism.
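Spelled out as a Job spec fragment, those two fields look like this:

```yaml
spec:
  completions: null   # work-queue mode: Pods keep being created until one exits successfully
  parallelism: 1      # process one work item at a time
```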

In your question you didn't specify what you want to do when the work queue becomes empty, so I'll present two solutions: one that waits for new items (infinite), and one that ends the Job when the work queue becomes empty (a finite but indeterminate number of items).

Both examples use redis, but you can apply this pattern to your favorite queue. Note that the part that pops an item from the queue is not safe: if your Pod dies for some reason after popping an item, that item will remain unprocessed or not fully processed. For a proper solution, see the fine parallel processing with a work queue example in the Kubernetes documentation.
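As an aside (not part of the original answer), one way to make the pop less lossy is Redis's RPOPLPUSH pattern, which atomically moves the item to a "processing" list instead of deleting it, so a crashed Pod leaves the item recoverable. A sketch of such an initContainer, assuming the same redis Service as in the examples below:

```yaml
initContainers:
- name: pop-from-queue-safer
  image: redis
  # BRPOPLPUSH blocks until an item is available, then atomically moves it
  # from "job" to "processing"; a cleanup process can re-queue stale items.
  command: ["sh","-c","redis-cli -h redis brpoplpush job processing 0 >/shared/item.txt"]
  volumeMounts: [{name: shared, mountPath: /shared}]
```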

To implement the sequential steps on each work item I used initContainers. Note that this really is a primitive solution, but your options are limited if you don't want to use some framework to implement a proper pipeline.

It would be a problem if somebody wanted to try this out without deploying redis and the like first, so:

Redis
To test this you need to create at least a redis Pod and a Service. The example I used comes from the fine parallel processing work queue task. You can deploy them with:

kubectl apply -f https://rawgit.com/kubernetes/website/master/docs/tasks/job/fine-parallel-processing-work-queue/redis-pod.yaml
kubectl apply -f https://rawgit.com/kubernetes/website/master/docs/tasks/job/fine-parallel-processing-work-queue/redis-service.yaml
The rest of this solution expects that you have a Service named redis in the same namespace as the Job, that it requires no authentication, and that there is a Pod called redis-master.

Inserting items
To insert some items into the work queue, use this command (bash is needed for it to work):
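The actual command did not survive in this copy of the answer. As one illustrative alternative (the item names are made up), a throwaway Pod can push a few items onto the "job" list used by the examples below:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: push-items
spec:
  restartPolicy: Never
  containers:
  - name: push
    image: redis
    # pushes item-1 .. item-5 onto the "job" list
    command: ["sh","-c","for i in 1 2 3 4 5; do redis-cli -h redis rpush job item-$i; done"]
```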

Infinite version
This version waits if the queue is empty, so it will never complete.

apiVersion: batch/v1
kind: Job
metadata:
  name: primitive-pipeline-infinite
spec:
  parallelism: 1
  completions: null
  template:
    metadata:
      name: primitive-pipeline-infinite
    spec:
      volumes: [{name: shared, emptyDir: {}}]
      initContainers:
      - name: pop-from-queue-unsafe
        image: redis
        command: ["sh","-c","redis-cli -h redis blpop job 0 >/shared/item.txt"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-1
        image: busybox
        command: ["sh","-c","echo step-1 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-2
        image: busybox
        command: ["sh","-c","echo step-2 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-3
        image: busybox
        command: ["sh","-c","echo step-3 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      containers:
      - name: done
        image: busybox
        command: ["sh","-c","echo all done with `cat /shared/item.txt`; sleep 1; exit 1"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      restartPolicy: Never
Finite version
This version stops the Job if the queue is empty. Note that the pop init container checks whether the queue is empty; if it is, all subsequent init containers and the main container exit immediately with success. This is the mechanism that signals to Kubernetes that the Job is complete and no new Pods need to be created for it.

apiVersion: batch/v1
kind: Job
metadata:
  name: primitive-pipeline-finite
spec:
  parallelism: 1
  completions: null
  template:
    metadata:
      name: primitive-pipeline-finite
    spec:
      volumes: [{name: shared, emptyDir: {}}]
      initContainers:
      - name: pop-from-queue-unsafe
        image: redis
        command: ["sh","-c","redis-cli -h redis lpop job >/shared/item.txt; grep -q . /shared/item.txt || :>/shared/done.txt"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-1
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-1 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-2
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-2 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-3
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-3 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      containers:
      - name: done
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo all done with `cat /shared/item.txt`; sleep 1; exit 1"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      restartPolicy: Never

  • Thanks, that's not too bad, and .spec.concurrencyPolicy can be set to Forbid. The 1 minute limit is not ideal though.
  • Thanks for your answer. I'm after the infinite version, which waits for new items. This approach of setting restartPolicy to Never and then deliberately failing the Pod is new to me, but the resulting error states are not great: they interfere with monitoring and make it harder to spot actual errors.
  • Yeah, I don't like that behavior either. How many items per minute do you expect to process? I'm asking because if it's only, say, 10 items a minute, you could set completions: 1000000000 (1 billion), which would last about 190 years (1000000000/10/60/24/365). Then you could exit the "done" container successfully.
  • Actually the jobs are generated manually, so at most 1 per minute. This is basically what I ended up using: just a plain Job. I set completions to 10000, but it's good to know I can go much higher, since it's an int32. That will be plenty.
  • This doesn't support horizontal autoscaling of workers based on message queue metrics, right?
  • @connor.brinton Right; this is specifically for sequential processing (see the question).