Python Airflow KubernetesPodOperator times out on local MicroK8s


I'm trying to spin up a test pod with the KubernetesPodOperator. As the image I'm using Docker's hello-world example, which I pushed into the local registry of my MicroK8s installation:

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow.kubernetes.pod import Port
from airflow.utils.dates import days_ago
from datetime import timedelta

ports = [Port('http', 80)]

default_args = {
    'owner': 'user',
    'start_date': days_ago(5),
    'email': ['user@mail'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0
}

workflow = DAG(
    'kubernetes_helloworld',
    default_args=default_args,
    description='Our first DAG',
    schedule_interval=None,
)

op = DummyOperator(task_id='dummy', dag=workflow)

t1 = KubernetesPodOperator(
    dag=workflow,
    namespace='default',
    image='localhost:32000/hello-world:registry',
    name='pod2',
    task_id='pod2',
    is_delete_operator_pod=True,
    hostnetwork=False,
    get_logs=True,
    do_xcom_push=False,
    in_cluster=False,
    ports=ports,
    )

op >> t1
When I trigger the DAG, it keeps running and retries launching the pod indefinitely. This is the log output I get in Airflow:

Reading local file: /home/user/airflow/logs/kubernetes_helloworld/pod2/2021-03-17T16:25:11.142695+00:00/4.log
[2021-03-17 16:30:00,315] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: kubernetes_helloworld.pod2 2021-03-17T16:25:11.142695+00:00 [queued]>
[2021-03-17 16:30:00,319] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: kubernetes_helloworld.pod2 2021-03-17T16:25:11.142695+00:00 [queued]>
[2021-03-17 16:30:00,319] {taskinstance.py:1042} INFO - 
--------------------------------------------------------------------------------
[2021-03-17 16:30:00,320] {taskinstance.py:1043} INFO - Starting attempt 4 of 1
[2021-03-17 16:30:00,320] {taskinstance.py:1044} INFO - 
--------------------------------------------------------------------------------
[2021-03-17 16:30:00,330] {taskinstance.py:1063} INFO - Executing <Task(KubernetesPodOperator): pod2> on 2021-03-17T16:25:11.142695+00:00
[2021-03-17 16:30:00,332] {standard_task_runner.py:52} INFO - Started process 9021 to run task
[2021-03-17 16:30:00,335] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'kubernetes_helloworld', 'pod2', '2021-03-17T16:25:11.142695+00:00', '--job-id', '57', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/kubernetes_helloworld.py', '--cfg-path', '/tmp/tmp5ss4g6q4', '--error-file', '/tmp/tmp9t3l8emt']
[2021-03-17 16:30:00,336] {standard_task_runner.py:77} INFO - Job 57: Subtask pod2
[2021-03-17 16:30:00,357] {logging_mixin.py:104} INFO - Running <TaskInstance: kubernetes_helloworld.pod2 2021-03-17T16:25:11.142695+00:00 [running]> on host 05nclorenzvm01.internal.cloudapp.net
[2021-03-17 16:30:00,369] {taskinstance.py:1255} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_EMAIL=user
AIRFLOW_CTX_DAG_OWNER=user
AIRFLOW_CTX_DAG_ID=kubernetes_helloworld
AIRFLOW_CTX_TASK_ID=pod2
AIRFLOW_CTX_EXECUTION_DATE=2021-03-17T16:25:11.142695+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-03-17T16:25:11.142695+00:00
[2021-03-17 16:32:09,805] {connectionpool.py:751} WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f812fc23eb0>: Failed to establish a new connection: [Errno 110] Connection timed out')': /api/v1/namespaces/default/pods?labelSelector=dag_id%3Dkubernetes_helloworld%2Cexecution_date%3D2021-03-17T162511.1426950000-e549b02ea%2Ctask_id%3Dpod2
When I launch the pod in Kubernetes directly, without Airflow, it runs fine. What am I doing wrong?

I have tried the following:

  • Using a sleep command to keep the container from exiting
  • Using different images, e.g. pyspark
  • Reinstalling Airflow and MicroK8s

Airflow v2.0.1, MicroK8s v1.3.7, Python 3.8
Ubuntu 18.04 LTS

To answer your question, I assume you are running the task against a local microk8s cluster without a VM.

Airflow probably cannot connect to the K8s control plane to launch the pod. Try adding

cluster_context='microk8s'

To see which cluster context is in use, run `microk8s config` and redirect its output into a config file (in your project):

输出:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0.............
    server: https://127.0.0.1:16443
  name: microk8s-cluster
contexts:
- context:
    cluster: microk8s-cluster
    user: admin
  name: microk8s
current-context: microk8s
kind: Config
preferences: {}
users:
- name: admin
  user:
     token: SldHNFQ3ek9yUGh4TVhWN......................................
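
When the `[Errno 110] Connection timed out` error appears, it helps to first confirm that the `server:` address from this kubeconfig is actually reachable from the host Airflow runs on, because that is the endpoint the operator's retries are timing out against. A minimal stdlib-only sketch (the naive regex parse and the `reachable` helper are illustrative, not part of Airflow or MicroK8s):

```python
import re
import socket

def api_server_from_kubeconfig(text):
    """Pull the first `server:` URL out of a kubeconfig (naive regex parse)."""
    m = re.search(r"server:\s*(\S+)", text)
    return m.group(1) if m else None

def reachable(url, timeout=3):
    """Attempt a plain TCP connect to the host:port of the API server URL."""
    host, _, port = url.split("//", 1)[1].partition(":")
    try:
        socket.create_connection((host, int(port or 443)), timeout=timeout).close()
        return True
    except OSError:
        return False

sample = (
    "clusters:\n"
    "- cluster:\n"
    "    server: https://127.0.0.1:16443\n"
    "  name: microk8s-cluster\n"
)
server = api_server_from_kubeconfig(sample)
print(server)             # https://127.0.0.1:16443
print(reachable(server))  # True only if the API server is actually listening
```

If this check fails from the Airflow host but `kubectl` works from your shell, the problem is network/context configuration rather than the DAG itself.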

Unfortunately, I have not been able to solve the problem on MicroK8s yet.

But I was able to get the KubernetesPodOperator working in Airflow with minikube. The following code runs without any problems:

from airflow import DAG
from datetime import datetime, timedelta
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow import configuration as conf
from airflow.utils.dates import days_ago

default_args = {
    'owner': 'user',
    'start_date': days_ago(5),
    'email': ['user@airflow.de'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0
}

namespace = conf.get('kubernetes', 'NAMESPACE')

if namespace == 'default':
    # Airflow runs outside the cluster: use the local kubeconfig
    config_file = '/home/user/.kube/config'
    in_cluster = False
else:
    # Airflow runs inside the cluster: use the pod's service account
    in_cluster = True
    config_file = None

dag = DAG('example_kubernetes_pod',
          schedule_interval='@once',
          default_args=default_args)

with dag:
    k = KubernetesPodOperator(
        namespace=namespace,
        image="hello-world",
        labels={"foo": "bar"},
        name="airflow-test-pod",
        task_id="task-one",
        in_cluster=in_cluster, # if set to true, will look in the cluster, if false, looks for file
        cluster_context='minikube', # is ignored when in_cluster is set to True
        config_file=config_file,
        is_delete_operator_pod=True,
        get_logs=True)
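
The namespace check above can be factored into a small helper so the same DAG file targets minikube from outside the cluster and the pod's service account from inside; pointing it at MicroK8s instead would just mean passing `kube_context='microk8s'` and a `config_file` written from `microk8s config`. A sketch (`pod_operator_kwargs` is a hypothetical helper, not an Airflow API):

```python
def pod_operator_kwargs(namespace, kube_context='minikube',
                        config_file='/home/user/.kube/config'):
    """Return the connection kwargs for KubernetesPodOperator.

    In the example above, namespace == 'default' means Airflow runs
    outside the cluster and needs an explicit kubeconfig and context;
    any other namespace means it runs inside the cluster and can use
    the pod's service account.
    """
    if namespace == 'default':
        return {'in_cluster': False,
                'cluster_context': kube_context,
                'config_file': config_file}
    return {'in_cluster': True, 'cluster_context': None, 'config_file': None}

# Usage inside the DAG above (sketch):
# k = KubernetesPodOperator(namespace=namespace, image="hello-world",
#                           name="airflow-test-pod", task_id="task-one",
#                           **pod_operator_kwargs(namespace))
```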

Comments:

  • Did you enable the microk8s dns addon? Which version are you on? Are you using Astronomer + Airflow?
  • I have not enabled the dns addon… is it necessary? I am not using Astronomer either. All versions are at the end of my post.
  • The dns addon deploys CoreDNS, which provides address resolution for Kubernetes. Other addons usually depend on this service, so enabling it is recommended.
  • Thanks for your answer, Adil. I just tried it with the cluster_context attribute added, but unfortunately it made no difference. I still get the "Failed to establish a new connection: [Errno 110] Connection timed out" exception. Apparently Airflow cannot reach MicroK8s at all. Do you see any other way to solve this? I get the same error as in this post: unfortunately I don't know how to apply the fix in the context of Airflow. Any ideas?
  • As long as I don't know your Airflow installation, it is hard to pin down your problem. I strongly recommend installing Airflow via Astronomer. I have updated the answer and added the config file.
  • I tried exactly the same example as Astronomer in this post: it also references a config file, just like your answer, but it makes no difference. I will probably try reinstalling Airflow via Astronomer or inside Docker.