Python Airflow - unable to import Spark provider package: name 'client' is not defined
I have installed the Apache Spark provider on top of an existing Airflow 2.0.0 installation with:
pip install apache-airflow-providers-apache-spark
When starting the webserver, it fails to import the provider:
[2021-01-19 18:49:46,871] {providers_manager.py:279} WARNING - Exception when importing 'airflow.providers.apache.spark.hooks.spark_jdbc.SparkJDBCHook' from 'apache-airflow-providers-apache-spark' package: name 'client' is not defined
[2021-01-19 18:49:46,873] {providers_manager.py:279} WARNING - Exception when importing 'airflow.providers.apache.spark.hooks.spark_submit.SparkSubmitHook' from 'apache-airflow-providers-apache-spark' package: name 'client' is not defined
[2021-01-19 18:49:46,941] {providers_manager.py:279} WARNING - Exception when importing 'airflow.providers.apache.spark.hooks.spark_jdbc.SparkJDBCHook' from 'apache-airflow-providers-apache-spark' package: name 'client' is not defined
[2021-01-19 18:49:46,942] {providers_manager.py:279} WARNING - Exception when importing 'airflow.providers.apache.spark.hooks.spark_submit.SparkSubmitHook' from 'apache-airflow-providers-apache-spark' package: name 'client' is not defined
Any idea how to overcome this?
For the record, I am running Python 3.8.5 and pip 20.0.2 on Ubuntu 20.04.
Thanks.

Solution
I installed a
Possible cause
There seems to be a problem with how exceptions are caught in the spark_submit module. In airflow/providers/apache/spark/hooks/spark_submit.py, an ImportError is swallowed when the kubernetes client is not currently installed:
try:
    from airflow.kubernetes import kube_client
except ImportError:
    pass
However, the exception that actually causes the error is a NameError, because the client variable is not defined. Found in the Airflow scheduler logs:
Traceback (most recent call last):
airflow-scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 302, in _load_modules_from_file
airflow-scheduler | loader.exec_module(new_module)
airflow-scheduler | File "<frozen importlib._bootstrap_external>", line 783, in exec_module
airflow-scheduler | File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
airflow-scheduler | File "/opt/airflow/dags/spark_dag.py", line 4, in <module>
airflow-scheduler | from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator
airflow-scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/providers/apache/spark/operators/spark_submit.py", line 22, in <module>
airflow-scheduler | from airflow.providers.apache.spark.hooks.spark_submit import SparkSubmitHook
airflow-scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/providers/apache/spark/hooks/spark_submit.py", line 32, in <module>
airflow-scheduler | from airflow.kubernetes import kube_client
airflow-scheduler | File "/usr/local/lib/python3.8/site-packages/airflow/kubernetes/kube_client.py", line 101, in <module>
airflow-scheduler | ) -> client.CoreV1Api:
airflow-scheduler | NameError: name 'client' is not defined
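The failure mode above can be reproduced in miniature. The snippet below is a hedged sketch (the module source and function names are hypothetical, not Airflow's real code): it shows that executing a module whose top level references an undefined name raises NameError, which a bare `except ImportError:` does not catch, while broadening the clause to `except (ImportError, NameError):` lets the hook degrade gracefully.

```python
import types

# Hypothetical stand-in for kube_client.py (NOT Airflow's real source):
# its top level references 'client' without importing the kubernetes
# package, just like the "-> client.CoreV1Api" annotation in the traceback.
BROKEN_MODULE_SOURCE = "get_api = client.CoreV1Api  # 'client' was never imported"

def load_module(source, catch_name_error):
    """Mimic the provider's guarded import of kube_client."""
    module = types.ModuleType("kube_client_stub")
    caught = (ImportError, NameError) if catch_name_error else (ImportError,)
    try:
        exec(source, module.__dict__)
        return module
    except caught:
        return None  # degrade gracefully: hook loads without Kubernetes

# 'except ImportError' alone lets the NameError escape:
try:
    load_module(BROKEN_MODULE_SOURCE, catch_name_error=False)
except NameError as exc:
    print(exc)  # name 'client' is not defined

# Broadening the clause swallows it, so the provider would load anyway:
print(load_module(BROKEN_MODULE_SOURCE, catch_name_error=True))  # None
```

This mirrors the symptom exactly: the provider's guard only expects ImportError, so the NameError raised inside kube_client.py escapes and aborts the import.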
Most certainly it is related to the import from airflow.kubernetes import kube_client, even though I have installed the apache-airflow-providers-cncf-kubernetes pip package.
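As a quick diagnostic (a hedged sketch; run it inside the same virtualenv Airflow uses), the following checks whether the kubernetes Python package that kube_client.py depends on actually imports:

```python
# Diagnostic sketch: kube_client.py needs 'from kubernetes import client'
# to succeed; if it does not, the "-> client.CoreV1Api" annotation fails
# with NameError when the provider is loaded.
def kubernetes_client_available():
    try:
        from kubernetes import client  # provided by the 'kubernetes' pip package
        return hasattr(client, "CoreV1Api")
    except ImportError:
        return False

print(kubernetes_client_available())
```

If this prints False, installing the client library (for example `pip install kubernetes`, or reinstalling apache-airflow-providers-cncf-kubernetes so its dependencies resolve) is a plausible next step; the exact version pins depend on your Airflow version.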