Python: Google Cloud Composer, Airflow job does not recognize an installed PyPI package


I am exploring Airflow using Google Cloud Composer. Here is the DAG file:

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from datetime import datetime, timedelta

dag = DAG(
    'hello_world',
    description='Simple DAG',
    start_date=datetime.now() - timedelta(days=1),
    schedule_interval='@once'
)

hello = BashOperator(
    task_id='hello_world',
    bash_command='python3 /home/airflow/gcs/dags/dependencies/helper.py',
    dag=dag
)
It basically runs helper.py, which is located in the /dags/dependencies/ folder in Google Cloud Storage (the DAG bucket directory).

helper.py contains the following code:

from fastavro import writer
import io
import logging


def greetings():
    buffer = io.BytesIO()
    age = 24
    schema = {
        'doc': "cockroach",
        'name': "table",
        'namespace': "cockroach",
        'type': "record",
        'fields': [{'name': 'age', 'type': ['null', 'int']}]
    }
    writer(buffer, schema=schema, records=[{"age": 24}])
    logging.info("Hello {}".format(name))
    return "Hello {}".format(name)
It raises the error ModuleNotFoundError: No module named 'fastavro':

[2019-01-11 04:01:57,388] {base_task_runner.py:98} INFO - Subtask: [2019-01-11 04:01:57,386] {bash_operator.py:101} INFO - Traceback (most recent call last):
[2019-01-11 04:01:57,389] {base_task_runner.py:98} INFO - Subtask: [2019-01-11 04:01:57,388] {bash_operator.py:101} INFO - File "/home/airflow/gcs/dags/dependencies/helper.py", line 1, in <module>
[2019-01-11 04:01:57,389] {base_task_runner.py:98} INFO - Subtask: [2019-01-11 04:01:57,388] {bash_operator.py:101} INFO - from fastavro import writer
[2019-01-11 04:01:57,390] {base_task_runner.py:98} INFO - Subtask: [2019-01-11 04:01:57,389] {bash_operator.py:101} INFO - ModuleNotFoundError: No module named 'fastavro'
[2019-01-11 04:01:58,154] {base_task_runner.py:98} INFO - Subtask: [2019-01-11 04:01:58,152] {bash_operator.py:105} INFO - Command exited with return code 1
[2019-01-11 04:01:58,214] {base_task_runner.py:98} INFO - Subtask: Traceback (most recent call last):
[2019-01-11 04:01:58,214] {base_task_runner.py:98} INFO - Subtask:   File "/usr/local/bin/airflow", line 6, in <module>
[2019-01-11 04:01:58,214] {base_task_runner.py:98} INFO - Subtask:     exec(compile(open(__file__).read(), __file__, 'exec'))
[2019-01-11 04:01:58,215] {base_task_runner.py:98} INFO - Subtask:   File "/usr/local/lib/airflow/airflow/bin/airflow", line 27, in <module>
[2019-01-11 04:01:58,215] {base_task_runner.py:98} INFO - Subtask:     args.func(args)
[2019-01-11 04:01:58,215] {base_task_runner.py:98} INFO - Subtask:   File "/usr/local/lib/airflow/airflow/bin/cli.py", line 392, in run
[2019-01-11 04:01:58,215] {base_task_runner.py:98} INFO - Subtask:     pool=args.pool,
[2019-01-11 04:01:58,215] {base_task_runner.py:98} INFO - Subtask:   File "/usr/local/lib/airflow/airflow/utils/db.py", line 50, in wrapper
[2019-01-11 04:01:58,216] {base_task_runner.py:98} INFO - Subtask:     result = func(*args, **kwargs)
[2019-01-11 04:01:58,216] {base_task_runner.py:98} INFO - Subtask:   File "/usr/local/lib/airflow/airflow/models.py", line 1492, in _run_raw_task
[2019-01-11 04:01:58,216] {base_task_runner.py:98} INFO - Subtask:     result = task_copy.execute(context=context)
[2019-01-11 04:01:58,219] {base_task_runner.py:98} INFO - Subtask:   File "/usr/local/lib/airflow/airflow/operators/bash_operator.py", line 109, in execute
[2019-01-11 04:01:58,219] {base_task_runner.py:98} INFO - Subtask:     raise AirflowException("Bash command failed")
[2019-01-11 04:01:58,220] {base_task_runner.py:98} INFO - Subtask: airflow.exceptions.AirflowException: Bash command failed
However, I have already installed fastavro under the PyPI packages of the Google Composer environment.


Does anyone know how to solve this?

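I have solved it. It was mainly caused by a version conflict with Python 2 (the default version for Google Cloud Composer): the bash command runs python3, while the environment itself was on Python 2. So I created a new Google Cloud Composer environment with Python 3 (since once an environment is created, it is not possible to change its Python version :). That solved the problem.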
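For anyone hitting the same error, a quick way to confirm this kind of mismatch is to have the task print which interpreter it is running and whether that interpreter can see the package. This is only a minimal sketch: the function name describe_environment and the file name check_env.py are hypothetical, and it assumes the script is run with python3 (it uses importlib.util.find_spec, which needs Python 3.4 or later).

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report which interpreter is running and whether it can find module_name.

    module_name is expected to be a top-level package name such as 'fastavro'.
    """
    # find_spec returns None when a top-level module is not importable
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "module_found": spec is not None,
    }


if __name__ == "__main__":
    # 'fastavro' is the package from the question
    for key, value in describe_environment("fastavro").items():
        print("{}: {}".format(key, value))
```

Saving this next to helper.py (for example as check_env.py) and pointing the bash_command at it with python3 would show in the task log whether the python3 interpreter can import fastavro. If module_found is False even though the package appears in the environment's PyPI list, the package was most likely installed for a different Python version than the one the command uses.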