Parameters set to None when inheriting from BigQueryOperator in Apache Airflow 1.10.2 (Python 2.7)


In Python 2.7, I am upgrading from Airflow 1.9.0 to 1.10.2 and I'm running into a problem with airflow/contrib/operators/bigquery_operator.py, more precisely with the deprecation of the bql parameter in favor of sql.
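For context, the 1.10.2 constructor maps the deprecated bql onto sql and then rejects a missing query. The following is a simplified sketch of my understanding of that shim, not the actual Airflow source:

```python
import warnings

class BigQueryOperatorSketch(object):
    """Sketch (an assumption, not Airflow 1.10.2 source) of how
    BigQueryOperator.__init__ handles the bql -> sql deprecation."""

    def __init__(self, task_id, sql=None, bql=None):
        self.task_id = task_id
        # Prefer the new `sql` parameter; fall back to the deprecated `bql`.
        self.sql = sql if sql else bql
        if bql:
            warnings.warn('`bql` is deprecated, please use `sql`',
                          DeprecationWarning)
        if self.sql is None:
            # This check is what produces the TypeError in the traceback.
            raise TypeError('{} missing 1 required positional '
                            'argument: `sql`'.format(self.task_id))
```

Note that an empty string (bql='') passes the `is None` check, while an omitted or None query does not.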

I have a class hierarchy based on BigQueryOperator:

class BigQueryFromExternalSqlOperator(BigQueryOperator):
    template_fields = BigQueryOperator.template_fields + ('get_sql_kwargs',)

    def __init__(self, get_sql_func, get_sql_kwargs={}, *args, **kwargs):
        super(BigQueryFromExternalSqlOperator, self).__init__(bql='',  # /!\ problematic parameter
                                                              *args,
                                                              **kwargs)
        self.get_sql_func = get_sql_func
        self.get_sql_kwargs = get_sql_kwargs

    def get_sql(self):
        return self.get_sql_func(**self.get_sql_kwargs)

    def pre_execute(self, context):
        self.bql = self.get_sql()


class BigQueryToPartitionTableOperator(BigQueryFromExternalSqlOperator):
    template_fields = ('get_schema_kwargs',) + BigQueryFromExternalSqlOperator.template_fields
    template_ext = ('.sql',)

    def __init__(self, get_schema_func, get_schema_kwargs={}, *args, **kwargs):
        super(BigQueryToPartitionTableOperator, self).__init__(*args, **kwargs)
        self.hook = BigQueryTableHook(bigquery_conn_id=self.bigquery_conn_id,
                                      delegate_to=self.delegate_to)
        self.get_schema_func = get_schema_func
        self.get_schema_kwargs = get_schema_kwargs
        self.schema = None

One of my DAGs uses BigQueryToPartitionTableOperator. When I run airflow list_dags to check that the DAGs are parseable, I get:

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 374, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/home/airflow/gcs/dags/processing/dags/learning/clustering_activity/dag.py", line 37, in <module>
    "period": Variable.get("activity_clustering.period")
  File "/home/airflow/gcs/dags/processing/common/dags/inference_dag.py", line 215, in __enter__
    dataset_partitioned=self.dataset,
  File "/home/airflow/gcs/dags/processing/common/operators/big_query_operator.py", line 79, in __init__
    super(BigQueryShardedToPartitionedOperator, self).__init__(bql=None, *args, **kwargs)
  File "/usr/local/lib/airflow/airflow/utils/decorators.py", line 97, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/airflow/airflow/contrib/operators/bigquery_operator.py", line 176, in __init__
    'argument: `sql`'.format(self.task_id))
TypeError: inferred_to_partitioned missing 1 required positional argument: `sql`
Even though I set a default value for bql in BigQueryFromExternalSqlOperator (bql=''), I still get the same exception as above.

I don't know whether this is related to inheritance and default arguments when instantiating objects in Python,

or whether the apply_defaults decorator in airflow/utils/decorators.py is changing the arguments passed to BigQueryOperator's __init__ function.
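To rule out the plain-Python side: regardless of defaults or apply_defaults, an explicit keyword argument in a super().__init__ call always wins over the parent's default. A minimal sketch with hypothetical Parent/Child names:

```python
class Parent(object):
    def __init__(self, sql=None, bql=None):
        # Same fallback pattern as the operator: prefer sql, else bql.
        self.sql = sql if sql else bql

class Child(Parent):
    def __init__(self, *args, **kwargs):
        # Explicitly passing bql=None overrides any default Parent defines,
        # so self.sql ends up None unless sql is also passed.
        super(Child, self).__init__(bql=None, *args, **kwargs)
```

This matches what the traceback actually shows: the explicit bql=None in BigQueryShardedToPartitionedOperator's __init__ means a default set elsewhere in the hierarchy never takes effect.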

EDIT 1: here is how I call the operator:

class myDAG(DAG):

...
    def __enter__():
        ...
        # Save the input dataset in version-suffixed table in BQ
        extract_dataset = BigQueryToPartitionTableOperator(task_id='extract_dataset',
                                                           get_sql_func=self.get_sql,
                                                           get_schema_func=self.get_schema,
                                                           get_sql_kwargs=self.get_extract_dataset_sql_kwargs,
                                                           get_schema_kwargs=self.get_extracted_table_schema_kwargs,
                                                           destination_dataset_table='{}.{}'.format(
                                                               self.dataset,
                                                               self.extracted_table),
                                                           write_disposition='WRITE_TRUNCATE',
                                                           use_legacy_sql=False,
                                                           bigquery_conn_id=self.gcp_conn_id)


Thanks for adding the code snippet. If I understand correctly, you are not passing the sql parameter, which explains the error message TypeError: inferred_to_partitioned missing 1 required positional argument: `sql`.

Try to fix it this way:

1. Pass a non-empty sql attribute to your parent BigQueryOperator, just for debugging.
2. If the "missing 1 required positional argument: `sql`" error disappears after that, find a way to pass your query to BigQueryOperator's sql parameter, or, if you don't want to delegate query execution to BigQueryOperator, override the method that executes it. But if you don't need BigQueryOperator's execution at all, it would be simpler to get rid of this parent class.
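A runnable sketch of that suggestion; StubBigQueryOperator stands in for the real BigQueryOperator so the snippet is self-contained, and the placeholder query is an assumption:

```python
class StubBigQueryOperator(object):
    # Stand-in for BigQueryOperator: rejects a missing sql, as 1.10.2 does.
    def __init__(self, sql=None, **kwargs):
        if sql is None:
            raise TypeError('missing 1 required positional argument: `sql`')
        self.sql = sql

class BigQueryFromExternalSqlOperator(StubBigQueryOperator):
    def __init__(self, get_sql_func, get_sql_kwargs=None, **kwargs):
        # A non-empty placeholder satisfies the constructor check (step 1).
        super(BigQueryFromExternalSqlOperator, self).__init__(sql='SELECT 1',
                                                              **kwargs)
        self.get_sql_func = get_sql_func
        self.get_sql_kwargs = get_sql_kwargs or {}

    def pre_execute(self, context):
        # Swap in the real query just before execution (step 2).
        self.sql = self.get_sql_func(**self.get_sql_kwargs)
```

With the real BigQueryOperator the same shape applies: keep the placeholder in __init__ and assign self.sql in pre_execute instead of self.bql.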
Comments:

Hi, could you show how you use BigQueryToPartitionTableOperator in your DAG? You can replace any sensitive information with "xxxx". – bartosz25

@bartosz25 I added the part of the DAG code that uses BigQueryToPartitionTableOperator.

I wonder if you managed to debug the problem? In particular, the stack trace mentions super(BigQueryShardedToPartitionedOperator, self).__init__(bql=None, *args, **kwargs), which does not appear in the initial class hierarchy based on BigQueryOperator. – bartosz25

@bartosz25 Thanks for your help, I finally found the problem. I was looking at the wrong line. I changed the right one: super(BigQueryShardedToPartitionedOperator, self).__init__(bql='', *args, **kwargs), not the super(BigQueryFromExternalSqlOperator, self).__init__(bql='', *args, **kwargs) line shown above. Thanks again, your earlier hint (bql='' in BigQueryFromExternalSqlOperator) really helped solve the follow-up problem!

Applying the suggestion to the parent class:

class BigQueryFromExternalSqlOperator(BigQueryOperator):
    template_fields = BigQueryOperator.template_fields + ('get_sql_kwargs',)

    def __init__(self, get_sql_func, get_sql_kwargs={}, *args, **kwargs):
        super(BigQueryFromExternalSqlOperator, self).__init__(sql='SELECT ....',
                                                              *args,
                                                              **kwargs)