Airflow remote file sensor

I am trying to check whether a file matching a given pattern exists on a remote server, similar to the solution below.

I am using the SSHOperator with a bash command, as shown below.

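from airflow.contrib.operators.ssh_operator import SSHOperator

# Shell script run on the remote host: report whether the file exists.
SSH_Bash = """
    echo 'poking for files...'
    ls /home/files/test.txt
    if [ $? -eq 0 ]; then
        echo 'Found file'
    else
        echo 'failed to find'
    fi
    """

# `dag` is assumed to be defined earlier in the DAG file.
t1 = SSHOperator(
    ssh_conn_id='ssh_default',
    task_id='test_ssh_operator',
    command=SSH_Bash,
    dag=dag)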
It works, but it does not look like an optimal solution. Can someone help me find a better way than a Bash script to detect files on the remote server?
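
One likely reason the script feels unsatisfying: both branches of the `if` end in `echo`, so the script exits with status 0 either way, and as far as I can tell the SSHOperator task then succeeds even when the file is missing. Here is a minimal sketch of a command whose exit status reflects the result. It assumes Python 3 (for `shlex.quote`), and `build_check_command` is a hypothetical helper name, not part of Airflow.

```python
import shlex


def build_check_command(remote_path):
    """Build a shell command that exits 0 only if remote_path exists.

    The path is quoted so that spaces or shell metacharacters in it
    cannot break the command.
    """
    quoted = shlex.quote(remote_path)
    return (
        "if test -e {p}; then echo 'Found file'; "
        "else echo 'failed to find'; exit 1; fi"
    ).format(p=quoted)


# The resulting string could be passed as `command=` to the SSHOperator.
print(build_check_command("/home/files/test.txt"))
```

With a non-zero exit on a missing file, the task should fail (and can be retried) instead of silently reporting success. It is still a one-shot check rather than a sensor that keeps poking.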

I tried the SFTP sensor below.

import os
import re

import logging
from paramiko import SFTP_NO_SUCH_FILE
from airflow.contrib.hooks.sftp_hook import SFTPHook
from airflow.operators.sensors import BaseSensorOperator
from airflow.plugins_manager import AirflowPlugin
from airflow.utils.decorators import apply_defaults


class SFTPSensor(BaseSensorOperator):
    @apply_defaults
    def __init__(self, filepath,filepattern, sftp_conn_id='sftp_default', *args, **kwargs):
        super(SFTPSensor, self).__init__(*args, **kwargs)
        self.filepath = filepath
        self.filepattern = filepattern
        self.hook = SFTPHook(sftp_conn_id)

    def poke(self, context):
        full_path = self.filepath
        file_pattern = re.compile(self.filepattern)

        try:
            directory = os.listdir(self.hook.full_path)
            for files in directory:
                if not re.match(file_pattern, files):
                    self.log.info(files)
                    self.log.info(file_pattern)
                else:
                    context["task_instance"].xcom_push("file_name", files)
                    return True
            return False
        except IOError as e:
            if e.errno != SFTP_NO_SUCH_FILE:
                raise e
            return False

class SFTPSensorPlugin(AirflowPlugin):
    name = "sftp_sensor"
    sensors = [SFTPSensor]

But this always pokes the local machine instead of the remote machine. Can someone help me see where I went wrong?

I replaced the line

directory = os.listdir(self.hook.full_path)

with

directory = self.hook.list_directory(full_path)

Comment: Replace the line "directory = os.listdir(self.hook.full_path)" with "directory = self.hook.list_directory(full_path)". I have a question here: should I then create a process called SFTPSensor() or BaseSensorOperator()? I don't know how to implement this in code.
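
The reason the replacement matters is that `os.listdir` always reads the filesystem of the machine running the task, whereas `list_directory` goes through the hook's SFTP connection. Here is a minimal sketch of the matching logic built on that call. `find_first_match` and `FakeHook` are hypothetical names added for illustration; `FakeHook` only stands in for `SFTPHook` so the example runs without a server or an Airflow installation.

```python
import re


def find_first_match(hook, directory, pattern):
    """Return the first name in `directory` that matches `pattern`, else None.

    `hook` is any object exposing list_directory(path) -> list of names,
    which is the call the answer uses on SFTPHook.
    """
    regex = re.compile(pattern)
    for name in hook.list_directory(directory):
        # re.match anchors at the start of the name only, as in the sensor.
        if regex.match(name):
            return name
    return None


class FakeHook(object):
    """Hypothetical stand-in for SFTPHook, backed by an in-memory listing."""

    def __init__(self, listing):
        self.listing = listing

    def list_directory(self, path):
        # Unknown paths yield an empty listing in this stand-in.
        return self.listing.get(path, [])


hook = FakeHook({"/home/files": ["readme.md", "test.txt", "test_2.txt"]})
print(find_first_match(hook, "/home/files", r"test.*\.txt"))  # test.txt
print(find_first_match(hook, "/home/files", r".*\.csv"))      # None
```

Inside `poke`, the same loop would run against `self.hook`, push the matched name to XCom, and return True or False. As for the question in the comment: presumably the DAG instantiates the subclass, something like `SFTPSensor(task_id=..., filepath=..., filepattern=..., dag=dag)`, rather than `BaseSensorOperator` itself, since the base class has no `poke` logic of its own.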