
A specific regular expression for this pattern in Python


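I am trying to create a regexp that matches this file in an S3 bucket. This is the S3 key I am trying to build, and at the moment it evidently cannot be found. Below is a snapshot of the bucket, that is, the path/folder I am trying to reach with this regexp.

Again, this is an S3 key. Below I am also posting the part of the code I am using; the part I mean is the inside of the try statement.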

class process_raw_snowplow_event_data(luigi.Task):
    dataset_date = luigi.DateParameter(default=date.today() - timedelta(days=1))
    # force_run = luigi.BoolParameter()
    _start = luigi.DateSecondParameter(default=datetime.utcnow())
    file_root = luigi.Parameter()


    def download_s3_file(self, s3_filename):

        local_filename = "/Users/xxx/etl/%s" % s3_filename

        s3_file_full_path = re.compile("snowplow-enrich-output/enriched/archive/run=" + self.dataset_date.strftime("%Y-%m-%d") + r"-\d{2}-\d{2}-\d{2}/*.")

        try:
            s3.download_file(Bucket=os.environ.get('SP_BUCKET'), Key=s3_file_full_path, Filename=local_filename)
        except Exception as e:
            logger.error("%s - Could not retrieve %s because: %s" % ("download_s3_file()", s3_filename, e))
            raise
Error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/luigi/worker.py", line 199, in run
new_deps = self._run_get_new_deps()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/luigi/worker.py", line 139, in _run_get_new_deps
task_gen = self.task.run()
  File "target.py", line 123, in run
infile_name = self.download_s3_file(s3_filename)
  File "target.py", line 47, in download_s3_file
s3.download_file(Bucket=os.environ.get('SP_BUCKET'), Key=s3_filename, Filename=local_filename)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/boto3/s3/inject.py", line 172, in download_file
extra_args=ExtraArgs, callback=Callback)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/boto3/s3/transfer.py", line 307, in download_file
future.result()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/s3transfer/futures.py", line 73, in result
return self._coordinator.result()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/s3transfer/futures.py", line 233, in result
raise self._exception
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/s3transfer/tasks.py", line 255, in _main
self._submit(transfer_future=transfer_future, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/s3transfer/download.py", line 353, in _submit
**transfer_future.meta.call_args.extra_args
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/client.py", line 314, in _api_call
return self._make_api_call(operation_name, kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/botocore/client.py", line 612, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
Thanks, everyone.


S3.Client.download_file does not accept a regular expression as its Key argument.

s3_file_full_path evaluates to an object similar to the following:

re.compile(r'snowplow-enrich-output/enriched/archive/run=2019-02-24-\d{2}-\d{2}-\d{2}/*.',
re.UNICODE)
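In the bucket snapshot, no object has that name.

The only way is to list the bucket's objects and match their keys against the regex object above.

The regex above is roughly equivalent to listing the objects whose keys start with the prefix "snowplow-enrich-output/enriched/archive/run=2019-02-24-":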

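import os

# Assumes `s3` is a boto3 S3 client and `logger` is configured, as in the question.
list_object_kwargs = {
    'Bucket': os.environ.get('SP_BUCKET'),
    'Prefix': 'snowplow-enrich-output/enriched/archive/run=2019-02-24-'
}

def object_keys(contents):
    return [content['Key'] for content in contents]

objects = s3.list_objects_v2(**list_object_kwargs)
# 'Contents' is missing from the response when no key matches the prefix.
found_object_keys = object_keys(objects.get('Contents', []))

while objects['IsTruncated']:
    # The token for the next page is returned as 'NextContinuationToken'.
    objects = s3.list_objects_v2(
        ContinuationToken=objects['NextContinuationToken'],
        **list_object_kwargs
    )
    found_object_keys.extend(object_keys(objects.get('Contents', [])))

for key in found_object_keys:
    # Filename is required; this local path follows the question's layout.
    local_filename = os.path.join('/Users/xxx/etl', os.path.basename(key))
    try:
        s3.download_file(
            Bucket=os.environ.get('SP_BUCKET'),
            Key=key,
            Filename=local_filename)
    except Exception as e:
        logger.error(
            "Could not retrieve %s because: %s" % (key, e))
        raise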
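
A prefix alone cannot express the \d{2}-\d{2}-\d{2} part of the run folder, so the listed keys may still need filtering with the pattern itself. The following is only a sketch of that idea: matching_keys and RUN_KEY are hypothetical names, the trailing .+ assumes every wanted key has a non-empty file name after the run folder, and s3_client is assumed to be a boto3 S3 client. It uses the client's list_objects_v2 paginator, which follows continuation tokens on its own.

```python
import re


def matching_keys(s3_client, bucket, prefix, pattern):
    """Yield every key under `prefix` whose name matches `pattern` from the start."""
    # The paginator requests further pages itself, so no manual token loop is needed.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no "Contents" entry when nothing matches the prefix.
        for obj in page.get("Contents", []):
            if pattern.match(obj["Key"]):
                yield obj["Key"]


# Assumed key layout: run=<date>-HH-MM-SS/<file name>
RUN_KEY = re.compile(
    r"snowplow-enrich-output/enriched/archive/run=2019-02-24-\d{2}-\d{2}-\d{2}/.+"
)
```

Each key yielded by matching_keys(s3, os.environ.get('SP_BUCKET'), 'snowplow-enrich-output/enriched/archive/run=2019-02-24-', RUN_KEY) is a plain string and can be passed to download_file as Key.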

Just a thought: try printing out s3_file_full_path and see whether it is what you expect?
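
A small sketch of that check, rebuilt outside the Luigi task: the date is the example date used in the answer, not a value from the asker's run. Printing shows that re.compile returns a compiled pattern object rather than a string, and its .pattern attribute shows the raw expression. Note also that the tail /*. means "zero or more slashes, then any one character", which is probably not the wildcard that was intended.

```python
import re
from datetime import date

dataset_date = date(2019, 2, 24)  # example date, taken from the answer

s3_file_full_path = re.compile(
    "snowplow-enrich-output/enriched/archive/run="
    + dataset_date.strftime("%Y-%m-%d")
    + r"-\d{2}-\d{2}-\d{2}/*."
)

# A compiled pattern is not a string, so it cannot name an S3 object.
print(type(s3_file_full_path))
# The underlying expression, still containing the unexpanded \d{2} parts.
print(s3_file_full_path.pattern)
```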