Most efficient way to create and delete a file with the same name in a Python for loop


python amazon-web-services operating-system


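I download data from AWS S3, unzip it, format it, and then try to send the unzipped, formatted data to AWS Elasticsearch.

Each file is saved on my local machine as "access.log", then unzipped, formatted, sent to AWS Elasticsearch, and deleted, after which the process repeats for the next file.

However, when I try to do this with the same file name, I am told the file cannot be deleted because it is in use, and I don't want to create multiple files. What is the most efficient way to do this?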

Note: acc_logs is a list of many log file names, for example acc_log_file.gz.

Here is my code:

for log_file in acc_logs: # one file
    bucket.download_file(log_file, 'C:\\Users\\name\\Desktop\\s3-to-es\\access.log')
    with gzip.open('C:\\Users\\name\\Desktop\\s3-to-es\\access.log') as log_file:
        for line in log_file:
            line = line.decode("utf-8") # decode byte to str
            try:
                ip = ip_pattern.search(line).group(0)
            except:
                ip = None
            try:
                host = host_pattern.search(line).group(0)[1:-1]
            except:
                host = None

            date            = time_pattern.search(line).group(0)[1:-1]
            pos             = [x.end() for x in re.finditer('"', line)]
            request         = line[pos[0]:pos[1]-1]
            referr_url      = line[pos[2]:pos[3]-1]
            user_agent      = line[pos[4]:pos[5]-1]
            status, bytes   = line[pos[1]+1:pos[2]-2].split()
            ident, authuser = authuser_pattern.search(line).group(0).split()
            document = {"server_ip": ip,
                        "host":host, 
                        "ident":ident, 
                        "authuser":authuser, 
                        "date": date, 
                        "request":request, 
                        "status":status, 
                        "bytes":bytes,
                        "referr_url":referr_url, 
                        "user_agent":user_agent
                       }

            ToElasticSearch(document) #upload each line to es

            if count == 10:
                break

        os.remove('C:\\Users\\name\\Desktop\\s3-to-es\\access.log')

ToElasticSearch looks like this:

def ToElasticSearch(document):
    """
      upload given document to specified index in AWS
    """
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
    es = Elasticsearch(
        hosts = [{'host': es_host, 'port': 443}],
        http_auth = awsauth,
        use_ssl = True,
        verify_certs = True,
        connection_class = RequestsHttpConnection
        )
    es.index(index="acc_logs", doc_type="_doc", body=document)

You are trying to delete the file while it is still open:

with gzip.open('C:\\Users\\name\\Desktop\\s3-to-es\\access.log') as log_file:
    for line in log_file:
        ... and so on ...
    os.remove('C:\\Users\\name\\Desktop\\s3-to-es\\access.log')

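The os.remove() call sits inside the with block, so it runs before gzip.open has closed the file, and Windows refuses to delete a file that is still open. If you move the os.remove() line out by one indentation level (fewer spaces), it runs after the with block has closed the file, and the delete should succeed.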
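The change described above can be sketched as follows. This is a minimal illustration rather than the asker's full script: process_and_remove and its handle_line callback are hypothetical names standing in for the parsing and ToElasticSearch(document) steps from the question, and the file is opened in text mode so the manual decode step is not needed.

```python
import gzip
import os


def process_and_remove(path, handle_line):
    """Read a gzip-compressed log line by line, then delete the file.

    `handle_line` is a hypothetical callback standing in for the parsing
    and ToElasticSearch(document) steps from the question.
    """
    # The with block closes the file when it exits, even if the loop
    # raises an exception or breaks out early.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            handle_line(line.rstrip("\n"))

    # Dedented one level: the handle is closed at this point, so the
    # delete is no longer blocked and the same path can be reused on
    # the next iteration.
    os.remove(path)
```

Inside the original loop, this would be called once per downloaded file, right after bucket.download_file(...) has written access.log, so only one local file exists at any time.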