Python: return a list of the latest .tar files based on their file names


I have various tar files in a folder called "supertar", named like this:

esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar,
esarchive--Jackson-HQ-112-ecb5ab6a-c199-402d-9a8a-8c54c8901d66-06092017-4.tar,
esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05202017-4.tar,
esarchive--Jackson-HQ-112-ecb5ab6a-c199-402d-9a8a-8c54c8901d66-06012017-4.tar,
esarchive--Jonah-7fbbbc6c-8463-4ec1-9bde-3fc5429311e5-06092017-4
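How can I pick, for each customer mentioned in the file names (Mona, Jackson, Jonah and so on), the name of that customer's latest .tar file, based on the date value near the end of the file name, so that I end up with a variable holding the following values: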

esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar,
esarchive--Jackson-HQ-112-ecb5ab6a-c199-402d-9a8a-8c54c8901d66-06092017-4.tar,
esarchive--Jonah-7fbbbc6c-8463-4ec1-9bde-3fc5429311e5-06092017-4
So far I have written the following code:

for file in glob.glob("*.tar"):
    # print "The File Being Untarred is:",file
    file_date_str = file.split('-')[-2]
    datetime_obj = datetime.datetime.strptime(file_date_str, '%m%d%Y')
    a=re.match("esarchive--(\w+)-(\w+)-(\w+)", file).group(1)# Gets Mona from file name
    b=re.match("esarchive--(\w+)-(\w+)-(\w+)", file).group(2)# Gets AB from file name
    c=re.match("esarchive--(\w+)-(\w+)-(\w+)", file).group(3)# Gets Test226 from file name
    s = a+'-'+b+'-'+c
    d=s.lower()
    my_dict={}
    date = datetime.date.today()
    print(s)
    try:
        (latest_date, _) = my_dict['name'] # _ has file name, which you don't want to compare.
        if date > latest_date:
        # If entry for this name exists,
        # Replace the info with latest date.
           my_dict['name'] = (date,file)
    except KeyError:
    # No info for this name in dictionary.
           my_dict['name'] = (date,file)

    print "The File Being Untarred is:",my_dict['name']
    tar = tarfile.open("/home/chetan/Desktop/supertar/"+my_dict['name'][1]) 
    tar.extractall(path="/home/chetan/Documents/chetan-dump-es") # untar file into same directory
    tar.close()

What I get is a list of all the files, not just the latest ones.

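I thought your data looked familiar... Didn't we already cover this in an earlier question of yours? With a few small adjustments, the previous answer can be adapted to this case: you don't need a list of all files, only the top value for each "customer", and you can extract the "customer" the same way you extract the date (you don't need full parsing at all, as my previous answer showed).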

Something like:

import glob

def parse_date(name, offset=-10):  # let's re-use our convenience function
    try:
        date_str = name[offset:offset+8]
        return int(date_str[-4:] + date_str[:2] + date_str[2:4])
    except (IndexError, TypeError, ValueError):  # invalid file name
        return -1

result = {}  # use this as our result / lookup table
for file_name in glob.glob("*.tar"):
    # for customer name, skip `esarchive--` and pick everything until the next dash
    customer = file_name[11:file_name.find("-", 11)]
    date = parse_date(file_name, -14)
    # now replace our stored value if it's older than the date in our current file name
    if result.get(customer, [-1])[0] < date:
        result[customer] = [date, file_name]  # store the parsed date and file name
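
The fixed slice offsets above only work when every name ends in an 8-digit date followed by a suffix of exactly the same length (here "-4.tar"). As an alternative sketch, not part of the original answer, the same per-customer lookup can be written with a regular expression and datetime.strptime. The pattern NAME_RE and the function pick_latest are hypothetical names introduced for illustration, and the pattern assumes the layout esarchive--<customer>-...-<MMDDYYYY>-<n> with an optional .tar suffix.

```python
import re
from datetime import datetime

# Hypothetical pattern (an assumption about the naming scheme): the customer is
# the first dash-free token after "esarchive--", and the date is the 8-digit
# MMDDYYYY group right before the trailing "-<n>" and optional ".tar".
NAME_RE = re.compile(r"^esarchive--([^-]+)-.*-(\d{8})-\d+(?:\.tar)?$")

def pick_latest(names):
    """Map each customer to the (date, file name) pair with the newest date."""
    latest = {}
    for name in names:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip names that do not follow the expected layout
        customer, date_str = match.groups()
        try:
            date = datetime.strptime(date_str, "%m%d%Y").date()
        except ValueError:
            continue  # eight digits that are not a real calendar date
        # Key by customer, so every customer keeps exactly one (newest) entry.
        if customer not in latest or date > latest[customer][0]:
            latest[customer] = (date, name)
    return latest

# The five names listed in the question, used here as plain strings.
names = [
    "esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar",
    "esarchive--Jackson-HQ-112-ecb5ab6a-c199-402d-9a8a-8c54c8901d66-06092017-4.tar",
    "esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05202017-4.tar",
    "esarchive--Jackson-HQ-112-ecb5ab6a-c199-402d-9a8a-8c54c8901d66-06012017-4.tar",
    "esarchive--Jonah-7fbbbc6c-8463-4ec1-9bde-3fc5429311e5-06092017-4",
]
latest = pick_latest(names)
for customer, (date, name) in sorted(latest.items()):
    print(customer, date.isoformat(), name)
```

Because it works on a plain list of strings, the function can be fed glob.glob("*.tar") or os.listdir(...) alike. Comparing real date objects also rejects impossible dates such as month 13, which the integer comparison would accept.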

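You don't want to update the same dict key 'name' every time; you probably want something more like my_dict[filename] = date. Also, as the error shows, date is not defined. I think you want to use d instead of date: my_dict['name'] = (d, filename)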
for k, v in result.items():
    print("Customer: {}\n\tDate: {}\n\tFile: {}".format(k, v[0], v[1]))
# prints:
# Customer: Jonah
#   Date: 20170609
#   File: esarchive--Jonah-7fbbbc6c-8463-4ec1-9bde-3fc5429311e5-06092017-4.tar
# Customer: Jackson
#   Date: 20170609
#   File: esarchive--Jackson-HQ-112-ecb5ab6a-c199-402d-9a8a-8c54c8901d66-06092017-4.tar
# Customer: Mona
#   Date: 20170522
#   File: esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar

# or if you just want the list of file names:
file_names = [entry[1] for entry in result.values()]
# ['esarchive--Jonah-7fbbbc6c-8463-4ec1-9bde-3fc5429311e5-06092017-4.tar',
#  'esarchive--Jackson-HQ-112-ecb5ab6a-c199-402d-9a8a-8c54c8901d66-06092017-4.tar',
#  'esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar']
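
Since the question goes on to untar the selected files, here is a minimal sketch of that last step, assuming the archives are trusted and that file_names holds bare file names like the list above. The helper extract_archives and its parameters are hypothetical names introduced for illustration; only os.path and tarfile calls from the standard library are used.

```python
import os
import tarfile

def extract_archives(file_names, src_dir, dest_dir):
    """Extract each named archive from src_dir into dest_dir.

    Returns the names that were actually extracted. Assumes the archives
    are trusted: extractall() writes whatever paths the archive contains.
    """
    extracted = []
    for name in file_names:
        path = os.path.join(src_dir, name)
        # Skip names that are missing on disk or are not readable tar archives.
        if not os.path.isfile(path) or not tarfile.is_tarfile(path):
            continue
        with tarfile.open(path) as tar:  # the context manager closes the file
            tar.extractall(path=dest_dir)
        extracted.append(name)
    return extracted
```

With the directories from the question, the call would look like extract_archives(file_names, "/home/chetan/Desktop/supertar", "/home/chetan/Documents/chetan-dump-es"). Returning the extracted names makes it easy to spot entries that were skipped.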