Python regular expressions: parsing filenames

I have a text file (filenames.txt) that contains filenames with their extensions.

filename.txt

    [AW] One Piece - 629 [1080P][Dub].mkv
    EP.585.1080p.mp4
    EP609.m4v
    EP 610.m4v
    One Piece 0696 A Tearful Reunion! Rebecca and Kyros!.mp4
    One_Piece_0745_Sons'_Cups!.mp4
    One Piece - 591 (1080P Funi Web-Dl -Ks-)-1.m4v
    One Piece - 621 1080P.mkv
    One_Piece_S10E577_Zs_Ambition_A_Great_and_Desperate_Escape_Plan.mp4

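The above are sample filenames with their extensions. I need to rename each file to its episode number (without changing its extension).

Example:

I hope someone can help me parse and rename these filenames. Thanks in advance.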

Since you tagged python, I assume you are willing to use Python.

(Edit: I realized that the loop in my original code is not needed.)

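    import re

    with open('filename.txt', 'r') as f:
        files = f.read().splitlines()  # read the filenames

    # Assumption: an episode number has 3 digits, possibly preceded by a 0
    p = re.compile(r'0?(\d{3})')
    for file in files:
        if m := p.search(file):
            print(m.group(1) + '.' + file.split('.')[-1])
        else:
            print(file)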

This will output

609.m4v
610.m4v
585.mp4
621.mkv
629.mkv 
745.mp4
696.mp4
591.m4v
577.mp4

Basically, it searches for the first 3-digit number, possibly preceded by a 0.

I strongly recommend checking the output; in particular, you should run

    sort OUTPUTFILENAME | uniq -d

to see whether there are any duplicate target names.
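The same duplicate check can also be done in Python before anything is renamed. The following is only a sketch: the helper names `target_name` and `duplicate_targets` are made up for illustration, and the sample list stands in for the lines read from the file.

```python
import re
from collections import Counter

# A few sample names copied from the question; in practice these would be
# the lines read from the text file.
names = [
    "[AW] One Piece - 629 [1080P][Dub].mkv",
    "EP.585.1080p.mp4",
    "EP609.m4v",
    "One Piece - 621 1080P.mkv",
]

# Same assumption as above: 3 digits, possibly preceded by a 0.
pattern = re.compile(r'0?(\d{3})')


def target_name(name):
    """Return '<episode>.<ext>', or the unchanged name if no episode is found."""
    m = pattern.search(name)
    if m is None:
        return name
    return m.group(1) + '.' + name.split('.')[-1]


def duplicate_targets(names):
    """Return the sorted target names that more than one source name maps to."""
    counts = Counter(target_name(n) for n in names)
    return sorted(t for t, c in counts.items() if c > 1)


print(duplicate_targets(names))
```

An empty list means every source name maps to a distinct target; any name that does appear needs to be resolved by hand before renaming.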

(Original answer:)

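    p = re.compile(r'\d{3,4}')

    for file in files:
        for m in p.finditer(file):
            ep = m.group(0)
            if int(ep) < 1000:
                print(ep.lstrip('0') + '.' + file.split('.')[-1])
                break  # go to next file if ep found (avoid the else clause)
        else:  # if ep not found, just print the filename as is
            print(file)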

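A program to parse the episode number and rename the file.

Modules used:

re - to parse the filename
os - to rename the file

full/path/to/folder is the path to the folder where your files are located.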

    import os
    import re

    folder = "full/path/to/folder/"

    for file in os.listdir(folder):
        # Search for the first 3- or 4-digit number below 1000 in each filename.
        for match_obj in re.finditer(r'\d{3,4}', file):
            episode = match_obj.group(0)
            if int(episode) < 1000:
                new_filename = episode.lstrip('0') + '.' + file.split('.')[-1]
                old_name = os.path.join(folder, file)
                new_name = os.path.join(folder, new_filename)
                os.rename(old_name, new_name)
                # go to next file if ep found (avoid the else clause)
                break
        else:
            # if episode not found, just leave the filename as it is
            pass
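Because renaming is hard to undo, it can help to build the list of renames first and print it before touching any file. The sketch below is one possible way to do that with `pathlib`. The function names `episode_name`, `plan_renames` and `apply_renames` and the `dry_run` flag are illustrative assumptions, not part of the answer above. It applies the same rule (first 3- or 4-digit number below 1000) and skips any file whose target name is already taken.

```python
import re
from pathlib import Path

EPISODE = re.compile(r'\d{3,4}')


def episode_name(filename):
    """Return '<episode><suffix>' for the first 3-4 digit number below 1000, else None."""
    path = Path(filename)
    for m in EPISODE.finditer(path.stem):
        number = int(m.group(0))
        if number < 1000:
            return str(number) + path.suffix
    return None


def plan_renames(folder):
    """Map each existing path to its new path, skipping targets that collide."""
    plan = {}
    taken = set()
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        new_name = episode_name(path.name)
        if new_name is None or new_name == path.name:
            continue  # no episode number found, or already named correctly
        target = path.with_name(new_name)
        if target in taken or target.exists():
            print(f"skipping {path.name}: {new_name} is already taken")
            continue
        taken.add(target)
        plan[path] = target
    return plan


def apply_renames(plan, dry_run=True):
    """Print every planned rename; only rename when dry_run is False."""
    for old, new in plan.items():
        print(f"{old.name} -> {new.name}")
        if not dry_run:
            old.rename(new)


# Example (hypothetical folder path):
# apply_renames(plan_renames("full/path/to/folder"), dry_run=True)
```

Running it once with `dry_run=True` shows what would happen; passing `dry_run=False` afterwards performs the renames.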

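Given that many of the filenames contain more than one number, you will have to define more precisely how the episode number is identified. For example, you could have a rule that ignores "1080", but what happens if a series runs long enough to actually reach episode 1080?

Yes, you're right, but the episode numbers in these filenames won't go beyond 999. The latest episode so far is 973, so you can set a rule that ignores 1080.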
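One way to sidestep the "1080" question is to remove resolution tags before looking for the episode number, rather than capping the number at 999. The sketch below assumes that a resolution is always written as digits followed by `p` or `P` (as in all the sample names); the function name `episode_number` is hypothetical.

```python
import re

# Assumption: a resolution tag is 3-4 digits followed by p or P (e.g. 1080p, 720P).
RESOLUTION = re.compile(r'\d{3,4}[pP]\b')
# A run of exactly 3 or 4 digits that is not part of a longer digit run.
NUMBER = re.compile(r'(?<!\d)\d{3,4}(?!\d)')


def episode_number(filename):
    """Return the first 3-4 digit number that is not a resolution tag, or None."""
    stem = filename.rsplit('.', 1)[0]    # drop the extension
    cleaned = RESOLUTION.sub('', stem)   # strip tags such as 1080P
    m = NUMBER.search(cleaned)
    return int(m.group(0)) if m else None


for name in ["[AW] One Piece - 629 [1080P][Dub].mkv",
             "EP.585.1080p.mp4",
             "One Piece 1080 1080p.mkv"]:   # last name is a made-up future episode
    print(name, "->", episode_number(name))
```

With this rule a hypothetical episode 1080 is still recognized, as long as the resolution in the same filename carries the `p` suffix.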
