Python: returning the file name after looping through a directory


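After looping over all the .txt files with os.scandir, I am trying to return part of the file name. I take a directory, search each text file in it for specific words, extract the sections where those words are found, and then print them. That part works, but I also need to add the name of the file each text section came from. Something like: HD 354950: Supply chain issues were found with garden gnomes

Below is the working code that returns the information from the text: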

import os
import re

dict = []  # note: this name shadows the built-in dict type
linenumber = 0
pattern = re.compile(r"\bsupply|finance\b", re.IGNORECASE)

for filename in os.scandir(directory):
    if filename.path.endswith(".txt"):
        # os.scandir yields os.DirEntry objects; open() accepts them directly
        with open(filename, encoding='utf-8') as f:
            for line in f:
                linenumber += 1  # running count across all files
                if pattern.search(line) is not None:
                    dict.append((linenumber, line.rstrip('\n')))

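When the text is returned, I would like to pull out the name of the file it was found in along with the text itself. The file names typically look like HD_0000354950_10Q_20200503_Item1A_extract.txt, and I want to return HD 354950.

I want to concatenate that with the returned output: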

for d in dict:
   print(filenamepieces, ":" + d[1])

where "filenamepieces" is the title of the file the text was taken from.

Here is an example that uses split() and converts the string to an int:

fileName = "HD_0000354950_10Q_20200503_Item1A_excerpt.txt" # The name of the file

splitFile = fileName.split("_") # Splits the file name with underscores (_) into sections 
index1 = splitFile[0] # Gets the name at the first index
index2 = splitFile[1] # Gets the name at the second index
index2 = int(index2) # Converts the second name into an int to remove the unnecessary zeros

finale = f"{index1} {index2}" # Final string
print(finale) # Prints the final string

# Program outputs : HD 354950

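What are the file names? Which part do you want to return?

It's possible. Please use the documentation or an online tutorial for direction.

@mkrieger1 Just a small portion of the file name. Updated my original post.

@Justlearnedit Can you provide a link?

You can just split the string on _, take the first two items, and convert the second one to an integer.

I'm getting an error that the nt.DirEntry object has no attribute 'split'.

You need to convert it to a string first before you can split it, e.g. with str().

I get the same error. Not sure why, since I am pulling in a list of files from (directory) and it won't split because of that. I don't have the ability to set one specific file as a variable. I tried splitting the file on the directory, which works, but that is one level higher than I need. Hope that makes sense.

You could add all the names to an array and iterate over each name that way.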
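The 'split' error in the comments comes from calling split() on the os.DirEntry object itself. A DirEntry exposes the bare file name as a plain string through its name attribute, so one possible way to combine the scandir loop with the split() approach is sketched below. The function names file_label and search_directory are made up for this illustration, the sketch assumes every .txt file follows the TICKER_NUMBER_... naming pattern, and it restarts the line count for each file rather than keeping one running count.

```python
import os
import re


def file_label(file_name):
    """Turn 'HD_0000354950_10Q_...txt' into 'HD 354950' (hypothetical helper)."""
    ticker, number = file_name.split("_")[:2]
    return f"{ticker} {int(number)}"  # int() drops the leading zeros


def search_directory(directory, pattern):
    """Return (label, line_number, line) for each matching line in the .txt files."""
    matches = []
    for entry in os.scandir(directory):
        # entry is an os.DirEntry; entry.name is the file name as a str,
        # entry.path is the name joined with the directory
        if entry.is_file() and entry.name.endswith(".txt"):
            label = file_label(entry.name)
            with open(entry.path, encoding="utf-8") as f:
                # line numbers restart at 1 for every file (an assumption)
                for line_number, line in enumerate(f, start=1):
                    if pattern.search(line):
                        matches.append((label, line_number, line.rstrip("\n")))
    return matches
```

Each result then carries its own label, so the printing loop can be written as: for label, _, text in search_directory(directory, pattern): print(f"{label}: {text}"). A file whose name does not fit the pattern would raise a ValueError in file_label, so a real script may want to skip or report such files.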