Filtering a directory in Python (Python, Filesystems)


I am trying to get a filtered list of all text and Python files, like this:

from walkdir import filtered_walk, dir_paths, all_paths, file_paths
vdir = raw_input("Enter directory: ")

files = file_paths(filtered_walk(vdir, depth=0,included_files=['*.py', '*.txt']))

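I want to:

Find out the total number of files found in the given directory.

I tried options such as Number_of_files = len(files) and for n in files: n = n + 1, but both failed because files is something called a generator object. I searched the Python documentation for it but could not work out how to use it.

I would also like to search the files in the list found above for a string, for example "import sys", and store the names of the files that contain my search string in a new file called find.txt.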

You should try os.walk:

import os
dir = raw_input("Enter Dir:")
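# Compare the real extension: a fixed slice such as file[-3:] can never equal ".txt"
files = [file for path, dirname, filenames in os.walk(dir) for file in filenames if os.path.splitext(file)[1] in [".py", ".txt"]]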

nfiles = len(files)
print nfiles

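Searching for a string inside a file is covered in a separate question.

Combining the two, your code would look something like this: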

import os

dir = raw_input("Enter Dir:")
print "Directory %s" % dir
search_str = "import sys"
count = 0          # files with a .py or .txt extension
search_count = 0   # of those, files that contain search_str
write_file = open("found.txt", "w")
for dirpath, dirnames, filenames in os.walk(dir):
    for file in filenames:
        if file.split(".")[-1] in ["py", "txt"]:
            count += 1
            print dirpath, file
            full_path = os.path.join(dirpath, file)
            # the with block closes each file as soon as it has been read
            with open(full_path) as f:
                if search_str in f.read():
                    search_count += 1
                    # write one filename per line
                    write_file.write(full_path + "\n")

write_file.close()
print "Number of files: %s" % count
print "Number of files containing string: %s" % search_count
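For readers on Python 3 (the scripts above are Python 2), here is a minimal sketch of the same walk that keeps the shell-style patterns from the question, using the standard fnmatch module. The function name find_files and its parameters are made up for illustration; this is not the walkdir API.

```python
import fnmatch
import os


def find_files(top, patterns=("*.py", "*.txt")):
    """Yield the full path of every file under `top` whose name matches a pattern."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            # fnmatch.fnmatch compares the bare file name against a glob pattern
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                yield os.path.join(dirpath, name)
```

Because find_files is itself a generator, the total is obtained by consuming it, for example with len(list(find_files(some_dir))).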

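A Python generator is a special kind of iterator. It produces items one at a time, without knowing in advance how many there will be. You only find out at the end.

Still, doing it like this should work fine: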

n = 0
for item in files:
    n += 1
    do_something_with(item)
print "I had", n, "items."

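You can think of a generator or iterator as a list that hands you one item at a time. (No, it is not actually a list.) So you cannot tell how many items it will give you until you have gone through all of them, because you have to take them one by one. That is only the basic idea; you should now be able to make sense of the documentation, and I am sure there are plenty of questions about generators here as well.

Now, in your case, you used an approach that was not so far off: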
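To make the point concrete, here is a small Python 3 sketch. The numbers generator is a stand-in invented for this example; it is not the object returned by walkdir, but any generator behaves the same way with respect to len().

```python
def numbers():
    """A tiny generator: yields three items, one at a time."""
    for n in (10, 20, 30):
        yield n


gen = numbers()
try:
    len(gen)                     # generators do not support len()
except TypeError as exc:
    print("len() failed:", exc)

count = sum(1 for _ in gen)      # count by consuming every item
print(count)                     # 3
print(list(gen))                 # [] because the generator is now exhausted

items = list(numbers())          # or keep the items in a list first
print(len(items), items)         # 3 [10, 20, 30]
```

Counting consumes the generator, so if the items are needed again afterwards, converting to a list first is the simpler choice.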

count = 0
for filename in files:
    count += 1

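What you did wrong was to increment the loop variable itself, but that variable holds a filename! Incrementing it makes no sense, and it raises an exception as well.

Once you have the filenames, you have to open each file, read it, search for the string and return the filename: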

def contains(filename, match):
    with open(filename, 'r') as f:
        for line in f:
            if line.find(match) != -1:
                return True
    return False

matching_files = []
for filename in files:
    if contains(filename, "import sys"):
        matching_files.append(filename)

Or as a one-liner:

matching_files = [f for f in files if contains(f, "import sys")]

Now, as an example of a generator (do not read the following until you have read the documentation):

def matching(filenames):
    for filename in filenames:
        if contains(filename, "import sys"):
            # feed the names one by one, you are not storing them in a list
            yield filename
# usage:
for f in matching(files):
    do_something_with_the_files_that_match_without_storing_them_all_in_a_list()
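As a self-contained Python 3 sketch, this is one way such a generator could feed the report file the question asks for. The save_matches helper and its out_path parameter are hypothetical names, and errors="ignore" is an assumption made here so that undecodable bytes do not abort the scan.

```python
def contains(filename, match):
    """Return True if any line of the file contains `match`."""
    with open(filename, "r", errors="ignore") as f:
        return any(match in line for line in f)


def matching(filenames, match):
    """Lazily yield only the filenames whose contents include `match`."""
    for filename in filenames:
        if contains(filename, match):
            yield filename


def save_matches(filenames, match, out_path):
    """Write each matching filename to `out_path`; return how many were written."""
    written = 0
    with open(out_path, "w") as out:
        for filename in matching(filenames, match):
            out.write(filename + "\n")
            written += 1
    return written
```

Something like save_matches(files, "import sys", "find.txt") would then consume the files generator once and never hold all the names in memory.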

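I believe this is what you want; if I have misunderstood your spec, please let me know after you have tested it. I have hard-coded the directory in searchdir, so you will have to prompt for it instead.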

import os

searchdir = r'C:\blabla'
searchstring = 'import sys'

def found_in_file(fname, searchstring):
    with open(fname) as infp:
        for line in infp:
            if searchstring in line:
                return True
        return False

with open('found.txt', 'w') as outfp:
    count = 0
    search_count = 0
    for root, dirs, files in os.walk(searchdir):
        for name in files:
            (base, ext) = os.path.splitext(name)
            if ext in ('.txt', '.py'):
                count += 1

                # only search files that passed the extension filter
                full_name = os.path.join(root, name)
                if found_in_file(full_name, searchstring):
                    outfp.write(full_name + '\n')
                    search_count += 1

print 'total number of files found %d' % count
print 'number of files with search string %d' % search_count

Opening the file with a with statement also closes it for you automatically afterwards.
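A quick way to see this for yourself, as a Python 3 sketch; the temporary file is created only so the example can run anywhere.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "w") as outfp:
    outfp.write("import sys\n")
    inside = outfp.closed   # False: still open inside the block

after = outfp.closed        # True: closed as soon as the block exits
print(inside, after)        # False True

os.remove(path)
```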

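-1 Your file[-3:] will only compare/find extensions of length 3; as it stands it will not find .txt (only txt), and the count will be off. It works for .py.

It gives an error for the last line: print "Number of files containing string" %search_count TypeError: not all arguments converted during string formatting

@x0rcist That line is missing a format directive. It should read: print "Number of files containing string: %d" % search_count (note the %d). For that matter, I don't know why the lines above use %s rather than %d to display the counts.

The solution as it stands will not find/count .txt files; you should test its behavior.

@Levon You are right, it does not work for .txt files, as you pointed out earlier. I am trying to figure out how to do this for .txt. If anyone has a better approach or suggestions using walkdir, please share. Thanks

@x0rcist If you unmark the question as answered, you will get other people to come back, look at your question and try to provide an answer; otherwise everyone will assume the problem is solved.

I have posted a solution.

That's it: exactly what I was looking for. Thanks. Now let me study it and see how to add a regular expression to find all instances of the search string.

@x0rcist Let me know if you have questions about any part of the code.

@acid_cruchfix Your solution did not work; it did not even run. I suggested fixes to the OP so that your program would at least run, but your code still missed all the .txt files. I waited a while to see whether you would fix your code, then posted my own answer. There is no stealing here, and your comment is inappropriate and offensive.

This does not use the library the OP is using... but it works, so no comment.

@jadkik94 You are right, I did not even notice that. Then again, it did not seem to be a requirement, and the OP seems happy with this solution.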