Python: filtering strings of a specific length from a file


I have a file foo.txt with the following contents:

'w3ll' 'i' '4m' 'n0t' '4sed' 't0' 

'it'

I am trying to extract every word that is exactly two characters long. That is, the output file should contain only:

4m
t0
it

This is what I have tried:

with open("foo.txt" , 'r') as foo:
    listme = foo.read()

string =  listme.strip().split("'")

I think this splits the string on the ' character. How do I select only the strings between apostrophes whose character count equals 2?

This should work:

>>> with open('abc') as f, open('output.txt', 'w') as f2:
...     for line in f:
...         for word in line.split():    # split the line at whitespace
...             word = word.strip("'")   # strip the surrounding ' from each word
...             if len(word) == 2:       # write only two-character words
...                 f2.write(word + '\n')
...
>>> print(open('output.txt').read())
4m
t0
it

Using a regular expression:

Assuming you want to find every word enclosed in quotes that is exactly two characters long:

import re

# Match a quote, exactly two word characters, and a closing quote.
split = re.compile(r"'\w{2}'")

with open("file", "r") as fr, open("file2", "w") as fw:
    for word in split.findall(fr.read()):
        fw.write(word.strip("'") + "\n")
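
A variation on the regex above, offered only as a minimal sketch: a capture group returns the word without its quotes, so no `strip` call is needed, and the length becomes a parameter. The function name `quoted_words_of_length` and the in-memory `sample` string are hypothetical stand-ins for illustration; they are not part of the original answer.

```python
import re

def quoted_words_of_length(text, n):
    """Return the body of every '...' token made of exactly n word characters."""
    # The parentheses capture only the characters between the quotes.
    pattern = re.compile(r"'(\w{%d})'" % n)
    return pattern.findall(text)

# Hypothetical stand-in for the contents of foo.txt.
sample = "'w3ll' 'i' '4m' 'n0t' '4sed' 't0' \n\n'it'"
print(quoted_words_of_length(sample, 2))
```

Because `\w` matches only letters, digits, and the underscore, a quoted token containing punctuation would be skipped by this pattern.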

Since you are reading quoted items separated by spaces or commas, you can use the csv module:

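import csv

with open('/tmp/2let.txt', 'r') as fin, open('/tmp/out.txt', 'w') as fout:
    # Space-delimited fields, with ' as the quote character
    reader = csv.reader(fin, delimiter=' ', quotechar="'")
    source = (e for line in reader for e in line)  # flatten rows into words
    for word in source:
        if len(word) == 2:  # exactly two characters; <= 2 would also admit 'i' and empty fields
            print(word)
            fout.write(word + '\n')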

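Comments:

Is there an error? Please post examples like this in the question body rather than in comments, where they are unreadable.

Thanks @Ashwini. But the regex approach treats two different comma-separated strings as a single string. When I ran the code to look for 20-character words, it gave me '9,'1186148119', as output. That still has the right length, but it is made up of several different strings rather than just one.

@abhikafle Your sample input does not contain any ',' which is why I did not handle them. Please post the input that causes the problem.

Can you add ',' as a token that separates two strings?

@abhikafle In the first code, replace line.split() with line.split(',').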
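
The comma case raised in the comments could be handled along these lines. This is a minimal sketch, assuming tokens may be separated by commas, whitespace, or both; the function name `words_of_length` and the comma-separated `sample` are hypothetical, since the real input was never posted.

```python
import re

def words_of_length(text, n):
    """Split on commas and/or whitespace, strip the quotes, keep words of length n."""
    tokens = re.split(r"[,\s]+", text)
    words = (token.strip("'") for token in tokens)
    return [word for word in words if len(word) == n]

# Hypothetical input mixing commas, spaces, and newlines as separators.
sample = "'w3ll','i' '4m', 'n0t','4sed' 't0'\n'it'"
print(words_of_length(sample, 2))
```

Splitting before measuring the length keeps each quoted string separate, so two adjacent strings are never counted as one word.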