Python: remove everything that is not a to z

This is my program:

def count_occurances(lst):
    d = {}
    for item in lst:  # Fill the dictionary
        if item not in d:  # First time this item is seen
            d[item] = 1
        else:
            d[item] += 1  # Increase the count if the item already exists
    return d

listy_mc_listface = []
with open("\\Users\\saksa\\python_courses\\1DV501\\assign3\\eng_news_100K-sentences_B.txt", "r", encoding="utf-8") as f:
    for line in f:
        for item in line:
            item = item.lower()
            if item.isalpha():
                listy_mc_listface.append(item)

freq = count_occurances(listy_mc_listface)
print(freq)

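Although isalpha() removes most of the characters I want gone, some Chinese and Arabic characters still get through. Is there a better way to remove them, and how would I implement it in my code? To clarify, I only want the letters a to z, uppercase or lowercase (I have already lowercased them).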
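For context, the behaviour described in the question can be reproduced in a few lines: str.isalpha() returns True for any character that Unicode classifies as a letter, not only a to z. The sample characters below are illustrative picks, not taken from the asker's file.

```python
# str.isalpha() is True for any Unicode letter, so non-Latin scripts pass.
samples = ["a", "Z", "中", "ع", "é", "1", "!"]  # illustrative characters only

for ch in samples:
    print(repr(ch), ch.isalpha())

# Adding an ASCII check narrows the result to a-z / A-Z (str.isascii needs Python 3.7+).
ascii_only = [ch for ch in samples if ch.isascii() and ch.isalpha()]
print(ascii_only)
```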

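A regular expression is what you need:
import re; re.sub("[^a-zA-Z]", "", your_string)
replaces everything that is not in a-z or A-Z with an empty string. Is that what you are looking for?

@Nikrasmertsch I tried that method, but I got: "expected string or bytes-like object"

Then either convert your input to a string, or use bytes instead of str for both the pattern and the replacement (b"[^a-zA-Z]", b"").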
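Combining the regular expression with the counting step from the question, a minimal sketch might look like the following. The helper name count_letters and the sample lines are assumptions made for illustration, and collections.Counter stands in for the hand-written dictionary loop. The error quoted above is what re.sub typically raises when it receives something other than str or bytes (a list, for instance), so the sketch passes one string at a time.

```python
import re
from collections import Counter

# Matches every character that is not an ASCII letter.
NON_LETTERS = re.compile(r"[^a-zA-Z]")

def count_letters(lines):
    """Count a-z letters over an iterable of text lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # Strip non-ASCII-letters first, then fold to lowercase.
        counts.update(NON_LETTERS.sub("", line).lower())
    return counts

sample = ["Hello, 世界!", "abc مرحبا ABC"]  # made-up input, not the asker's file
freq = count_letters(sample)
print(dict(freq))

# bytes variant: pattern and replacement must both be bytes.
raw = "Hello, 世界!".encode("utf-8")
print(re.sub(rb"[^a-zA-Z]", b"", raw))
```

To use it on a file, the open file object can be passed directly, since iterating over it yields one line of text at a time.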