Python: remove everything that is not a to z

Here is my program:

def count_occurances(lst):
    d = {}  # maps each item to the number of times it occurs
    for item in lst:
        if item not in d:  # first occurrence: add the item
            d[item] = 1
        else:
            d[item] = d[item] + 1  # item already exists: increase its count
    return d

listy_mc_listface = []
with open("\\Users\\saksa\\python_courses\\1DV501\\assign3\\eng_news_100K-sentences_B.txt", "r", encoding="utf-8") as f:
    for line in f:
        for item in line:
            item = item.lower()
            if item.isalpha():
                listy_mc_listface.append(item)


freq = count_occurances(listy_mc_listface)
print(freq)

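Although isalpha removes most of the characters I want to get rid of, I still see some Chinese and Arabic characters. Is there a better way to remove them, and how would I implement it in my code? To clarify, I only want the letters a to z, whether they were originally upper or lower case (I have already converted them to lower case).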

Regular expressions are what you need:

import re; re.sub("[^a-zA-Z]", "", your_string)

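This replaces everything that is not in a-z or A-Z with an empty string. Is that what you are looking for?

@Nikrasmertsch I tried using that method, but I got: "expected string or bytes-like object"

Then either convert your input to a string, or use bytes instead of strings for the pattern and the replacement (b"[^a-zA-Z]", b"").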
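
To tie the thread together, here is a minimal sketch of how the regex suggestion might be applied line by line and combined with the counting step. The names `count_ascii_letters`, `NON_ASCII_LETTERS` and the `sample` list are hypothetical and only for illustration; the sample stands in for the lines of the asker's file. It also shows why `isalpha` alone is not enough (it accepts letters from any script) and what likely triggers the "expected string or bytes-like object" error (passing something that is not a `str`, such as a list, to `re.sub`).

```python
import re
from collections import Counter

# Matches every character that is NOT an ASCII letter a-z or A-Z.
NON_ASCII_LETTERS = re.compile(r"[^a-zA-Z]")


def count_ascii_letters(lines):
    """Count the letters a-z (case-insensitive) over an iterable of str lines."""
    counts = Counter()
    for line in lines:
        # Strip non-ASCII-letter characters first, then lowercase what is left.
        cleaned = NON_ASCII_LETTERS.sub("", line).lower()
        counts.update(cleaned)  # a str is iterated character by character
    return dict(counts)


# str.isalpha() is Unicode-aware, so Chinese and Arabic letters pass the check.
print("中".isalpha(), "م".isalpha())

# Hypothetical sample input standing in for the lines of the text file.
sample = ["Hello, 世界!", "مرحبا abc 123"]
print(count_ascii_letters(sample))

# re.sub expects a str (or bytes with a bytes pattern); a list raises TypeError.
try:
    re.sub("[^a-zA-Z]", "", ["abc"])
except TypeError as exc:
    print("TypeError:", exc)
```

With a real file, the same function could be called as `count_ascii_letters(f)` inside the `with open(...)` block, since iterating over a text-mode file yields one `str` per line.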