Python: match specific words against strings in a list, exact match rather than partial match
My code:
delete = ["man", "eat"]
item_list = ['sharper_task|$none_venue|man', 'sharper_task|man_venue|king', 'sharper_task|king_venue|world', 'sharper_task|world_venue|dont', 'sharper_task|を_venue|eater', 'sharper_task|eater_venue|todo', 'sharper_task|todo_venue|,']
lst = []
for x in item_list:
    if not any(y in x for y in delete):
        lst.append([x, x])
print(lst)
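However, this approach gives me the wrong output. For example, with delete = ["man", "eat"], "eat" is not the same word as "eater" in item_list, yet the program treats it as a match: I used if not any(y in x ...), and in returns True because "eat" is contained in "eater". What I want is a match on the whole word, not containment. I want eater to match only eater and man to match only man, not eat to match eater or ma to match man.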
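To make the difference concrete, here is a minimal sketch of how in behaves on a string compared with a list. It assumes '|' separates the fields of each item, which the question implies but does not state; the variable names are only for illustration.

```python
word = "eat"
text = "sharper_task|を_venue|eater"

# On a str, `in` is a substring test: 'eat' occurs inside 'eater'.
substring_match = word in text

# On a list, `in` compares against whole elements, so 'eat' != 'eater'.
# Assumption: '|' is the field separator.
field_match = word in text.split("|")

print(substring_match, field_match)
```

The same operator therefore gives a partial match or an exact match depending on whether its right-hand side is the raw string or a list of its pieces.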
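Is there a way to do an exact match instead of a partial match? My current code does partial matching, which gives wrong results when delete contains many words that are fragments of other words.

Then you can check for an exact match of the string: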
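Note: the '|' and '_' characters have no special meaning as operators inside a string such as 'sharper_task|eater_venue|todo'.
You can first split the string into substrings on '|', and then use the in operator to test whether an item of delete is among the pieces obtained by splitting each of those substrings further on '_':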
delete = ["man", "eat"]
item_list = ['sharper_task|$none_venue|man', 'sharper_task|man_venue|king', 'sharper_task|king_venue|world', 'sharper_task|world_venue|dont', 'sharper_task|を_venue|eater', 'sharper_task|eater_venue|todo', 'sharper_task|todo_venue|,']
lst = []
for x in item_list:
    if not any(y == x for y in delete):
        lst.append([x, x])
print(lst)
# [['sharper_task|$none_venue|man', 'sharper_task|$none_venue|man'], ['sharper_task|man_venue|king', 'sharper_task|man_venue|king'], ['sharper_task|king_venue|world', 'sharper_task|king_venue|world'], ['sharper_task|world_venue|dont', 'sharper_task|world_venue|dont'], ['sharper_task|を_venue|eater', 'sharper_task|を_venue|eater'], ['sharper_task|eater_venue|todo', 'sharper_task|eater_venue|todo'], ['sharper_task|todo_venue|,', 'sharper_task|todo_venue|,']]
This would produce:
[['sharper_task|man_venue|king', 'sharper_task|man_venue|king'], ['sharper_task|king_venue|world', 'sharper_task|king_venue|world'], ['sharper_task|world_venue|dont', 'sharper_task|world_venue|dont'], ['sharper_task|を_venue|eater', 'sharper_task|を_venue|eater'], ['sharper_task|eater_venue|todo', 'sharper_task|eater_venue|todo'], ['sharper_task|todo_venue|,', 'sharper_task|todo_venue|,']]
Assuming you want to split on the pipe character:
lst = []
for x in item_list:
    if not any(y in s.split('_') for s in x.split('|') for y in delete):
        lst.append([x, x])
print(lst)
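The nested split above can also be sketched as a small function. This is only one possible arrangement: it swaps the two split calls for a single re.split on both delimiters and uses a set for the words, and the name keep_unmatched is hypothetical. It assumes '|' and '_' are the only field delimiters.

```python
import re

def keep_unmatched(items, words):
    """Return the items in which no word equals a whole field.

    Assumption: fields are delimited by '|' or '_'.
    """
    banned = set(words)
    kept = []
    for item in items:
        fields = re.split(r"[|_]", item)  # split on both delimiters at once
        if banned.isdisjoint(fields):     # no field is exactly a banned word
            kept.append(item)
    return kept

delete = ["man", "eat"]
item_list = ['sharper_task|$none_venue|man', 'sharper_task|man_venue|king',
             'sharper_task|king_venue|world', 'sharper_task|world_venue|dont',
             'sharper_task|を_venue|eater', 'sharper_task|eater_venue|todo',
             'sharper_task|todo_venue|,']

result = keep_unmatched(item_list, delete)
print(result)
```

With the sample data, the two items that have man as a whole field are dropped, while the items containing eater are kept because eat never appears as a field on its own.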
Try the following:
delete = ["man", "eat"]
item_list = ['sharper_task|$none_venue|man', 'sharper_task|man_venue|king', 'sharper_task|king_venue|world', 'sharper_task|world_venue|dont', 'sharper_task|を_venue|eater', 'sharper_task|eater_venue|todo', 'sharper_task|todo_venue|,']
# Keeps the items in which a word from delete is a whole '|'-separated field;
# use "if not any(...)" instead to drop those items.
lst = [item
       for item in item_list
       if any(word in item.split('|') for word in delete)]
Output of this:
import re
del_list = ["man", "eat"]
regex = '|'.join([r'\b' + y + r'\b' for y in del_list])
item_list = ['sharper_task|$none_venue|man', 'sharper_task|man_venue|king', 'sharper_task|king_venue|world', 'sharper_task|world_venue|dont', 'sharper_task|を_venue|eater', 'sharper_task|eater_venue|todo', 'sharper_task|todo_venue|,']
lst = []
for x in item_list:
    if not re.search(regex, x):
        lst.append([x, x])
print(lst)
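Using a single regular expression instead of looping over the list ensures that a non-match for one to-delete word does not put back into the output list an item_list element that an earlier to-delete word had already excluded.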
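regex = '|'.join(...): this builds the regular expression from raw (r'') strings, wrapping each word in '\b' so that it matches only at word boundaries. A word boundary is the position between a word character (a letter, a digit, or '_') and any other character, which is why 'man' in 'man_venue' is not matched.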
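Since '_' counts as a word character, the '\b' pattern and a stricter "not next to a letter or digit" pattern give different results on this data. The sketch below compares the two. The lookaround version is not part of the answer; it is an assumption about the behaviour the asker may want, and re.escape is added only as a precaution for words that contain regex metacharacters.

```python
import re

del_list = ["man", "eat"]
item_list = ['sharper_task|$none_venue|man', 'sharper_task|man_venue|king',
             'sharper_task|king_venue|world', 'sharper_task|world_venue|dont',
             'sharper_task|を_venue|eater', 'sharper_task|eater_venue|todo',
             'sharper_task|todo_venue|,']

# \b version: '_' is a word character, so there is no boundary after 'man' in 'man_venue'.
word_boundary = re.compile("|".join(r"\b" + re.escape(w) + r"\b" for w in del_list))

# Lookaround version (assumed requirement): the word may not touch a letter or
# digit, but may touch '_' or '|'. [^\W_] matches any word character except '_'.
alternatives = "|".join(re.escape(w) for w in del_list)
alnum_boundary = re.compile(r"(?<![^\W_])(?:" + alternatives + r")(?![^\W_])")

kept_b = [x for x in item_list if not word_boundary.search(x)]
kept_alnum = [x for x in item_list if not alnum_boundary.search(x)]
print(len(kept_b), len(kept_alnum))
```

With the sample data the '\b' pattern still keeps 'sharper_task|man_venue|king', whereas the lookaround pattern removes it; neither pattern treats eat as a match inside eater.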
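If we used two loops, one over del_list and one over item_list, the output would, I think, be incorrect: the 'man' entry would still appear once, because 'eat' does not match it, and the remaining items, which match nothing in del_list, would each appear twice.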
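The two-loop pitfall described here can be reproduced with a short sketch. The loop structure is an assumption about what "two loops" means (an outer loop over del_list, an inner loop over item_list); the answer does not show that code.

```python
import re

del_list = ["man", "eat"]
item_list = ['sharper_task|$none_venue|man', 'sharper_task|man_venue|king',
             'sharper_task|king_venue|world', 'sharper_task|world_venue|dont',
             'sharper_task|を_venue|eater', 'sharper_task|eater_venue|todo',
             'sharper_task|todo_venue|,']

two_loop = []
for word in del_list:                       # outer loop: one pass per word
    pattern = r"\b" + re.escape(word) + r"\b"
    for x in item_list:                     # inner loop: every item is re-tested
        if not re.search(pattern, x):
            two_loop.append([x, x])         # appended once per non-matching word

# How many times each item ended up in the output.
counts = [two_loop.count([x, x]) for x in item_list]
print(len(two_loop), counts)
```

Each item is appended once for every word it does not match, so an item is excluded only if it matches all the words, which is not the intended filter.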
Output of the single-regex code:
[['sharper_task|man_venue|king', 'sharper_task|man_venue|king'], ['sharper_task|king_venue|world', 'sharper_task|king_venue|world'], ['sharper_task|world_venue|dont', 'sharper_task|world_venue|dont'], ['sharper_task|を_venue|eater', 'sharper_task|を_venue|eater'], ['sharper_task|eater_venue|todo', 'sharper_task|eater_venue|todo'], ['sharper_task|todo_venue|,', 'sharper_task|todo_venue|,']]
That's right, but now man is also in the output. "man" should not be in the output.
Is there a way to split "sharper_task|man_venue-1_0_king" into "sharper", "_", "task", "|", "man", "_", ...? I'm a beginner.
Can you post your expected output?
The logic for this is to replace every underscore "_" with " _ ", every "|" with " | ", and every "-" with " - ", and then split the resulting string on a single space character. I'm not sure why you would want empty strings, though. You'll have to try this approach, and if it doesn't work, please post another question.
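The replace-then-split logic from the last comment might look like the sketch below. The function name tokenize and the sample string are assumptions, since the string in the comment thread is only partly legible. The sketch calls split() with no argument rather than splitting on a single space, so that two adjacent delimiters do not produce empty strings.

```python
def tokenize(text, delimiters=("_", "|", "-")):
    """Split text into words and delimiters, keeping the delimiters as tokens."""
    for d in delimiters:
        text = text.replace(d, " " + d + " ")  # pad each delimiter with spaces
    # split() with no argument collapses runs of whitespace, so no empty tokens.
    return text.split()

# Hypothetical sample string, reconstructed from the comment thread.
tokens = tokenize("sharper_task|man_venue-1_0_king")
print(tokens)
```

Splitting with split(' ') instead would return an empty string between any two delimiters that sit next to each other, as in "a_|b".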