Python: an efficient way to check a list of strings for specific words
I'm trying to check a list of 20,000 strings against certain words/phrases so that I can correctly sort them into 3 categories. Here is a sample list of the strings:
sample = ["the empty bus behind me", "the facility is close", "my order was canceled", "no empty on site", "no bus for me to move"]
So I want to check whether each string contains:
"empty" and "bus" and "empty" then emptyCount += 1
"order canceled" or "canceled" then cancelcount += 1
"empty" or "site" or "no empty on site" then site += 1
I have code that does this, but I don't think it is very efficient, and it may actually be missing some key cases. Any suggestions on how to go about this?
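site = 0
cancel = 0
empty = 0
count = 0
for i in sample:
    # Each keyword needs its own `in` test; `"empty" and "bus" in i` would only test "bus".
    if "empty" in i and "bus" in i:
        empty += 1
    elif "order canceled" in i or "canceled" in i:
        cancel += 1
    elif "empty" in i or "site" in i or "no empty on site" in i:
        site += 1
    else:
        # strings that match none of the three categories
        count += 1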
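One pitfall worth spelling out when combining keyword checks: in Python, `"empty" and "bus" in s` does not test both words. The short sketch below illustrates the difference; the variable names (`naive`, `explicit`, `with_all`, `required`) are made up for illustration.

```python
s = "no bus for me to move"

# `"empty" and "bus" in s` parses as `"empty" and ("bus" in s)`.
# A non-empty string literal is always truthy, so only "bus" is actually tested.
naive = bool("empty" and "bus" in s)

# Testing each keyword separately gives the intended result.
explicit = "empty" in s and "bus" in s

# all() reads better when there are many required keywords.
required = ("empty", "bus")
with_all = all(word in s for word in required)

print(naive, explicit, with_all)
```

The same applies to `or`: a bare string such as `"canceled"` used as a condition is always true, so every keyword needs its own `in` test.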
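You don't even need to extract anything. All you need to do is search and increment a counter.
sample = ["the empty bus behind me", "the facility is close", "my order was canceled", "no empty on site", "no bus for me to move"]
empty_counter = 0
for string_item in sample:
    # substring test: True if 'empty' appears anywhere in the string
    if 'empty' in string_item:
        empty_counter += 1
print(empty_counter)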
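The same search-and-increment idea can be extended to all three categories in a single pass. This is only a minimal sketch: it assumes the first matching rule wins, in the order the rules appear in the question, and the `classify` helper and the category labels are hypothetical names, not part of the original post.

```python
from collections import Counter

sample = ["the empty bus behind me", "the facility is close", "my order was canceled",
          "no empty on site", "no bus for me to move"]

def classify(text):
    """Return a category label for one string (first matching rule wins)."""
    if "empty" in text and "bus" in text:
        return "empty"
    if "canceled" in text:  # also covers "order canceled"
        return "cancel"
    if "empty" in text or "site" in text:  # also covers "no empty on site"
        return "site"
    return "other"

# Counter tallies the labels in one pass over the list.
counts = Counter(classify(s) for s in sample)
print(counts)
```

Keeping the rules in one function makes the priority order explicit, which is where the "missing some key cases" concern from the question is most likely to show up.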
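If efficiency is what you're after, I suggest using pandas. It is a data science package designed to process millions of records quickly, and depending on the size of your data it can be up to 100x faster.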
# import the pandas package
import pandas as pd
sample = ["the empty bus behind me", "the facility is close", "my order was canceled", "no empty on site", "no bus for me to move"]
# create a pandas Series
sr = pd.Series(sample)
# flag strings containing both words; regex has no '&' operator, so combine two boolean masks
results = sr.str.contains('empty', regex=False) & sr.str.contains('bus', regex=False)
# total number of matching items (each True counts as 1)
print(results.sum())
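If you go the pandas route, all three categories can be labelled in one vectorized step. The sketch below is an assumption about how the rules might be combined, not something from the original answer: it uses `numpy.select`, which takes the first condition that is true for each row, and the mask names and the `"other"` label are made up for illustration.

```python
import numpy as np
import pandas as pd

sample = ["the empty bus behind me", "the facility is close", "my order was canceled",
          "no empty on site", "no bus for me to move"]
sr = pd.Series(sample)

# One boolean mask per keyword (plain substring search, no regex).
has_empty = sr.str.contains("empty", regex=False)
has_bus = sr.str.contains("bus", regex=False)
has_cancel = sr.str.contains("canceled", regex=False)
has_site = sr.str.contains("site", regex=False)

# np.select picks the first condition that is True for each row.
labels = np.select(
    [has_empty & has_bus, has_cancel, has_empty | has_site],
    ["empty", "cancel", "site"],
    default="other",
)

category_counts = pd.Series(labels).value_counts()
print(category_counts)
```

Whether this beats a plain loop on 20,000 strings depends on the data and the pandas version, so it is worth timing both before committing to one.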
Can you share the code you are currently using? "I have code that does this": please show it and explain how it is inefficient.
If you want to make it faster, you can use threads.
Okay, I'll edit the question and add my code now. Thanks.
You don't have to use count; if you want to know how many strings are in the list, just use len(sample).
Thanks, I know that. I was just wondering whether there is a way to do the search efficiently.
Thank you, that's another efficient way to search.