Python: an efficient way to check a list of strings to extract specific words


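I am trying to check a list of 20,000 strings against certain words/phrases so that each string is correctly sorted into one of 3 categories.

Here is a sample list of strings:

    sample = ["the empty bus behind me", "the facility is close", "my order was canceled", "no empty on site", "no bus for me to move"]

So I want to check whether a string has:

    "empty" and "bus" and "empty" then emptyCount += 1

    "order canceled" or "canceled" then cancelcount += 1

    "empty" or "site" or "no empty on site" then site += 1

I have code that does this, but I don't think it is efficient, and it may actually be missing some key cases. Any suggestions on how to go about this?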

    site = 0
    cancel = 0
    empty = 0
    count = 0
    for i in sample:
        if "empty" and "bus" and "empty" in i:
           emptycount += 1
        elif "order canceled" or "canceled":
           cancelcount += 1
        elif "empty" or "site" or "no empty on site" 
           site += 1

        else:
           count += 1
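
A note on why the loop above miscounts: in Python, `"empty" and "bus" and "empty" in i` evaluates to just `"empty" in i`, because non-empty string literals are always truthy, and `elif "order canceled" or "canceled":` is always true, so the later branches are never reached. The counter names also do not match the ones initialised above, and the last `elif` is missing its colon. The following is a minimal corrected sketch, not the asker's final code. It assumes the rules are meant as "first matching rule wins" in the order listed, and that the first rule means both "empty" and "bus" occur in the string; `categorize` is a hypothetical helper name introduced here.

```python
from collections import Counter

sample = ["the empty bus behind me", "the facility is close",
          "my order was canceled", "no empty on site",
          "no bus for me to move"]


def categorize(text):
    """Return the first category whose rule matches (hypothetical helper)."""
    # Each substring needs its own `in` test; `"a" and "b" in text` only tests "b".
    if "empty" in text and "bus" in text:
        return "empty"
    # "order canceled" is kept to mirror the question, although "canceled" covers it.
    if "order canceled" in text or "canceled" in text:
        return "cancel"
    if "empty" in text or "site" in text:
        return "site"
    return "other"


# One pass over the list; Counter tallies how many strings fall in each category.
counts = Counter(categorize(s) for s in sample)
print(counts)
```

Each string is scanned only a handful of times, so a single pass like this should be adequate for 20,000 strings.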

You don't even need to extract anything.

All you need to do is search and increment a counter.

sample = ["the empty bus behind me", "the facility is close", "my order was canceled", "no empty on site", "no bus for me to move"]

empty_counter = 0
for string_item in sample:
    if 'empty' in string_item:
        empty_counter += 1

print(empty_counter)
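
One caveat with the `in` test above is that it matches substrings, so a string containing "emptying" would also be counted. If whole words are what matters, a regular expression with word boundaries is one option. This is only a sketch: the extra string `"emptying the bin"` and the names `extra`, `empty_word`, `substring_count` and `word_count` are made up here for illustration and are not part of the question.

```python
import re

sample = ["the empty bus behind me", "the facility is close",
          "my order was canceled", "no empty on site",
          "no bus for me to move"]

# Hypothetical extra string, added only to show the difference.
extra = sample + ["emptying the bin"]

# Plain substring test: also counts "emptying".
substring_count = sum(1 for s in extra if "empty" in s)

# \b marks a word boundary, so only the standalone word "empty" matches.
empty_word = re.compile(r"\bempty\b")
word_count = sum(1 for s in extra if empty_word.search(s))

print(substring_count, word_count)
```

Compiling the pattern once outside the loop avoids re-parsing it for every string.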
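If efficiency is what you want, I suggest using pandas. It is a data-science package built to handle millions of records, so depending on the size of your data it may speed things up considerably.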

#import pandas package.
import pandas as pd

sample = ["the empty bus behind me", "the facility is close", "my order was canceled", "no empty on site", "no bus for me to move"]

# create a pandas series
sr = pd.Series(sample) 

# flag the strings that contain both words, in any order
results = sr.str.contains('empty', regex=False) & sr.str.contains('bus', regex=False)

# number of matching items (True counts as 1)
print(results.sum())
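
Whether a library is needed at all is easiest to settle by measuring. The sketch below times the plain-Python check with the standard `timeit` module on a list of 20,000 strings. The workload is an assumption: it simply repeats the five sample strings, and `big` and `count_empty_bus` are names introduced here for illustration. Real data will behave differently, so treat the timing as a rough indication only.

```python
import timeit

sample = ["the empty bus behind me", "the facility is close",
          "my order was canceled", "no empty on site",
          "no bus for me to move"]

# Hypothetical workload: repeat the sample to reach 20,000 strings.
big = sample * 4000


def count_empty_bus(strings):
    """Count the strings that contain both 'empty' and 'bus'."""
    return sum(1 for s in strings if "empty" in s and "bus" in s)


matches = count_empty_bus(big)

# Average over 10 passes to smooth out noise.
elapsed = timeit.timeit(lambda: count_empty_bus(big), number=10) / 10
print(f"{matches} matches, about {elapsed * 1000:.2f} ms per pass")
```

Running the same measurement against the pandas version on your own data shows whether the extra dependency pays off.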


Can you share the code you are currently using? "I have a code that does this" - please show it and explain how it is inefficient.

If you want it to be faster, you can use threads.

Okay, I'll edit the question and add my code now. Thanks.

You don't have to use count; if you want to know how many strings are in the list, just use len(sample).

Thanks, I know that. I was just wondering whether there is a way to do the search efficiently.

Thank you, that is another efficient way to search.