Occurrences of a string in text in Python

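There are many posts about counting occurrences of a substring in Python, but I can't find anything about counting occurrences of a string as a whole word in a text.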

testSTR = "Suppose you have a large text and you are trying to find the specific occurences of some words"

# Suppose my search term is "a"; then I would expect the output of my program to be
# (myfunc is the hypothetical method I am looking for):
print(testSTR.myfunc("a"))
>>1

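This is because there is only one concrete reference to the string a in the whole input. count does not work because it also counts substrings inside other words (have, large, and, are), so the output I get is:

print(testSTR.count("a"))
>>5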

Is it possible to do this?

You can do this with collections.Counter after splitting the string on whitespace:

from collections import Counter
print(Counter(testSTR.split()))

The output looks like this:

Counter({'you': 2, 'a': 1, 'and': 1, 'words': 1, 'text': 1, 'some': 1, 'the': 1, 'large': 1, 'to': 1, 'Suppose': 1, 'are': 1, 'have': 1, 'of': 1, 'specific': 1, 'trying': 1, 'find': 1, 'occurences': 1})

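To get the count for one specific word, look it up in the Counter by key, for example Counter(testSTR.split())["a"].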
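A minimal sketch of that lookup, assuming the same testSTR as in the question (the name word_counts is only illustrative and does not come from the original answer):

```python
from collections import Counter

testSTR = "Suppose you have a large text and you are trying to find the specific occurences of some words"

# Count whitespace-separated tokens, then look up individual words.
# word_counts is an illustrative name, not part of the original answer.
word_counts = Counter(testSTR.split())

print(word_counts["a"])      # 1
print(word_counts["you"])    # 2
print(word_counts["zebra"])  # 0, a Counter returns zero for missing keys
```

Because a Counter returns 0 for keys it has never seen, no membership check is needed before the lookup.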

If the count needs to be case-insensitive, convert each word with upper or lower before counting:

res = Counter(i.lower() for i in testSTR.split())

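If you care about punctuation, you should try the following:

# Strip common punctuation from both ends of every whitespace-separated token.
words = [s.strip(".!?:;,\"'") for s in testSTR.split()]
# True if "a" occurs as a whole word; use words.count("a") for the number of occurrences.
print("a" in words)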
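The same idea can be turned into a count rather than a membership test. This is only a sketch: count_word is a hypothetical helper name, and it assumes that stripping ASCII punctuation (string.punctuation) from both ends of each token is good enough for the text at hand.

```python
import string

def count_word(text, word):
    """Count tokens equal to `word` after stripping surrounding punctuation.

    count_word is a hypothetical helper, not part of the original answer.
    """
    # Split on whitespace, then remove leading/trailing ASCII punctuation.
    tokens = [token.strip(string.punctuation) for token in text.split()]
    return tokens.count(word)

print(count_word("a large text a, a. a", "a"))  # 4
```

Punctuation inside a token (for example the apostrophe in "don't") is left alone, since strip only touches the ends.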

I think the most straightforward way is to use a regular expression:

import re
testSTR = "Suppose you have a large text and you are trying to find the specific occurences of some words"

print(len(re.findall(r"\ba\b", testSTR)))
# 1

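\ba\b requires a word boundary before and after a. A word boundary is the position between a word character (letter, digit or underscore) and anything else, such as punctuation, whitespace, or the start or end of the string. This is more useful than splitting on whitespace, unless of course splitting on whitespace is exactly what you want.

import re
str2 = "a large text a, a. a"

print(len(re.findall(r"\ba\b", str2)))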
# 4
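The regex approach can be wrapped in a small reusable function. The sketch below is an assumption-laden illustration, not the answerer's code: count_whole_word and its ignore_case parameter are hypothetical names, and re.escape is added so that search terms containing regex metacharacters are matched literally.

```python
import re

def count_whole_word(text, word, ignore_case=False):
    """Count occurrences of `word` delimited by word boundaries.

    count_whole_word and ignore_case are hypothetical names for illustration.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes characters such as "." or "+" in the term literal.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags))

print(count_whole_word("a large text a, a. a", "a"))               # 4
print(count_whole_word("A cat and a dog", "a"))                    # 1
print(count_whole_word("A cat and a dog", "a", ignore_case=True))  # 2
```

Note that \b only behaves as expected when the search term starts and ends with a word character; a term such as "c++" would need different delimiters.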

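Can you show your myfunc? What do you mean by concrete? There are many references to the string a in your input; do you perhaps want to search for the word a?

I don't care about punctuation, I just want to find the number of occurrences of a in the whole text.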
