Occurrences of a string in text in Python

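There are many posts about counting occurrences of a substring in Python, but I could not find anything about counting occurrences of a whole string (a standalone word) in a text.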

testSTR = "Suppose you have a large text and you are trying to find the specific occurences of some words"

# Suppose my search term is "a"; I would then expect the output of my program to be:
print(testSTR.myfunc("a"))
>>1
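because there is only one concrete reference to the string a in the entire input. count does not work, because it also counts substrings, so the output I get is: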

print(testSTR.count("a"))
>>5

Is this possible?

You can do this with collections.Counter after splitting the string on whitespace:

from collections import Counter
print(Counter(testSTR.split()))
The output looks like:

Counter({'you': 2, 'a': 1, 'and': 1, 'words': 1, 'text': 1, 'some': 1, 'the': 1, 'large': 1, 'to': 1, 'Suppose': 1, 'are': 1, 'have': 1, 'of': 1, 'specific': 1, 'trying': 1, 'find': 1, 'occurences': 1})
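To get the count for one specific word, index the Counter with that word, for example res['a'] where res = Counter(testSTR.split()).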

If the count needs to be case-insensitive, convert each word with upper or lower before counting:

res = Counter(i.lower() for i in testSTR.split())
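
As a minimal sketch of the lookup described above (written for Python 3; the helper name count_word and its ignore_case parameter are hypothetical and not part of the original answer), the two steps can be wrapped in one function. Indexing a Counter with a word it has never seen gives 0 instead of raising KeyError.

```python
from collections import Counter

testSTR = "Suppose you have a large text and you are trying to find the specific occurences of some words"

def count_word(text, word, ignore_case=False):
    """Count whitespace-separated tokens equal to `word` (hypothetical helper)."""
    tokens = text.split()
    if ignore_case:
        # Normalize both the tokens and the search term before comparing.
        tokens = [t.lower() for t in tokens]
        word = word.lower()
    # A Counter returns 0 for missing keys, so no KeyError handling is needed.
    return Counter(tokens)[word]

print(count_word(testSTR, "a"))                          # 1
print(count_word(testSTR, "you"))                        # 2
print(count_word(testSTR, "suppose", ignore_case=True))  # 1
```

Note that this only splits on whitespace, so a token such as "a," would not be counted as "a".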

If you care about punctuation, you should try the following:

# Lists have no .map() method, so strip punctuation with a list comprehension.
words = [s.strip(".!?:;,\"'") for s in testSTR.split()]
print("a" in words)
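
The snippet above only tests whether the word is present. A possible variant that returns a count instead is sketched below (Python 3; the function name is hypothetical, and using string.punctuation as the set of characters to strip is an assumption, not something the original answer specifies).

```python
import string

def count_word_ignoring_punctuation(text, word):
    """Count tokens equal to `word` after stripping surrounding punctuation."""
    # str.strip only removes leading and trailing characters,
    # so punctuation inside a token (e.g. "don't") is left alone.
    words = [token.strip(string.punctuation) for token in text.split()]
    return words.count(word)

sample = "a large text a, a. a"
print(count_word_ignoring_punctuation(sample, "a"))  # 4
```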

I think the most straightforward approach is to use a regular expression:

import re
testSTR = "Suppose you have a large text and you are trying to find the specific occurences of some words"

print(len(re.findall(r"\ba\b", testSTR)))
# 1
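\ba\b checks for a word boundary before and after a, where a word boundary is the position between a word character and a non-word character (such as punctuation or whitespace), or the start or end of the whole string. This is more useful than splitting on whitespace, unless of course splitting on whitespace is what you want.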

import re
str2 = "a large text a, a. a"

print(len(re.findall(r"\ba\b", str2)))
# 4
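
To search for an arbitrary word rather than the hard-coded a, the pattern can be built at run time. The following is a sketch under assumptions (Python 3; count_whole_word and ignore_case are hypothetical names). re.escape keeps any special characters in the search term from being interpreted as regex syntax. The \b anchors only behave as expected when the term starts and ends with a word character.

```python
import re

def count_whole_word(text, word, ignore_case=False):
    """Count occurrences of `word` delimited by word boundaries."""
    flags = re.IGNORECASE if ignore_case else 0
    # Escape the term so characters such as "." or "+" are matched literally.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags))

print(count_whole_word("a large text a, a. a", "a"))               # 4
print(count_whole_word("A cat and a dog", "a", ignore_case=True))  # 2
```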

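Can you show your myfunc? What do you mean by concrete? Your input contains many references to the string a; do you perhaps want to search for the word a?

I don't care about punctuation; I just want to find the number of times a occurs in the whole input.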