Python: counting how many ngrams occur in each line of a text file
I have a file containing a list of ngrams, separated by newlines. It looks like this:
got to love
makes perfect sense
going to be
would have guessed
can not wait
more important than
I also have a text file containing several sentences, also separated by newlines. Say it looks like this:
I got to love you.
Hello world
Well boy
That makes perfect sense. I can not wait.
Hello
I want to go through each line and count how many of these ngrams occur in it in total. For the input above, my output would be:
1
0
0
2
0
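How would I go about doing this?

Answer: the core loop is below, with ngrams and sentences being lists of strings. The rest is up to you.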
for sentence in sentences:
    count = 0
    # Count every ngram that appears as a substring of this sentence.
    for ngram in ngrams:
        if ngram in sentence:
            count += 1
    print(count)
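As a minimal sketch, the loop above can be wrapped in a function and run on the sample data from the question. The function name `count_ngrams_per_line` is hypothetical, and the data is defined inline rather than read from files. Note that `in` performs a case-sensitive substring test, so each ngram contributes at most 1 per line, however many times it repeats within that line.

```python
def count_ngrams_per_line(sentences, ngrams):
    """Return, for each sentence, how many of the ngrams occur in it as substrings."""
    return [
        sum(1 for ngram in ngrams if ngram in sentence)
        for sentence in sentences
    ]


# Sample data copied from the question.
ngrams = [
    "got to love",
    "makes perfect sense",
    "going to be",
    "would have guessed",
    "can not wait",
    "more important than",
]
sentences = [
    "I got to love you.",
    "Hello world",
    "Well boy",
    "That makes perfect sense. I can not wait.",
    "Hello",
]

counts = count_ngrams_per_line(sentences, ngrams)
for count in counts:
    print(count)
```

On the sample data this prints 1, 0, 0, 2, 0, one number per line, which matches the output requested in the question.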
Comment (asker): I have tried this, but it returns all zeros:

def checkLine(line):
    count = 0
    with open("tr_response_trigrams_list.txt") as f:
        for ngram in f:
            if ngram in line:
                count += 1
    print(count)

for line in open("/Users/user/code/abstract/data/Training(3500)/3500_response_Tweets.txt", "r"):
    checkLine(line)
Comment: I don't understand how your program is supposed to work. Why is there a 2 for line 4?
Comment: Because that line contains both "makes perfect sense" and "can not wait".
Comment: It is always good to include your code in the question itself. It is more readable there, and others can help you better.
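A likely cause of the all-zero result in the attempt quoted in the comments: iterating over a file object yields each line with its trailing newline, so every ngram read that way ends in "\n" and can only match at the very end of a sentence line. The sketch below strips the ngrams once before matching. The helper names `load_lines` and `count_ngrams_in_file` are hypothetical, and UTF-8 encoding is an assumption about the files.

```python
def load_lines(path):
    """Read a text file and return its non-empty lines without surrounding whitespace."""
    with open(path, encoding="utf-8") as f:  # UTF-8 is an assumption
        # Skip blank lines: an empty string is a substring of every line
        # and would inflate all counts by one.
        return [line.strip() for line in f if line.strip()]


def count_ngrams_in_file(ngram_path, sentence_path):
    """Return one count per sentence line: the number of ngrams found in it."""
    ngrams = load_lines(ngram_path)  # read the ngram file once, not once per sentence
    counts = []
    with open(sentence_path, encoding="utf-8") as f:
        for line in f:
            counts.append(sum(1 for ngram in ngrams if ngram in line))
    return counts


# Why the original attempt fails: the unstripped ngram carries a newline.
line = "I got to love you.\n"
print("got to love\n" in line)          # False
print("got to love\n".strip() in line)  # True
```

Reading the ngram file once also avoids reopening it for every sentence, which the original `checkLine` did on each call.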