Python: How can I change this ngrams_practice function to return any n-gram, not just bigrams? (tags: python, nltk, n-gram)


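When n=2, I need the output shown below. I could not get this function to return it until I used `bigrams` in my code. I need it to work the same way for any value of n, so not just bigrams but also trigrams and so on. As written, it only works when n=2. Any suggestions?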

import nltk
from nltk.tokenize import word_tokenize
from nltk.util import ngrams 
from nltk.lm.preprocessing import pad_both_ends
from nltk.util import bigrams

input1 = [['A', 'B', 'C', 'D', 'E'],
      ['D', 'E', 'C', 'D', 'E'],
      ['A', 'C', 'D', 'D']]

def ngrams_practice(n, input1):    
    test_ngrams = []
    for i in range(len(input1)-n+1):
        test_ngrams2 = list(bigrams(pad_both_ends(input1[i], n)))
        test_ngrams.append(test_ngrams2)
    return test_ngrams

ngrams_practice(2,input1)
Output:
[[('<s>', 'A'), ('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), ('E', '</s>')],
 [('<s>', 'D'), ('D', 'E'), ('E', 'C'), ('C', 'D'), ('D', 'E'), ('E', '</s>')]]
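One possible direction, offered as a sketch rather than a definitive answer: the function is tied to n=2 because it calls `bigrams`, which always yields pairs. With NLTK the usual replacement is `ngrams(pad_both_ends(sentence, n=n), n)`, where `ngrams` is already imported in the question. The version below reproduces that behaviour in plain Python so it runs without NLTK installed; `pad_both_ends_sketch` is a hypothetical helper name, not an NLTK function. It also assumes every sentence should be processed, so it loops over the sentences directly instead of `range(len(input1)-n+1)`, which is why the output above contains only two of the three sentences.

```python
def pad_both_ends_sketch(tokens, n, left="<s>", right="</s>"):
    """Hypothetical stand-in for nltk's pad_both_ends: add n-1 boundary symbols per side."""
    return [left] * (n - 1) + list(tokens) + [right] * (n - 1)


def ngrams_practice(n, sentences):
    """Return, for each sentence, its padded n-grams as a list of tuples."""
    result = []
    for sentence in sentences:  # assumption: every sentence is wanted
        padded = pad_both_ends_sketch(sentence, n)
        # Slide a window of width n across the padded tokens.
        grams = [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]
        result.append(grams)
    return result


input1 = [['A', 'B', 'C', 'D', 'E'],
          ['D', 'E', 'C', 'D', 'E'],
          ['A', 'C', 'D', 'D']]

bigram_rows = ngrams_practice(2, input1)
trigram_rows = ngrams_practice(3, input1)
print(trigram_rows[0])
```

With n=2 the first two rows match the output shown in the question, and a third row appears for the last sentence. With n=3 each sentence is padded with two `<s>` and two `</s>` symbols before the window is applied.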