Python: How do I save NLTK concordance results in a list?


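I am using NLTK to find a word in a text. I need to save the result of the concordance function in a list. The question has already been asked, but I do not see any changes. I tried to find the type of the function's return value with: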

type(text.concordance('myword'))

The result is:

<class 'NoneType'>

To use text concordance, you need to instantiate an NLTK Text() object and then call the concordance() method on that object:

import nltk.corpus
from nltk.text import Text
moby = Text(nltk.corpus.gutenberg.words('melville-moby_dick.txt'))

Here we instantiate a Text object on the text file melville-moby_dick.txt, and we can then call the method:

moby.concordance("monster")

If you get a NoneType here, it seems to be because you did not create any Text object, so your variable text is None.
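It may also be worth checking the return value directly: concordance() prints its matches and returns nothing, so its return type is NoneType even on a valid Text object. The following is a minimal sketch, assuming nltk is installed; the token list is made up for illustration (no corpus download is needed), and the printed output is captured in memory rather than in a file:

```python
import contextlib
import io

from nltk.text import Text

# Made-up tokens for illustration; any list of word strings would do.
tokens = "The whale surfaced . Then the whale dived again".split()
text = Text(tokens)

# Redirect stdout into an in-memory buffer while concordance() prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    returned = text.concordance("whale")  # prints matches, returns nothing

# Split the captured output into a list of lines (header line included).
captured = buffer.getvalue().splitlines()

print(type(returned))  # <class 'NoneType'>
print(captured)
```

The first captured line is a header rather than a match, so it would need to be skipped if only the matches are wanted.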


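By inspecting the source code, we can see that the results are printed to stdout. If capturing that output is not an option, you have to reimplement ConcordanceIndex.print_concordance so that it returns the results instead of printing them to stdout.

Code: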

def concordance(ci, word, width=75, lines=25):
    """
    Rewrite of nltk.text.ConcordanceIndex.print_concordance that returns results
    instead of printing them.

    See:
    http://www.nltk.org/api/nltk.html#nltk.text.ConcordanceIndex.print_concordance
    """
    half_width = (width - len(word) - 2) // 2  # characters shown on each side
    context = width // 4  # approx number of words of context

    results = []
    offsets = ci.offsets(word)
    if offsets:
        lines = min(lines, len(offsets))
        for i in offsets:
            if lines <= 0:
                break
            # Clamp the start index so a match near the beginning of the
            # text does not produce a negative (wrapped-around) slice.
            left = (' ' * half_width +
                    ' '.join(ci._tokens[max(0, i - context):i]))
            right = ' '.join(ci._tokens[i + 1:i + context])
            left = left[-half_width:]
            right = right[:half_width]
            results.append('%s %s %s' % (left, ci._tokens[i], right))
            lines -= 1

    return results

from nltk.book import text1
from nltk.text import ConcordanceIndex

ci = ConcordanceIndex(text1.tokens)
results = concordance(ci, 'circumstances')

print(type(results))  # <class 'list'>
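To see what the windowing arithmetic produces without downloading any corpus, here is a small pure-Python sketch of the same idea. The helper name kwic and the token list are hypothetical and not part of NLTK; matching is done by a plain case-insensitive comparison instead of a ConcordanceIndex:

```python
def kwic(tokens, word, width=75, lines=25):
    """Return keyword-in-context strings (hypothetical helper, not NLTK)."""
    half_width = (width - len(word) - 2) // 2  # characters on each side
    context = width // 4                       # approx. words of context
    results = []
    for i, tok in enumerate(tokens):
        if tok.lower() != word.lower():
            continue
        if len(results) >= lines:
            break
        left = ' '.join(tokens[max(0, i - context):i])
        right = ' '.join(tokens[i + 1:i + context])
        # Pad on the left so every keyword starts in the same column.
        left = (' ' * half_width + left)[-half_width:]
        right = right[:half_width]
        results.append('%s %s %s' % (left, tok, right))
    return results


# Made-up tokens for illustration.
tokens = "Call me Ishmael . Some years ago I went to sea . The sea was calm".split()
hits = kwic(tokens, "sea", width=31)

for hit in hits:
    print(repr(hit))
```

With width=31 each side is limited to 13 characters, so the first hit is 'ago I went to sea . The sea was'.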


The Text class now has a concordance_list function. For example:

from nltk.corpus import gutenberg
from nltk.text import Text

corpus = gutenberg.words('melville-moby_dick.txt')
text = Text(corpus)
con_list = text.concordance_list("monstrous")
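Each element of the returned list is a ConcordanceLine named tuple rather than a plain string. The sketch below, which assumes an NLTK release that provides concordance_list and uses a made-up token list instead of the Gutenberg corpus, shows one way to pull out the formatted lines and token offsets:

```python
from nltk.text import Text

# Made-up tokens for illustration; no corpus download is needed.
tokens = "A monstrous wave hit the ship , a truly monstrous sight".split()
text = Text(tokens)

con_list = text.concordance_list("monstrous", width=40, lines=10)

for entry in con_list:
    # entry.offset is the token index of the match,
    # entry.line is the formatted context string.
    print(entry.offset, repr(entry.line))

# A plain list of strings, if that is all that is needed.
plain_lines = [entry.line for entry in con_list]
```

The fields left and right hold the surrounding context as token lists, which may be more convenient than the formatted line for further processing.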

I saw the possible duplicate in this post, but I would prefer to avoid going through a file.

The concordance can only be captured through stdout; currently there is no way to save it, but there is a PR to do so. Is this a feature people would like to see in NLTK? If so, give us some love and we will see how far and how fast we can push it into a proper nltk function.

There is now a function that returns a list. Please see my answer.

I did not write the full code, but text is an NLTK Text object. I posted that line of code to check the return type of the concordance method.