Multi-word concordance in a text with Python

python python-3.x

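I have a words file and I want to find the concordance of each word in a text, with up to 3 positions of context on the left and on the right.

Words file:

buy

time

glass

home

red

Text file:

After nearly a decade of selling groceries online, Amazon has failed to make a major dent on its own as consumers have shown a stubborn urge to buy items like fruits, vegetables and meat in person. Do you find yourself spending more time on your phone than you ... often idly pass the time by staring at your phone? The process can be fraught with anxiety, as many different glass styles are available, and points of view clash on what is proper and necessary

Script: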

def keywordsContext(file, fileName):
    #file: text file
    #fileName: words file

    with open(file, "r") as f, open(fileName, "r") as fi:

        corpus = f.read().split()
        pivot = fi.read().split()

        for keywords in pivot:
            if keywords in corpus:
                index = pivot.index(keywords)
                contexts = keywords+":", pivot[index-3:index], pivot[index+1:index+4]
                print(contexts)
            else:
                pass

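Output:

('buy:', [], ['time', 'glass', 'home'])
('time:', [], ['glass', 'home', 'red'])
('glass:', [], ['home', 'red'])
None

The output I want:

'buy' : stubborn urge to buy items like fruits,
'time' : yourself spending more time on your phone
'glass' : as many different glass styles are available,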

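EDIT

And... what if the same word appears more than once? I ran a test with one more sentence in the corpus that repeats the word "glass". I tried adding a while len(corpus) != 0, but it just loops forever and repeats the same output.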

def keywordsContext(file, fileName):

    with open(file, "r") as f, open(fileName, "r") as fi:

        corpus = f.read().split()
        pivot = fi.read().split()

        while len(corpus) != 0:

            for keywords in pivot:
                if keywords in corpus:
                    inde = corpus.index(keywords)
                    contexts = keywords+": "+ ' '.join(corpus[inde-3:inde+4])
                    print(contexts)


There were some mistakes with the list names. Try this:

def keywordsContext(file, fileName):
    # file: path to the text file (the corpus)
    # fileName: path to the words file (the keywords)

    with open(file, "r") as f, open(fileName, "r") as fi:

        corpus = f.read().split()
        pivot = fi.read().split()
        for keywords in pivot:
            if keywords in corpus:
                lst_index = 0  # position from which the next search starts
                for i in range(0, corpus.count(keywords)):
                    inde = corpus.index(keywords, lst_index)
                    # clamp the left bound: a negative start would make the slice come back empty
                    contexts = keywords + ": " + ' '.join(corpus[max(inde-3, 0):inde+4])
                    lst_index = inde + 1
                    print(contexts)

Edited: following the OP's edit, this program prints every occurrence of each word.
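The same idea can also be written as a single pass over the corpus with `enumerate`, which avoids calling `count` and `index` repeatedly. This is only a sketch: the function name `concordance` and its `width` parameter are hypothetical, and it takes lists of words instead of file names so that it can be tried without any files on disk.

```python
def concordance(corpus, keywords, width=3):
    """Return one 'keyword: context' line per occurrence of each keyword.

    corpus   -- list of tokens, e.g. text.split()
    keywords -- list of words to look up
    width    -- number of tokens kept on each side (hypothetical parameter)
    """
    lines = []
    for keyword in keywords:
        for i, token in enumerate(corpus):
            if token == keyword:
                # max() keeps the left bound from going negative, which would
                # otherwise change the meaning of the slice near the start
                left = max(i - width, 0)
                lines.append(keyword + ": " + " ".join(corpus[left:i + width + 1]))
    return lines


# Made-up sample text that repeats the word "glass"
text = ("consumers have shown a stubborn urge to buy items like fruits, "
        "as many different glass styles are available, and a glass can break")
for line in concordance(text.split(), ["buy", "glass", "red"]):
    print(line)
```

Returning the lines instead of printing them inside the function makes the result easier to reuse, for example to write it to a file.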


The first version, which prints only the first occurrence of each word:

def keywordsContext(file, fileName):

    with open(file, "r") as f, open(fileName, "r") as fi:

        corpus = f.read().split()
        pivot = fi.read().split()
        for keywords in pivot:
            if keywords in corpus:
                index = corpus.index(keywords)
                # clamp the left bound: a negative start would make the slice come back empty
                contexts = keywords+":", corpus[max(index-3, 0):index+4]
                print(contexts)


Output:

('buy:', ['stubborn', 'urge', 'to', 'buy', 'items', 'like', 'fruits,'])
('time:', ['yourself', 'spending', 'more', 'time', 'on', 'your', 'phone'])
('glass:', ['as', 'many', 'different', 'glass', 'styles', 'are', 'available,'])
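Note that the tokens keep their punctuation ('fruits,' and 'available,'), because `split()` only cuts on whitespace. A keyword such as "glass" would therefore not match a token like "glass," or "Glass". If that matters, one option is to normalize each token before comparing. The sketch below assumes that stripping leading and trailing punctuation and lowercasing is acceptable for the text at hand; the helper names `normalize` and `find_positions` are made up for illustration.

```python
import string


def normalize(token):
    # Drop punctuation at both ends of the token and ignore case
    return token.strip(string.punctuation).lower()


def find_positions(corpus, keyword):
    # Indices of every token that equals the keyword after normalization
    target = normalize(keyword)
    return [i for i, token in enumerate(corpus) if normalize(token) == target]


corpus = "Glass, glass and more glass.".split()
print(find_positions(corpus, "glass"))  # [0, 1, 4]
```

The positions can then be used to slice the original, unnormalized token list, so the printed context still shows the text as written.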


I edited my post, check it out.