Python: find sequences of words

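I want to check whether a text contains sequences of words taken from a list of words:

word_list = "never", "not", "buy", "here", "again", "more", "hello", "not", "will", "table"

text = "I do will will not buy more here"

Expected output: will not buy more here

But not:

will will (repeated sequence)

will not (incomplete sequence)

I do (sequence with very small words)

My script: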

word_list = "never", "not", "buy", "here", "again", "more", "hello", "not", "will"
text = "I do will will not buy more here"

text = text.split(" ")

sequences = []
counter = 0
for words in text:
    for word in word_list:
        if word in text:
            sequences.append(word)
            counter =+ counter
        
            # to avoid meaningless sequences like (incomplete sequence): "will not", "I will", "more here"...
            sequences_two_words = []
            for sequence in sequences:
                if len(sequence) <= 2:
                    pass
                else:
                    sequences_two_words.append(sequence)
                
            # to avoid sequences like (repeated sequence): "will will"
            sequences_not_repeat = []
            for not_repeat in sequences_two_words:
                if not_repeat[0] == not_repeat[1]:
                    pass
                else:
                    sequences_not_repeat.append(not_repeat)

            # to avoid sequences like (sequence with very small words): "I do"
            sequences_not_little = []
            for little_len in sequences_not_repeat:
                if len(little_len[1]) <= 2:
                    pass
                else:
                    sequences_not_little.append(little_len)


    print(sequences_not_little)

word_list = "never", "not", "buy", "here", "again", "more", "hello", "not", "will", "table"

text = "I do will will not buy more here"

# split() with no argument also copes with repeated spaces
text_split = text.lower().split()

sequences = []
sequence = ()
prev = None

for word in text_split:
    if word in word_list:
        # len(word) > 2 skips very small words such as "I" or "do"
        # prev != word skips an immediate repeat such as "will will"
        if len(word) > 2 and prev != word:
            sequence += (word,)
    else:
        # a word outside word_list ends the current run;
        # runs of one or two words ("will not") are incomplete and dropped
        if len(sequence) > 2:
            sequences.append(sequence)
        # always reset, so words from separate runs are never joined
        sequence = ()

    prev = word

# keep the run that reaches the end of the text
if len(sequence) > 2:
    sequences.append(sequence)

print(sequences)  # [('will', 'not', 'buy', 'more', 'here')]
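
The same idea can be wrapped in a function so it is easier to reuse and test. This is only a sketch: the name find_sequences and the parameters min_words and min_word_len are made up for illustration, and the defaults simply mirror the thresholds used above (more than two words per sequence, more than two characters per word).

```python
def find_sequences(text, word_list, min_words=3, min_word_len=3):
    """Return runs of consecutive listed words found in text, as tuples."""
    allowed = set(word_list)  # set lookup; the duplicate "not" collapses here
    sequences = []
    current = []
    prev = None
    for word in text.lower().split():
        if word in allowed:
            # skip very short words and immediate repeats ("will will")
            if len(word) >= min_word_len and word != prev:
                current.append(word)
        else:
            # an unlisted word ends the run; keep it only if long enough
            if len(current) >= min_words:
                sequences.append(tuple(current))
            current = []
        prev = word
    # a run may extend to the end of the text
    if len(current) >= min_words:
        sequences.append(tuple(current))
    return sequences


word_list = ("never", "not", "buy", "here", "again", "more", "hello", "not", "will", "table")
print(find_sequences("I do will will not buy more here", word_list))
# [('will', 'not', 'buy', 'more', 'here')]
```

Because the run is reset on every unlisted word, a text such as "will x not y buy" yields no sequence at all instead of joining three separate words.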