Unable to find the word in each line (Python)


My text file looks like this:

RT @Vevo The @5SoS world is turned upside down in Want You Back the first release in 2 years by the Aussie poppunk band 
RT @Jeff__Benjamin Congratulations to @Stray_Kids for making their debut on @Billboard's Social 50 chart this week at No 45 Very promisi
RT @Vevo The @5SoS world is turned upside down in Want You Back the first release in 2 years by the Aussie poppunk band 
RT @thesheetztweetz Fun fact @relativityspace CEO Tim Ellis shared
The song is by @bassnectar and when he saw this video he told Relat
RT @Jeff__Benjamin Congratulations to @Stray_Kids for making their debut on @Billboard's Social 50 chart this week at No 45 Very promisi
RT @fringeflowers What a Beautiful Way to Express Your Spirituality with this #SterlingSilver Fancy Vented Band with #Sanskrit The Langua

My code file looks like this:

import re

wordlist = ["soundigest","vile","paris" ,"carlyaquilino","chrispolanco13","bimbo's","mcr","jack","lauren_hoggs","siriusxm","force","7th","muz4now","christ","orchestra","100","rampb","gla"]

data = ""
counter=0
with open("musicData.txt","w") as fout:
    print "hi"
    with open("temp.txt") as fin:
        for line in fin:
            for term in line.split():
                term = term.lower()
                term= re.sub('[\n]+', ' ', term)
                # Remove not alphanumeric symbols white spaces
                term = re.sub(r'[^\w]', ' ', term)
                # Replace #word with word
                term = re.sub(r'#([^\s]+)', r'\1', term)
                # Remove :( or :)
                term = term.replace(':)', '')
                term = term.replace(':(', '')
                # trim
                term = term.strip('\'"')
                if term in wordlist:
                    data = data + term + ","
                    print data
            if data !="":
                fout.write(data)
                #print "data to write :", data
                fout.write("\n") 
            data =""

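My goal is to find, for each line, the array of words from the word list that the line contains. So if the first line contains two words from the word list, an array such as [vile,paris] should be written to the new file. I can't get this to work with my current code.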

I think you are making this more complicated than it needs to be.

What if you do it like this?

with open("temp.txt") as fin:
    for line in fin:
        for word in wordlist:
            # plain substring test: true if word occurs anywhere in the line
            if word in line:
                # or data = data + line + "," - depends what you want to get
                data = data + word + ","
                print(data)

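This will match "christ" inside "christopher", though. You can instead compile the word list into a single regexp of the form

r"\b(escaped1|escaped2)\b"

and match with that regexp only on word boundaries.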
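The word-boundary idea could be sketched roughly as follows. This is only an illustration, not the original poster's code. It assumes Python 3 (the question uses Python 2 print statements), a shortened word list, and two hypothetical helper names, words_in_line and filter_file. The bracketed output format follows the [vile,paris] example from the question.

```python
import re

# Shortened word list for illustration; the question's full list works the same way.
wordlist = ["vile", "paris", "christ", "bimbo's", "100"]

# Escape each word so special characters are matched literally, and put
# longer words first so a longer entry is tried before a shorter prefix.
alternatives = "|".join(
    re.escape(w) for w in sorted(wordlist, key=len, reverse=True)
)
pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def words_in_line(line):
    """Return the word-list entries found in line, lower-cased, in order of appearance."""
    return [m.lower() for m in pattern.findall(line)]


def filter_file(in_path, out_path):
    """Write one bracketed, comma-separated list per input line that has a match."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            found = words_in_line(line)
            if found:
                fout.write("[" + ",".join(found) + "]\n")


# "Christopher" is not reported, because "christ" is followed by a word character.
print(words_in_line("A vile night in Paris with Christopher"))
```

With the file names from the question, the call would be filter_file("temp.txt", "musicData.txt"). Note that \b only behaves as expected here because every entry in the list starts and ends with a letter or digit.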