Unable to find words from a list in each line (Python)
My text file looks like this:
RT @Vevo The @5SoS world is turned upside down in Want You Back the first release in 2 years by the Aussie poppunk band
RT @Jeff__Benjamin Congratulations to @Stray_Kids for making their debut on @Billboard's Social 50 chart this week at No 45 Very promisi
RT @Vevo The @5SoS world is turned upside down in Want You Back the first release in 2 years by the Aussie poppunk band
RT @thesheetztweetz Fun fact @relativityspace CEO Tim Ellis shared
The song is by @bassnectar and when he saw this video he told Relat
RT @Jeff__Benjamin Congratulations to @Stray_Kids for making their debut on @Billboard's Social 50 chart this week at No 45 Very promisi
RT @fringeflowers What a Beautiful Way to Express Your Spirituality with this #SterlingSilver Fancy Vented Band with #Sanskrit The Langua
My code file looks like this:
import re

wordlist = ["soundigest", "vile", "paris", "carlyaquilino", "chrispolanco13", "bimbo's", "mcr", "jack", "lauren_hoggs", "siriusxm", "force", "7th", "muz4now", "christ", "orchestra", "100", "rampb", "gla"]
data = ""
counter = 0
with open("musicData.txt", "w") as fout:
    print("hi")
    with open("temp.txt") as fin:
        for line in fin:
            for term in line.split():
                term = term.lower()
                term = re.sub('[\n]+', ' ', term)
                # Remove not alphanumeric symbols white spaces
                term = re.sub(r'[^\w]', ' ', term)
                # Replace #word with word
                term = re.sub(r'#([^\s]+)', r'\1', term)
                # Remove :( or :)
                term = term.replace(':)', '')
                term = term.replace(':(', '')
                # trim
                term = term.strip('\'"')
                if term in wordlist:
                    data = data + term + ","
                    print(data)
            if data != "":
                fout.write(data)
                #print "data to write :", data
                fout.write("\n")
                data = ""
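My goal is to find, for each line, the array of words from the word list that the line contains. So if the first line contains two words from the word list, it should write an array such as [vile,paris] to the new file. I can't get my current code to do this.
I think you are making this more complicated than it needs to be. How about something like this: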
with open("temp.txt") as fin:
    for line in fin:
        for word in wordlist:
            if word in line:
                # or data = data + line + "," - depends what you want to get
                data = data + word + ","
                print(data)
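This will match "christopher" against "christ", because `word in line` is a substring test. Instead, you can compile the word list into a single regexp of the form r"\b(escaped1|escaped2)\b" and match it only on word boundaries.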
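A minimal sketch of the word-boundary idea from the comment above, in case it helps. The helper names `build_pattern`, `matches_per_line` and `write_matches` are made up for illustration. The case-insensitive matching and the comma-separated output format are assumptions about what the asker wants, not something stated in the question.

```python
import re


def build_pattern(wordlist):
    # Escape every word so characters such as the apostrophe in "bimbo's"
    # are treated literally, and try longer words first.
    escaped = sorted((re.escape(w) for w in wordlist), key=len, reverse=True)
    # \b ... \b restricts matches to whole words, so "christ" no longer
    # matches inside "christopher".
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def matches_per_line(lines, wordlist):
    # Return one list of matched words for every line that has a match.
    pattern = build_pattern(wordlist)
    result = []
    for line in lines:
        found = [m.lower() for m in pattern.findall(line)]
        if found:
            result.append(found)
    return result


def write_matches(in_path, out_path, wordlist):
    # Assumed file layout: one tweet per input line, one
    # comma-separated list of matched words per output line.
    with open(in_path) as fin, open(out_path, "w") as fout:
        for found in matches_per_line(fin, wordlist):
            fout.write(",".join(found) + "\n")


# Small in-memory example (the sample sentences are invented).
wordlist = ["vile", "paris", "christ", "jack"]
sample = [
    "A vile night in Paris",
    "christopher went home",
    "Jack and jack",
]
rows = matches_per_line(sample, wordlist)
print(rows)
```

With the files from the question this would be called as `write_matches("temp.txt", "musicData.txt", wordlist)`. Repeated words in a line are kept as duplicates here; wrap `found` in a set if only distinct words are wanted.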