Remove phrases from a file based on another file (Python)
How can I do this in Python?
badphrases.txt contains
Go away
Don't do that
Stop it
I don't know why you do that. Go away.
I was wondering what you were doing.
You seem nice
allphrases.txt contains
Go away
Don't do that
Stop it
I don't know why you do that. Go away.
I was wondering what you were doing.
You seem nice
I want every line that matches a phrase in badphrases.txt removed from allphrases.txt.
This is trivial in bash:
cat badfiles.txt | while read b
do
    cat allphrases.txt | grep -v "$b" > tmp
    cat tmp > allphrases.txt
done
Oh, so you assume I haven't looked or tried anything. I searched for more than an hour.
Here is my code:
# Files
ttv = "/tmp/tv.dat"
tmp = "/tmp/tempfile"
bad = "/tmp/badshows"
The bad file already exists… code here creates ttv
# Function grep_v
def grep_v(f,str):
    file = open(f, "r")
    for line in file:
        if line in str:
            return True
    return False
t = open(tmp, 'w')
tfile = open(ttv, "r")
for line in tfile:
    if not grep_v(bad,line):
        t.write(line)
tfile.close
t.close
os.rename(tmp, ttv)
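One likely reason an attempt like this removes fewer lines than expected: lines read from a file keep their trailing newline, so a bad phrase only matches when it sits at the very end of the other line. A minimal sketch of the effect, using made-up sample strings rather than the real files:

```python
# Lines read by iterating over a file keep their trailing newline.
bad_line = "Go away\n"        # as it would be read from the bad-phrases file
show_line = "Go away now\n"   # as it would be read from the file being filtered

# "Go away\n" is not a substring here, because "Go away" is followed by " now".
raw_match = bad_line in show_line

# After stripping the newline, the substring test behaves like grep.
stripped_match = bad_line.strip() in show_line

print(raw_match, stripped_match)
```

Note also that `tfile.close` and `t.close` without parentheses only reference the method and do not call it, so the temporary file may not be fully flushed when it is renamed.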
First, google how to read a file in Python. You will probably find something like the following. Use it to read both files into lists:
with open('badphrases.txt') as f:
    content = f.readlines()
badphrases = [x.strip() for x in content]
with open('allphrases.txt') as f:
    content = f.readlines()
allphrases = [x.strip() for x in content]
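Now the contents of both files are in lists.
Iterate over allphrases and check whether any phrase from badphrases occurs in each line.
At this point, you might consider googling:
- how to iterate over a Python list
- how to check whether a string occurs inside another string in Python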
for line in allphrases:
    flag = True
    for badphrase in badphrases:
        if badphrase in line:
            flag = False
            break
    if flag:
        print(line)
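The flag-and-break loop above can also be written with the built-in `any()`. The following is a minimal sketch of that alternative, not the answer's own code. The function name `keep_clean` and the sample lists are made up for illustration.

```python
def keep_clean(allphrases, badphrases):
    """Return the lines that contain none of the bad phrases (substring match)."""
    return [line for line in allphrases
            if not any(bad in line for bad in badphrases)]


# Hypothetical sample data, loosely based on the phrases in the question.
sample_all = ["Go away", "I don't know why you do that. Go away.", "You seem nice"]
sample_bad = ["Go away", "Stop it"]

print(keep_clean(sample_all, sample_bad))
```

Like the loop it mirrors, this is a substring test, so a blank entry in the bad list would match every line.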
If you understand this code, you will notice that you need to replace the print with writing to a file:
- now google how to print to a file in Python
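Putting the steps together, one possible sketch of "write to a file instead of printing" looks like this. The function name `remove_bad_lines` and the `.tmp` suffix are assumptions made for illustration. The write-then-rename step follows the temp-file approach already used in the question.

```python
import os


def remove_bad_lines(all_path, bad_path):
    """Rewrite all_path without the lines that contain any phrase from bad_path."""
    with open(bad_path) as f:
        # Skip blank lines: an empty string is a substring of every line.
        badphrases = [x.strip() for x in f if x.strip()]
    with open(all_path) as f:
        kept = [line for line in f
                if not any(bad in line for bad in badphrases)]
    tmp_path = all_path + ".tmp"  # hypothetical temporary file name
    with open(tmp_path, "w") as out:
        out.writelines(kept)      # kept lines still carry their newlines
    os.replace(tmp_path, all_path)  # swap the filtered file into place
```

Calling `remove_bad_lines('allphrases.txt', 'badphrases.txt')` would then filter the file in place. The `with` blocks close each file before the rename, so nothing is left unflushed.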
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import feedparser, os, re
# Files
h = os.environ['HOME']
ttv = h + "/WEB/Shows/tv.dat"
old = h + "/WEB/Shows/old.dat"
moo = h + "/WEB/Shows/moo.dat"
tmp = h + "/WEB/Shows/tempfile"
bad = h + "/WEB/Shows/badshows"
# Function not_present
def not_present(f, text):
    # Return True if text occurs in no line of file f
    with open(f, "r") as file:
        for line in file:
            if text in line:
                return False
    return True
# Sources (shortened)
sources = ['http://predb.me/?cats=tv&rss=1']
# Grab all the feeds and put them into ttv and old
k = open(old, 'a')
f = open(ttv, 'a')
for url in sources:
    d = feedparser.parse(url)
    for post in d.entries:
        if not_present(old, post.link):
            f.write(post.title + "|" + post.link + "\n")
            k.write(post.title + "|" + post.link + "\n")
# close() must be called so buffered writes are flushed before ttv is read again
f.close()
k.close()
# Remove shows without [Ss][0-9] and put them in moo
m = open(moo, 'a')
t = open(tmp, 'w')
file = open(ttv, "r")
for line in file:
    if re.search(r's[0-9]', line, re.I) is None:
        m.write(line)
        # print("moo", line)
    else:
        t.write(line)
        # print("tmp", line)
file.close()
t.close()
m.close()
os.rename(tmp, ttv)
# Remove badshows
t = open(tmp, 'w')
with open(bad) as f:
    content = f.readlines()
bap = [x.strip() for x in content]
with open(ttv) as f:
    content = f.readlines()
all_lines = [x.strip() for x in content]
for line in all_lines:
    flag = True
    for b in bap:
        if b in line:
            flag = False
            break
    if flag:
        t.write(line + "\n")
# Flush and close the temp file before renaming it over ttv
t.close()
os.rename(tmp, ttv)
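You might as well tell the user to google "how to replace lines in a file with python". There are probably 100 ways to do this. He/she is obviously trying to learn Python, so give some basic hints.

Sometimes it is much simpler to do things in another language. Whatever the solution in Python is, it is certainly far more complicated than it needs to be. I don't understand why Python is popular.

@W.Hunk What if the badfiles.txt file were actually served through an API? Your problem is very simple, so a shell script solves it better and faster. Python can be used to write very complex web applications; can you do that in shell? (Maybe, but not as safely or easily as in Python.) Application problems are not solved by Python/C/C++/Shell/Perl/Java as such. If you know the basics of each of these languages, you can easily choose which one to use case by case, and you will have a good reason for the choice. Hope this helps.

See, Python isn't so bad after all. Also, you can improve the not_present function. Currently it reads the file on every call. Read the file once and store it in a list. When the function is called, check against that list.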
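The last suggestion could be sketched as follows. This version swaps the suggested list for a set and matches the link exactly rather than as a substring, which is a deliberate change from not_present. The helper name `load_seen_links` is hypothetical, and the code assumes each line has the `title|link` layout that the script writes.

```python
def load_seen_links(path):
    """Read the 'title|link' file once and return the set of links in it."""
    seen = set()
    try:
        with open(path) as f:
            for line in f:
                # Assumes the link is everything after the last "|" on the line.
                seen.add(line.rstrip("\n").rpartition("|")[2])
    except FileNotFoundError:
        pass  # no history file yet, so nothing has been seen
    return seen
```

In the feed loop, the check would then become `if post.link not in seen:`, followed by `seen.add(post.link)` after writing. That also catches duplicates within a single run, which re-reading the file cannot do while the appended lines are still sitting in a write buffer.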