Python: loop through a list of URLs, run BeautifulSoup, write to file
I have a list of URLs that I want to run through, clean up with BeautifulSoup, and save to .txt files.
import urllib.request  # "import urllib" alone does not reliably expose urllib.request
from bs4 import BeautifulSoup

x = ["https://www.sec.gov/Archives/edgar/data/1000298/0001047469-13-002555.txt",
     "https://www.sec.gov/Archives/edgar/data/1001082/0001104659-13-011967.txt"]

for url in x:
    # I want to open the URL listed in my list
    fp = urllib.request.urlopen(url)
    test = fp.read()
    soup = BeautifulSoup(test, "lxml")
    output = soup.get_text()
    # and then save the get_text() results to a unique file.
    file = open("url.txt", "w", encoding='utf-8')
    file.write(output)
    file.close()
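The code above is what I have now. The list only has a couple of items; the real list will come from a txt file and contain many more, but I am keeping it simple for now.
The loop works, but the output of both URLs goes to the same url.txt file. I would like each item in the list to be written to its own unique .txt file.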
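Thanks for looking. Best, George

Create a different file name for each item in the list, for example by numbering the files with the loop index: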
import urllib.request  # "import urllib" alone does not reliably expose urllib.request
from bs4 import BeautifulSoup

x = ["https://www.sec.gov/Archives/edgar/data/1000298/0001047469-13-002555.txt",
     "https://www.sec.gov/Archives/edgar/data/1001082/0001104659-13-011967.txt"]

# enumerate() yields (0, first_url), (1, second_url), ...
for index, url in enumerate(x):
    # Open the URL listed in the list
    fp = urllib.request.urlopen(url)
    test = fp.read()
    soup = BeautifulSoup(test, "lxml")
    output = soup.get_text()
    # Save the get_text() results to a unique file: url0.txt, url1.txt, ...
    with open("url%s.txt" % index, "w", encoding='utf-8') as file:
        file.write(output)
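
If numbered names such as url0.txt are hard to map back to their source, one variation is to derive the file name from the URL itself. The following is only a sketch: the helper name filename_from_url is hypothetical (it is not part of the original answer), and it assumes the last path segment of each URL is unique within the list.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def filename_from_url(url, index):
    """Hypothetical helper: name the output file after the URL's last path segment."""
    # Take the last path segment and drop its extension,
    # e.g. ".../0001047469-13-002555.txt" gives "0001047469-13-002555".
    stem = PurePosixPath(urlparse(url).path).stem
    if not stem:
        # Assumption: fall back to an index-based name when the URL has no usable segment.
        stem = f"url{index}"
    return f"{stem}.txt"


x = ["https://www.sec.gov/Archives/edgar/data/1000298/0001047469-13-002555.txt",
     "https://www.sec.gov/Archives/edgar/data/1001082/0001104659-13-011967.txt"]

names = [filename_from_url(url, index) for index, url in enumerate(x)]
print(names)
```

Inside the loop, the result would replace the "url%s.txt" % index expression passed to open(). If two URLs could share the same last segment, combining the index and the stem would keep the names unique.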
Could you do "for i, url in enumerate(x)" and use i to build the file name?
That is how I would do it! The explanation below is good.
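
For reference, a fixed file name is a problem because mode "w" truncates the file each time it is opened, so only the last write survives. The small demonstration below uses a temporary directory and placeholder strings (both are illustrative assumptions, not the SEC pages from the question) so it runs without network access.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "url.txt")
    for text in ["first page", "second page"]:
        # Same pattern as the question: one fixed name, reopened with "w" each iteration.
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
    # Read the file back to see what is left after both writes.
    with open(path, encoding="utf-8") as f:
        remaining = f.read()

print(remaining)  # prints "second page"
```

Giving every iteration its own file name, as in the answer, avoids the overwrite without changing the open mode.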