Python: writing a for loop's output to an HTML file
I wrote a Python web scraper, as shown below:
from selenium import webdriver
from BeautifulSoup import BeautifulSoup
wd = webdriver.Firefox()
wd.get('http://www.nseindia.com/live_market/dynaContent/live_analysis/top_gainers_losers.htm?cat=G&utm_campaign=website&utm_source=sendgrid.com&utm_medium=email')
html_page = wd.page_source
wd.quit()
soup = BeautifulSoup(html_page)
table = soup.find("table", attrs = {"id":"topGainers"})
print "success"
#print table
for row in table.findAll('tr')[1:]:
    cols = row.findAll('td')
    #print cols
    #break
    some = [cols[0], cols[5], cols[6], cols[9]]
    #print some
    #break
    for td in some:
        if td.find(text = True):
            text = ''.join(td.find(text = True))
            print text + "|"
        else:
            continue
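Now I want my output (text) to be written to an HTML file in table format. How can I do this?

In the last if block, append to a string declared before the for loop instead of printing. Start the string with the opening table markup and header cells (something like "<table><tr><th>heading1</th><th>heading2</th>") and build it up in the bottom if block.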
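A minimal sketch of that suggestion, written for Python 3 and using hard-coded sample rows in place of the scraped `td` text. The cell values and the names `headers`, `scraped_rows`, and `table_html` are illustrative assumptions, not taken from the question:

```python
# Sample data standing in for the text pulled out of each scraped row (assumed values).
headers = ["heading1", "heading2", "heading3", "heading4"]
scraped_rows = [
    ["AAA", "101.5", "4.2", "12000"],
    ["BBB", "55.0", "3.9", "8000"],
]

# Declare the string before the loop, starting with the table tag and header row.
table_html = "<table>\n<tr>"
for name in headers:
    table_html += "<th>" + name + "</th>"
table_html += "</tr>\n"

# Inside the loop, append each cell to the string instead of printing it.
for row in scraped_rows:
    table_html += "<tr>"
    for text in row:
        if text:  # mirrors the `if td.find(text = True)` check: skip empty cells
            table_html += "<td>" + text + "</td>"
    table_html += "</tr>\n"

table_html += "</table>\n"
print(table_html)
```

In the real scraper, `text` would come from `''.join(td.find(text = True))` rather than from a prepared list.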
#Okay first if you want the table to have HEADERS above each column you should save the heading names in a list like so...
listofheaders=['header1','header2','header3']
#for each row in the table save the data included in the row in a list of lists something like this:
listofrows=[['a','b','c'],['a','b','c'],['a','b','c']]
#now create a string with the following:
htmlstuff='<!DOCTYPE html>\n<html>\n<head>\n<style>\ntable,th,td\n{\nborder:1px solid black;\nborder-collapse:collapse;\n}\nth,td\n{\npadding:5px;\n}\n</style>\n</head>\n\n<body>\n<table style="width:300px">\n<tr>\n '
#now you would add the COLUMN HEADERS to the string...
for header in listofheaders:
    htmlstuff=htmlstuff+'<th>'+str(header)+'</th>\n'
#close the header row that was opened at the end of the string above
htmlstuff+='</tr>\n'
#then you can populate the table row by row...
for row in listofrows:
    htmlstuff+=' <tr>\n'
    for item in row:
        htmlstuff=htmlstuff+' <td>'+str(item)+'</td>\n'
    htmlstuff+=' </tr>\n'
#finish off the html coding...
htmlstuff+='</table>\n</body>\n\n</html>'
#now create the html page and write the data...
with open('webpage.html','w') as f:
    f.write(htmlstuff)
import webbrowser
webbrowser.open('webpage.html')
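One case the string concatenation above does not cover is cell text that itself contains `<`, `>`, or `&`, which a browser would read as markup. A possible variant, assuming Python 3 (where `html.escape` is in the standard library), is sketched below. The helper name `rows_to_table` is hypothetical:

```python
from html import escape

def rows_to_table(headers, rows):
    """Return an HTML <table> string with every header and cell escaped."""
    # Build the header cells, escaping any markup characters in the names.
    head = "".join("<th>{}</th>".format(escape(str(h))) for h in headers)
    # Build one <tr> per data row, escaping each cell the same way.
    body = "\n".join(
        "<tr>" + "".join("<td>{}</td>".format(escape(str(c))) for c in row) + "</tr>"
        for row in rows
    )
    return "<table>\n<tr>{}</tr>\n{}\n</table>".format(head, body)

print(rows_to_table(["header1", "header2"], [["a", "b & c"], ["<d>", "e"]]))
```

The returned string can be placed between the `<body>` tags and written to `webpage.html` in the same way as `htmlstuff`.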