Python creates a CSV, but it is empty
I finally got it to create the CSV file, but for some reason it writes the header and never fills in the data.
import requests
from bs4 import BeautifulSoup
import csv
url = "http://www.scsotx.org/jail-booking"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html5lib')
inmate_data =[]
table = soup.find('div', attrs = {'class':'sex-offender-info'})
for row in table.findAll('div', attrs = {'class':'jail-content'}):
    jaildata = {}
    jaildata['Name'] = row.h4.text
    jaildata['Agency'] = row.p.text
    inmate_data.append(jail-content)
with open('C:\\Users\Cale\Desktop\jail\inmate_data.csv', 'w') as f:
    w = csv.DictWriter(f,['Name','Agency'])
    w.writeheader()
    for jaildata in inmate_data:
        w.writerow(jaildata)
It should parse the HTML data and then append it to the CSV file.
for row in table.findAll('div', attrs = {'class':'jail-content'}):
    jaildata = {}
    jaildata['Name'] = row.h4.text
    jaildata['Agency'] = row.p.text
    inmate_data.append(jail-content)
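If you look at this block, the name jail-content on the last line is never declared. Python reads it as jail minus content, and neither name exists. I assume you meant to use jaildata?
Here is your code: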
for row in table.findAll("figcaption", attrs={"class": "jail-content"}):
    jaildata = {}
    jaildata["Name"] = row.h4.text
    jaildata["Agency"] = row.p.text
    inmate_data.append(jaildata)
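A minimal sketch of the symptom, using only the standard library: if the loop matches nothing, inmate_data stays empty, so DictWriter writes the header and no rows, and the misspelled name is never evaluated. The helper write_rows and the sample record are hypothetical, made up for illustration.

```python
import csv
import io


def write_rows(rows):
    """Write rows with the question's DictWriter setup and return the CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["Name", "Agency"])
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# An empty list reproduces the symptom: a header line and nothing else.
header_only = write_rows([])

# One hypothetical record, appended as the dict itself, produces a data line.
with_data = write_rows([{"Name": "Jane Doe", "Agency": "Example PD"}])

# jail-content is the subtraction "jail - content"; it raises NameError
# only when the line actually runs, which an empty loop never does.
try:
    jail-content
    name_error_raised = False
except NameError:
    name_error_raised = True
```

When writing to a real file, the csv module documentation recommends opening it with newline='' so that no extra blank lines appear between rows on Windows.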
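The data you are looking for is inside a <figcaption> rather than a <div>, and there was also a typo when appending: jail-content instead of jaildata.
Although I would strongly recommend using selenium for this kind of task, here is something you can use to improve your scraping: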
import requests
import pandas as pd
from bs4 import BeautifulSoup


class ScrapeJail:
    def __init__(self, url: str = "http://www.scsotx.org/jail-booking"):
        self.url = url

    def get_table(self):
        response = requests.get(self.url)
        soup = BeautifulSoup(response.content, 'html5lib')
        raw_data = soup.find('div', attrs={'class': 'sex-off-box'})
        data_contents = raw_data.findAll('div', attrs={'class': "medium-4 small-6 columns"})
        data = []
        # Each matching div holds one person's record.
        for content in data_contents:
            person_data_tags = content.findAll('div', attrs={'class': "sex-offender-info"})
            person_data_jail = person_data_tags[0].findAll(attrs={'class': "jail-content"})
            person_data = person_data_jail[0].findChildren()
            person_dict = {}
            for tag in person_data:
                person_text = tag.text
                try:
                    # Split "Label: value" text at the first colon.
                    points = person_text.index(':')
                    person_dict[person_text[:points]] = person_text[points + 1:]
                except ValueError:
                    # No colon in this tag's text, so there is nothing to split.
                    pass
            # Append once per person, after all of their fields are collected.
            data.append(person_dict)
        return data

    def data_frame(self):
        return pd.DataFrame(self.get_table())

    def export_csv(self, file_name: str):
        df = self.data_frame()
        df.to_csv(file_name)
It is not perfect, so feel free to improve it. To use it, simply do:
data = ScrapeJail()
data.export_csv('file_name.csv')
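The label/value split used in get_table can be tried on plain strings without touching the network. This is only a sketch: the sample strings are made up, the helper parse_fields is hypothetical, and stripping whitespace around labels and values is an addition that the class above does not do.

```python
def parse_fields(lines):
    """Collect 'Label: value' strings into one dict, skipping lines without a colon."""
    record = {}
    for line in lines:
        # partition splits at the first colon only, so values may contain colons.
        label, sep, value = line.partition(":")
        if not sep:
            continue
        record[label.strip()] = value.strip()
    return record


# Hypothetical text for two people; the real page's wording may differ.
people = [
    ["Name: Jane Doe", "Agency: Example PD", "no colon here"],
    ["Name: John Roe", "Booked: 10:30"],
]

# One dict per person, built after all of that person's fields are read.
data = [parse_fields(lines) for lines in people]
```

Building one dict per person and appending it once keeps the resulting table at one row per record.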
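Please add an explanation of where the errors were.
I have added a brief explanation of the two errors.
Thanks, the typo was silly of me, but figcaption makes sense. Do you know why it still returns only one result? Should I be using findAll on the table? Changed table = soup.find("body")
Thanks for catching the typo, guys. The figcaption change fixed that part, but there is still a problem: it does not iterate over the whole page and returns only one result.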
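One possible cause of the single result, assuming the page nests each record the way the class-based answer suggests: find() returns only the first matching element, so a table taken from the first sex-offender-info div contains one person. Another possibility is that the append call sits outside the loop body. The sketch below uses a made-up HTML fragment, not the real page, and the built-in html.parser instead of html5lib to keep it self-contained.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that mimics the nesting described in the answers.
html = """
<div class="sex-off-box">
  <div class="medium-4 small-6 columns">
    <div class="sex-offender-info">
      <figcaption class="jail-content"><h4>Jane Doe</h4><p>Example PD</p></figcaption>
    </div>
  </div>
  <div class="medium-4 small-6 columns">
    <div class="sex-offender-info">
      <figcaption class="jail-content"><h4>John Roe</h4><p>Example SO</p></figcaption>
    </div>
  </div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find() stops at the first match, so this block holds a single record.
first_block = soup.find("div", attrs={"class": "sex-offender-info"})
rows_from_first = first_block.find_all("figcaption", attrs={"class": "jail-content"})

# Searching from the whole document reaches every record.
rows_from_page = soup.find_all("figcaption", attrs={"class": "jail-content"})

# The append happens inside the loop (here a comprehension), once per row.
inmate_data = [
    {"Name": row.h4.get_text(strip=True), "Agency": row.p.get_text(strip=True)}
    for row in rows_from_page
]
```

If the real page is structured differently, the same check still applies: compare how many elements the narrow search and the document-wide search return.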