Python: How can I put the information I'm scraping into an Excel document or a blob so I can compare it each time the code runs?
I'm trying to scrape information about state supreme courts so that I can check when that information changes. I can scrape and print the information successfully, but I'm struggling to find a way to save it to an Excel document or some other form of blob storage. Here is my current Python code:
import requests
from bs4 import BeautifulSoup
import pandas as pd

# Named `urls` rather than `list` so the built-in list type is not shadowed
urls = ['https://ballotpedia.org/Alabama_Supreme_Court',
        'https://ballotpedia.org/Alaska_Supreme_Court', 'https://ballotpedia.org/Arizona_Supreme_Court',
        'https://ballotpedia.org/Arkansas_Supreme_Court', 'https://ballotpedia.org/California_Supreme_Court',
        'https://ballotpedia.org/Colorado_Supreme_Court',
        'https://ballotpedia.org/Connecticut_Supreme_Court',
        'https://ballotpedia.org/Delaware_Supreme_Court', 'https://ballotpedia.org/Florida_Supreme_Court']

for page in urls:
    r = requests.get(page)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Print the text of every link inside the matching tables
    print([item.text for item in soup.select("table.wikitable.sortable.jquery-tablesorter a")])
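How can I put this into an Excel document or blob storage and refer back to it later to check whether the information has changed? Thanks, everyone!

I made a few adjustments: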
import requests
from bs4 import BeautifulSoup
import pandas as pd

# Named `urls` rather than `list` so the built-in list type is not shadowed
urls = ['https://ballotpedia.org/Alabama_Supreme_Court',
        'https://ballotpedia.org/Alaska_Supreme_Court', 'https://ballotpedia.org/Arizona_Supreme_Court',
        'https://ballotpedia.org/Arkansas_Supreme_Court', 'https://ballotpedia.org/California_Supreme_Court',
        'https://ballotpedia.org/Colorado_Supreme_Court',
        'https://ballotpedia.org/Connecticut_Supreme_Court',
        'https://ballotpedia.org/Delaware_Supreme_Court', 'https://ballotpedia.org/Florida_Supreme_Court']

temp_dict = {}  # maps each court name to the list of scraped link texts

for page in urls:
    r = requests.get(page)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Key is the last URL segment (e.g. 'Alabama_Supreme_Court'); value is the scraped info
    temp_dict[page.split('/')[-1]] = [item.text for item in soup.select("table.wikitable.sortable.jquery-tablesorter a")]

# The next line does the following: create a dataframe from the dictionary,
# orient as 'index' (this handles lists of different lengths),
# then transpose it back so the state supreme courts are the column headers
df = pd.DataFrame.from_dict(temp_dict, orient='index').transpose()
df.to_csv('State_Supreme_Court_Info.csv')  # saves as csv
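
The code above saves the data but does not yet compare it against an earlier run, which is the other half of the question. One possible approach, sketched below, is to read the previously saved CSV before overwriting it and list the columns (courts) whose values differ. The helper names `load_previous`, `column_values`, and `changed_courts` are made up for this illustration, and the sketch assumes the file was written by `df.to_csv(...)` as shown above.

```python
import os

import pandas as pd


def load_previous(path):
    """Return the table saved by an earlier run, or None if there is no file yet."""
    if not os.path.exists(path):
        return None
    # index_col=0 skips the unnamed index column that to_csv wrote.
    # dtype=str and keep_default_na=False keep every cell as plain text,
    # so values such as '2020' or 'NA' come back exactly as they were saved.
    return pd.read_csv(path, index_col=0, dtype=str, keep_default_na=False)


def column_values(df, col):
    """Return the non-empty values of one column as a list of strings."""
    if col not in df.columns:
        return []
    # dropna() removes the padding added for shorter columns;
    # the strip() check removes cells that are empty after a CSV round trip.
    return [str(v) for v in df[col].dropna() if str(v).strip()]


def changed_courts(old_df, new_df):
    """Return the sorted names of columns whose values differ between two runs."""
    all_columns = sorted(set(old_df.columns) | set(new_df.columns))
    return [col for col in all_columns
            if column_values(old_df, col) != column_values(new_df, col)]
```

To use it, call `previous = load_previous('State_Supreme_Court_Info.csv')` before the `df.to_csv(...)` line, and if `previous` is not `None`, print `changed_courts(previous, df)`. The comparison is order-sensitive, so a page that only reorders its links would also be reported as changed; comparing sets instead of lists would relax that.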
Please clarify exactly what the problem is. Stack Overflow is not a substitute for guides, tutorials, or documentation.