Python: problem exporting web-scraped data to Excel
I scraped data from a table and need help exporting it to Excel. I create a DataFrame after the loop, but when I open the Excel file the data is not there. Instead, the cells contain something that looks like HTML or JSON links. Here is my code; the output is shown at the end.
import time
import requests
from lxml import html
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd

url = "https://coinmarketcap.com/"
driver = webdriver.Chrome(executable_path=r'C:\Users\Ejer\PycharmProjects\pythonProject\chromedriver.exe')
driver.get(url)
driver.maximize_window()

for i in range(22):
    coin_name = driver.find_elements_by_xpath('//td[3]/a[@class="cmc-link" and starts-with(@href, "/currencies/")]')
    prices = driver.find_elements_by_xpath('//td[4]/div/a[@class = "cmc-link" and contains(@href, "/markets/")]')
    market_caps = driver.find_elements_by_xpath('//td[7]/p[@class="sc-1eb5slv-0 kDEzev"]')
    html = driver.find_element_by_tag_name('html')
    html.send_keys(Keys.PAGE_DOWN)

for coin, price, cap in zip(coin_name, prices, market_caps):
    print("Coin :", coin.text, "Price :", price.text, "Market cap:", cap.text)

print(len(prices))
print(len(coin_name))
print(len(market_caps))
print(type(prices))
print(type(coin_name))
print(type(market_caps))

df = pd.DataFrame({
    'Alt coin name': coin_name,
    'Price': prices,
    'Market cap': market_caps
})
df.to_excel(r'C:\Users\Ejer\PycharmProjects\pythonProject\cmc_data.xlsx', index=False, header=True)
This is a sample of the output in Excel (screenshot not available):
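What happened?
You tried to push the elements themselves into pandas/Excel, but what you want to push are their values (the text). So instead of printing all the values in a loop, you can collect them in a list of dicts and create the df from that result.
This does the trick:
Create the list of dicts: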
results = [{"Alt coin name": coin.text, "Price": price.text, "Market cap": cap.text} for coin, price, cap in zip(coin_name, prices, market_caps)]
Create your df:
df = pd.DataFrame(results)
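To see the difference without starting a browser, here is a minimal sketch. `FakeElement` is a hypothetical stand-in for a Selenium element (it only carries a `.text` attribute), and the sample values are made up. It assumes nothing beyond pandas being installed.

```python
import pandas as pd


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement: it only has a .text attribute."""

    def __init__(self, text):
        self.text = text


# Made-up sample values, not real market data.
coin_name = [FakeElement("Coin A"), FakeElement("Coin B")]
prices = [FakeElement("$10.00"), FakeElement("$2.50")]
market_caps = [FakeElement("$1,000"), FakeElement("$250")]

# Storing the elements themselves keeps the objects, not their text.
df_objects = pd.DataFrame({"Alt coin name": coin_name, "Price": prices})

# Storing element.text keeps plain strings, which export cleanly.
results = [
    {"Alt coin name": coin.text, "Price": price.text, "Market cap": cap.text}
    for coin, price, cap in zip(coin_name, prices, market_caps)
]
df = pd.DataFrame(results)

print(type(df_objects.loc[0, "Price"]).__name__)  # the cell holds an object
print(df.loc[0, "Price"])  # the cell holds a string
```

The first frame holds objects in its cells, so an export can only write their string representation. The second holds the text you actually wanted.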
Example:
# Written against the Selenium 3 API used in the question; Selenium 4
# replaces find_elements_by_xpath(...) with find_elements(By.XPATH, ...).
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd

url = "https://coinmarketcap.com/"
driver = webdriver.Chrome(executable_path=r'C:\Program Files\ChromeDriver\chromedriver.exe')
driver.get(url)
driver.maximize_window()

# Page down repeatedly so the lazily loaded rows are rendered,
# re-reading the three columns after each step.
for i in range(22):
    coin_name = driver.find_elements_by_xpath('//td[3]/a[@class="cmc-link" and starts-with(@href, "/currencies/")]')
    prices = driver.find_elements_by_xpath('//td[4]/div/a[@class = "cmc-link" and contains(@href, "/markets/")]')
    market_caps = driver.find_elements_by_xpath('//td[7]/p[@class="sc-1eb5slv-0 kDEzev"]')
    page = driver.find_element_by_tag_name('html')
    page.send_keys(Keys.PAGE_DOWN)

# Keep the text of each element, not the element object itself.
results = [{"Alt coin name": coin.text, "Price": price.text, "Market cap": cap.text} for coin, price, cap in zip(coin_name, prices, market_caps)]

df = pd.DataFrame(results)
df.to_excel(r'C:\Users\Ejer\PycharmProjects\pythonProject\cmc_data.xlsx', index=False, header=True)
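One caveat with this approach: `zip` stops at the shortest input, so if one XPath matches fewer elements than the others, rows are dropped silently. The question already prints the three lengths. A small guard can turn that manual check into an error. This is only a sketch: `build_rows` is a hypothetical helper name, and it takes already-extracted strings rather than live elements.

```python
def build_rows(names, prices, caps):
    """Combine three equally long lists of strings into a list of row dicts."""
    if not (len(names) == len(prices) == len(caps)):
        # zip() would silently truncate to the shortest list, so fail loudly instead.
        raise ValueError(
            f"column lengths differ: {len(names)}, {len(prices)}, {len(caps)}"
        )
    return [
        {"Alt coin name": name, "Price": price, "Market cap": cap}
        for name, price, cap in zip(names, prices, caps)
    ]


# Made-up sample values, not real market data.
rows = build_rows(["Coin A", "Coin B"], ["$10.00", "$2.50"], ["$1,000", "$250"])
print(rows[0])
```

With the scraped lists, it could be called as `build_rows([c.text for c in coin_name], [p.text for p in prices], [m.text for m in market_caps])` before passing the result to `pd.DataFrame`.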