Exporting multi-table data from Python to CSV or Excel
Hey guys, I'm trying to write data to a CSV/Excel file. The problem is that the Excel file doesn't work properly, and the CSV file only shows "Element div at xx.xxx.xx" and so on. I don't want to fill in the correct products and prices row by row. I would prefer Excel. About 1000 products will be scraped.
Standard code to test that the scraping works:
from lxml import html
import requests

urls = ['http://whiskey.de/shop/Aktuell/']
for url in urls:
    for number in range(1, 15):
        page_url = url + str(number) if number > 1 else url
        page = requests.get(page_url)
        tree = html.fromstring(page.text)
        prices = tree.cssselect('div#content div.price')
        names = tree.cssselect('div#content div.title a')
        print 'Whiskey Preis: ', [price.text for price in prices]
        print 'Whiskey Names: ', [name.text for name in names]
CSV test:
from lxml import html
import requests
import csv

urls = ['http://whiskey.de/shop/Schottland/']
for url in urls:
    for number in range(1, 10):
        page_url = url + str(number) if number > 1 else url
        page = requests.get(page_url)
        tree = html.fromstring(page.text)
        prices = tree.cssselect('div#content div.price')
        names = tree.cssselect('div#content div.title a')
        with open('whiskey_test.csv', 'w') as csvfile:
            fieldnames = ['Preise', 'Namen']
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerow({'Preise': prices, 'Namen': names})
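There are a couple of issues here.
Getting the text of the tag (div) instead of writing the tag element itself
In the "standard code" you do this correctly by calling .text on each selected element: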
print 'Whiskey Preis: ', [price.text for price in prices]
print 'Whiskey Names: ', [name.text for name in names]
Here, however,
writer.writerow({'Preise': prices, 'Namen': names})
writes the tree elements themselves to the file. So for a single price/name pair this should be
writer.writerow({'Preise': price.text, 'Namen': name.text})
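To see where the "Element div at ..." output comes from, here is a small sketch written for Python 3. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml (an assumption made only so the snippet runs without a network request), and the markup and price are made up. Elements from both libraries expose their content through .text, while converting the element object itself to a string only yields its repr.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for one price tag from the shop page.
price = ET.fromstring('<div class="price">29,90 EUR</div>')

as_element = str(price)  # repr of the element object, not its content
as_text = price.text     # the actual text inside the tag

print(as_element)
print(as_text)
```

The first print shows the kind of string that ended up in the CSV file; the second shows the value that should be written instead.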
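However, prices and names are lists of elements, so .text has to be read from each element in turn, which leads to the next issue.
Not writing several rows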
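Assuming you have a data structure holding price/name pairs (such as a list of tuples or a dictionary), you can iterate over it and write one new row per pair with the DictWriter, like this:
for price, name in productdictionary.items():
    writer.writerow({'Preise': price.text, 'Namen': name.text})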
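A minimal sketch of that loop, written for Python 3 and assuming the pairs have already been reduced to plain strings. The products list and its values are made up for illustration, and io.StringIO stands in for the real file so the result can be inspected directly.

```python
import csv
import io

# Hypothetical (price, name) pairs; in the real script these would come
# from the .text of the scraped elements.
products = [
    ('29,90 EUR', 'Whiskey A'),
    ('45,00 EUR', 'Whiskey B'),
]

fieldnames = ['Preise', 'Namen']
buffer = io.StringIO()  # stand-in for open('whiskey_test.csv', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
for price, name in products:
    # One writerow call per product gives one CSV row per product.
    writer.writerow({'Preise': price, 'Namen': name})

csv_text = buffer.getvalue()
print(csv_text)
```

The keys passed to writerow have to match fieldnames; with the default settings, DictWriter raises a ValueError for a key it does not know. The file should also be opened once, before the loop, since reopening it in 'w' mode for every page would overwrite the earlier rows.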
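Creating a proper data structure that maps prices to names
Now, I don't think we should do all of the scraping work for you :) But basically you want to use your tree to get the matching pairs of price and name elements. Once you have them, put them into a dictionary or a list of tuples.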
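One way such a structure could be built is sketched below, written for Python 3. It assumes every product sits in its own container that holds both the title link and the price. The markup, the class name product and the sample values are invented for illustration and are not taken from whiskey.de. xml.etree.ElementTree stands in for lxml so the snippet runs offline; with lxml the same idea would apply to the elements returned by cssselect.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one container per product, holding a title link and a price.
page = ET.fromstring(
    '<div id="content">'
    '<div class="product"><div class="title"><a>Whiskey A</a></div>'
    '<div class="price">29,90 EUR</div></div>'
    '<div class="product"><div class="title"><a>Whiskey B</a></div>'
    '<div class="price">45,00 EUR</div></div>'
    '</div>'
)

products = []
for product in page.findall("div[@class='product']"):
    # Look up name and price inside the same container so they stay aligned.
    name = product.find("div[@class='title']/a")
    price = product.find("div[@class='price']")
    products.append((price.text.strip(), name.text.strip()))

print(products)
```

If the two selections are instead collected as separate lists, as in the question, zip(prices, names) pairs them by position, which only gives correct pairs when both lists have the same length and order.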
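Once you have that, you just iterate over it to write your CSV, as shown above.
Can you give an example of what prices and names look like? You may need to access the text attribute or something similar to get the actual content of the selection.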