Accessing values from a for loop in Python


I'm experimenting with Python and trying to build a scraper. The code I have so far is shown below.

import requests
from bs4 import BeautifulSoup
import csv

url = "http://www.grammy.com/nominees/search"
r = requests.get(url)

soup = BeautifulSoup(r.content)

g_data = soup.find_all("div", {"class": "view-content"})

f = csv.writer(open("file.csv", "w"))
f.writerow(["Year", "Category", "Title", "Winner"])

for item in g_data:
  for year in item.find_all("td", {"class": "views-field-year"}):
    year = year.contents[0]

  for category in item.find_all("td", {"class": "views-field-category-code"}):
    category = category.contents[0]

  for title in item.find_all("td", {"class": "views-field-field-nominee-work"}):
    title = title.contents[0]

  for winner in item.find_all("td", {"class": "views-field-field-nominee-extended"}):
    winner = winner.contents[0]

f.writerow([year, category, title, winner])

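For some reason the CSV file contains only one row, and it seems random. How can I access all of these values outside the scope of the loops?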

Your write call is outside the loop, so you only write one row (the last one). Indent it and it should work as expected:

for item in g_data:
  for year in item.find_all("td", {"class": "views-field-year"}):
    year = year.contents[0]

  for category in item.find_all("td", {"class": "views-field-category-code"}):
    category = category.contents[0]

  for title in item.find_all("td", {"class": "views-field-field-nominee-work"}):
    title = title.contents[0]

  for winner in item.find_all("td", {"class": "views-field-field-nominee-extended"}):
    winner = winner.contents[0]

  f.writerow([year, category, title, winner])

In case you are new to Python: code blocks are defined by their indentation.
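To see why the indentation matters, here is a minimal sketch that writes to an in-memory buffer instead of a file. The sample rows and the helper names write_outside_loop and write_inside_loop are made up for illustration and are not part of the original code.

```python
import csv
import io

# Hypothetical sample data standing in for the scraped values.
ROWS = [
    ("2014", "Record Of The Year"),
    ("2014", "Album Of The Year"),
    ("2014", "Song Of The Year"),
]


def write_outside_loop(rows):
    """writerow() runs once, after the loop, so only the last row is written."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for year, category in rows:
        pass  # the loop variables are rebound on every iteration
    writer.writerow([year, category])  # not part of the loop body
    return buffer.getvalue().splitlines()


def write_inside_loop(rows):
    """writerow() is indented under the loop, so it runs once per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for year, category in rows:
        writer.writerow([year, category])  # part of the loop body
    return buffer.getvalue().splitlines()


print(len(write_outside_loop(ROWS)))  # number of rows written after the loop
print(len(write_inside_loop(ROWS)))   # number of rows written inside the loop
```

The only difference between the two helpers is the indentation of the writerow() call, which is the same difference as between the question's code and the corrected loop.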

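It is not just that your last writerow() call is not indented properly (it should be inside the loop body). You also need to iterate over the tr elements (each one represents a row of the table that contains the data) and, for each tr found in the loop, get its td elements.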

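I would also avoid checking the class attribute of the td elements inside the loop and simply get them by index. In other words, for each tr, find all of its td elements and take their text.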

Fixed and improved version (only 2 lines of code):
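A minimal sketch of the approach described above (not the answer's original listing): iterate over the tr elements, collect the text of each row's td cells by position, and write one CSV row per tr. The sample markup, the rows_to_csv helper and the "html.parser" choice are assumptions made so the example runs without a network request; the real page's markup may differ.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup imitating the structure described in the answer:
# a div.view-content wrapping a table with a header row and data rows.
SAMPLE_HTML = """
<div class="view-content">
  <table>
    <tr><th>Year</th><th>Category</th><th>Title</th><th>Winner</th></tr>
    <tr><td>2014</td><td>Best Rap Song</td><td>I</td><td>K. Duckworth, Ronald Isley &amp; C. Smith, songwriters.</td></tr>
    <tr><td>2014</td><td>Album Of The Year</td><td> Morning Phase </td><td>Beck Hansen, producer</td></tr>
  </table>
</div>
"""


def rows_to_csv(html, out):
    """Write one CSV row per table row (tr) found inside div.view-content."""
    soup = BeautifulSoup(html, "html.parser")
    writer = csv.writer(out)
    writer.writerow(["Year", "Category", "Title", "Winner"])
    for tr in soup.select("div.view-content table tr"):
        # Take the cells by position instead of looking up each class name.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) != 4:
            continue  # skips the header row (th only) and unexpected rows
        writer.writerow(cells)  # inside the loop: one call per tr


buffer = io.StringIO()
rows_to_csv(SAMPLE_HTML, buffer)
print(buffer.getvalue())
```

To run this against the real page, one would pass requests.get(url).content as html and a file opened with open("file.csv", "w", newline="") as out.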

Contents of file.csv after running the code:

Year,Category,Title,Winner
2014,Record Of The Year,Stay With Me (Darkchild Version),"Sam Smith, artist. Steve Fitzmaurice, Rodney Jerkins & Jimmy Napes, producers. Matthew Champlin, Steve Fitzmaurice, Jimmy Napes & Steve Price, engineers/mixers. Tom Coyne, mastering engineer."
2014,Album Of The Year,Morning Phase,"Beck Hansen, producer; Tom Elmhirst, David Greenbaum, Cole Marsden Greif-Neill, Florian Lagatta, Robbie Nelson, Darrell Thorp, Cassidy Turbin & Joe Visciano, engineers/mixers; Bob Ludwig, mastering engineer."
2014,Song Of The Year,Stay With Me (Darkchild Version),"James Napier, William Phillips &Sam Smith, songwriters."
...
2014,Best Rap Song,I,"K. Duckworth, Ronald Isley & C. Smith, songwriters."
2014,Best Rap Album,The Marshall Mathers LP2,"Eminem, artist. Tony Campana, Joe Strange & Mike Strange, engineers/mixers."

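Yes, but how do I deal with the "undefined" strings? For example, in the first loop the other values are not defined yet.
What is the best way to split these fields into separate columns?
@BobWassermann could you elaborate on what you mean by "split these fields into separate columns"?
Right now Python puts all the data into one column, like this: 2014,Best Country Solo Performance,Something In The Water,"Carrie Underwood, artist." 2014,Best Country Solo... etc. These need to be in separate columns. TL;DR: each field should start a new column.
The delimiter ";" fixed it.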
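The last comment suggests that the program used to open the file expected semicolons rather than commas as the column separator (an assumption; the comment does not name the program). A minimal sketch of passing a delimiter to csv.writer follows; the to_line helper is hypothetical, and the row is taken from the example in the comments.

```python
import csv
import io

row = ["2014", "Best Country Solo Performance",
       "Something In The Water", "Carrie Underwood, artist."]


def to_line(fields, delimiter):
    """Format one row with the given delimiter and return it without the line ending."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerow(fields)
    return buffer.getvalue().rstrip("\r\n")


print(to_line(row, ","))  # default separator; the last field is quoted
print(to_line(row, ";"))  # semicolon separator
```

With the semicolon as delimiter, the comma inside the last field no longer needs quoting, because the writer only quotes fields that contain the delimiter, the quote character or a line break.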