Python 2.7: AttributeError: 'list' object has no attribute 'get'

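I have built a script that crawls a listing of UK courts, generates a list of links to each court's address page, and should then scrape the address from that page.

It works nicely so far, but I am stuck on the "write to csv" part. I think it is related to iteritems() being missing. I found that an iterator does not have the same methods as an iterable (I use an iterator in my code), but that has not helped me solve my specific problem.

Here is my code: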

import csv
import time
import random
import requests
from bs4 import BeautifulSoup as bs

# lambda expression to request url and parse it through bs
soup = lambda url: bs((requests.get(url)).text, "html.parser")


def crawl_court_listings(base, buff, char):
    """  """
    # common URL segment + buffer URL segment + end character -> URL
    url = base + buff + str(chr(char))

    # soup lambda expression -> grab first unordered list
    links = (soup(url)).find('div', {'class', 'content inner cf'}).find('ul')

    # empty dictionary
    results = {}

    # loop through links, get link title and href
    for item in links.find_all('a', href=True):
        court_link = item['href']
        title = item.string

        # generate full court address page url from href
        full_court_link = base + court_link

        # save title and full URL to results
        results[title] = full_court_link

        # increment char var by 1
        char += 1

    # return results dict and incremented char value
    return results, char


def get_court_address(court_name, full_court_link):
    """ """

    # get horrible chunk of poorly formatted address(es)
    address_blob = (soup(full_court_link)).find('div', {'id': 'addresses'}).text

    # clean the blob
    clean_address = ("\n".join(line.strip() for line in address_blob.split("\n")))

    # write to csv
    with open('court_addresses.csv', 'w') as csvfile:
        fieldnames = [court_name, full_court_link, clean_address]
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writerow(fieldnames)


if __name__ == "__main__":

    base = 'https://courttribunalfinder.service.gov.uk/'
    buff = 'courts/'

    # 65 = "A". Starting from char "A", retrieve the titles and links for the court addresses. Return char + 1
    results, char = crawl_court_listings(base, buff, 65)

    # 90 = "Z". Until Z, pass title and list from results into get_court_address(), then wait a few seconds
    while char <= 90:
        for t, l in results.iteritems():
            get_court_address(t, l)
            time.sleep(random.randint(0,5))
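
So my first thought was that I was trying to write multi-line text into a single cell and that this caused the error, but I am not sure how to confirm that. I used print(type(address)), which returned unicode rather than list, so I do not think that is the source of the problem. I do not understand where it is getting the list that the error refers to, if that makes sense.

If it is iteritems()

Can anyone explain the error and point me toward a solution?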

For every row you write, you need to pass in a dictionary; you are passing in the list of headers.

The row should look like this:

{'court_name': X, 'full_court_link': Y, 'clean_address': Z}

HTH
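
A minimal sketch of the difference, assuming Python 3 and an in-memory buffer instead of a real file (on Python 2.7 the file handling differs slightly). The court name, link and address are made-up sample values, and the exact wording of the AttributeError varies between Python versions.

```python
import csv
import io

fieldnames = ['court_name', 'full_court_link', 'clean_address']

# Hypothetical sample values standing in for one scraped court.
row = {
    'court_name': 'Sample Court',
    'full_court_link': 'https://example.invalid/courts/sample-court',
    'clean_address': '1 Example Street\nExampletown',
}

buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)

# Passing a list reproduces the failure: DictWriter expects a mapping.
try:
    writer.writerow(fieldnames)
    list_error = None
except AttributeError as exc:
    list_error = exc

# Passing a dict keyed by the field names works.
writer.writeheader()
writer.writerow(row)
output = buffer.getvalue()
print(output)
```

The multi-line address is not a problem in itself: the csv module quotes the field, so it still lands in a single cell.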

Your problem is here:

writer.writerow(fieldnames)

fieldnames is a list of field names. You need to pass a dict of key-value pairs, so it should look more like this:

# write to csv
with open('court_addresses.csv', 'w') as csvfile:
    # note - these are strings, not variables
    fieldnames = ['court_name', 'full_court_link', 'clean_address']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writerow({"court_name": court_name,
                     "full_court_link": full_court_link,
                     "clean_address": clean_address})

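PSST: you have another problem. You are reopening the output file for every court you parse. You probably want to open the file once (under __main__) and then pass the handle into get_court_address().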
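
One way the "open once" advice could look, as a sketch only. It assumes Python 3, passes the DictWriter rather than the raw file handle (a small variation on the suggestion above), and uses hypothetical helper names (write_court_row, export_courts) with made-up data in place of the scraping step. Opening the file with 'w' for every court truncates it each time, so only the last court would survive.

```python
import csv
import os
import tempfile

FIELDNAMES = ['court_name', 'full_court_link', 'clean_address']


def write_court_row(writer, court_name, full_court_link, clean_address):
    """Write one court as a row using an already-open DictWriter."""
    writer.writerow({
        'court_name': court_name,
        'full_court_link': full_court_link,
        'clean_address': clean_address,
    })


def export_courts(path, courts):
    """Open the CSV once, write the header, then one row per court."""
    # newline='' is the Python 3 idiom; on Python 2.7 open with 'wb' instead.
    with open(path, 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=FIELDNAMES)
        writer.writeheader()
        for court_name, full_court_link, clean_address in courts:
            write_court_row(writer, court_name, full_court_link, clean_address)


# Hypothetical data in place of the scraped results.
sample_courts = [
    ('Court A', 'https://example.invalid/courts/a', '1 Example Street\nExampletown'),
    ('Court B', 'https://example.invalid/courts/b', '2 Example Road\nExampleville'),
]

out_path = os.path.join(tempfile.mkdtemp(), 'court_addresses.csv')
export_courts(out_path, sample_courts)
```

In the real script, the loop over sample_courts would be replaced by the loop over the scraped results, with the address lookup done inside it.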

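Why use DictWriter when you have a list rather than a dict?

Your problem is that you are using csv.DictWriter incorrectly, in particular this line: writer.writerow(fieldnames). The input to .writerow() must be a dict, not a list.

Thanks for the tip about opening the file once under main, that is very helpful.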
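
Following the first comment, a sketch of the alternative: if the values are already in a fixed column order, plain csv.writer accepts a list per row. This assumes Python 3 and an in-memory buffer, and the row values are made up for illustration.

```python
import csv
import io

buffer = io.StringIO(newline='')
writer = csv.writer(buffer)

# Header row, then one data row; both are plain lists in column order.
writer.writerow(['court_name', 'full_court_link', 'clean_address'])
writer.writerow(['Sample Court',
                 'https://example.invalid/courts/sample-court',
                 '1 Example Street\nExampletown'])

# Read the text back to confirm the rows round-trip unchanged.
rows = list(csv.reader(io.StringIO(buffer.getvalue(), newline='')))
print(rows)
```

The trade-off is that column order is then implicit in the list, whereas DictWriter ties each value to a named field.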