Python not iterating over a list in web scraping

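From the link, I'm trying to create two lists: one of countries and one of currencies. However, I'm stuck: my code gives me only the first country name and does not iterate through the full list of countries. Any help on how to fix this would be greatly appreciated. Thanks in advance.

Here is my attempt: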

from bs4 import BeautifulSoup
import urllib.request

url = "http://www.worldatlas.com/aatlas/infopage/currency.htm"
headers = {
    'User-Agent': (
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
        'AppleWebKit/537.36 (KHTML, like Gecko) '
        'Chrome/47.0.2526.80 Safari/537.36'
    )
}

req = urllib.request.Request(url, headers=headers)
resp = urllib.request.urlopen(req)
html = resp.read()

soup = BeautifulSoup(html, "html.parser")
attr = {"class" : "miscTxt"}

countries = soup.find_all("div", attrs=attr)
countries_list = [tr.td.string for tr in countries]

for country in countries_list:
    print(country)

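Try this script. It should give you the country names and their corresponding currencies. You don't need to send headers for this site.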

from bs4 import BeautifulSoup
import urllib.request

url = "http://www.worldatlas.com/aatlas/infopage/currency.htm"
resp = urllib.request.urlopen(urllib.request.Request(url)).read()
soup = BeautifulSoup(resp, "lxml")

for item in soup.select("table tr"):
    try:
        country = item.select("td")[0].text.strip()
    except IndexError:
        country = ""
    try:
        currency = item.select("td")[0].find_next_sibling().text.strip()
    except IndexError:
        currency = ""
    print(country,currency)

Partial output:

Afghanistan afghani
Algeria dinar
Andorra euro
Argentina peso
Australia dollar
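
The row-by-row approach can be tried offline as well. This is a minimal sketch, assuming a made-up fragment (`SAMPLE_HTML` is hypothetical and not the real page markup) with a header row; it checks the number of cells instead of catching `IndexError`, and it uses `"html.parser"` so that lxml does not have to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: a header row with <th> cells followed by data rows.
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>Currency</th></tr>
  <tr><td>Afghanistan</td><td>afghani</td></tr>
  <tr><td>Algeria</td><td>dinar</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
for tr in soup.select("table tr"):
    cells = tr.select("td")
    if len(cells) < 2:
        continue  # header or incomplete row: nothing to pair up
    rows.append((cells[0].get_text(strip=True), cells[1].get_text(strip=True)))

for country, currency in rows:
    print(country, currency)
```

Skipping short rows avoids printing blank lines for header rows, which the try/except version would emit as empty strings.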

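You can also use a single list comprehension to build a list of (country, currency) tuples, and then split the tuples into two lists with:

countries_list, currency_list = map(list, zip(*temp_list))

Full code: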

from bs4 import BeautifulSoup
import urllib.request

req = urllib.request.Request("http://www.worldatlas.com/aatlas/infopage/currency.htm")

soup = BeautifulSoup(urllib.request.urlopen(req).read(), "html.parser")

countries = soup.find_all("div", attrs = {"class" : "miscTxt"})

# Build (country, currency) pairs; rows without <td> cells are skipped.
temp_list = [
    (cells[0].text.strip(), cells[1].text.strip())
    for cells in (row.find_all('td') for row in countries[0].find_all('tr'))
    if cells
]

countries_list, currency_list = map(list,zip(*temp_list))

print(countries_list)
print(currency_list)
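
The unpacking step is plain Python and can be checked without any scraping. A small sketch, assuming hypothetical sample pairs in the same (country, currency) shape as `temp_list`:

```python
# Hypothetical sample pairs in the same (country, currency) shape as temp_list.
pairs = [("Afghanistan", "afghani"), ("Algeria", "dinar"), ("Andorra", "euro")]

# zip(*pairs) regroups the items by position, yielding one tuple per column;
# map(list, ...) turns each of those tuples into a list.
countries_list, currency_list = map(list, zip(*pairs))

print(countries_list)
print(currency_list)
```

Note that if the list of pairs is empty, `zip(*pairs)` yields nothing and the two-name unpacking raises `ValueError`, so it may be worth checking that the scrape returned rows first.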

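Did you print countries_list to check whether it contains more than one entry?

Yes, I printed it. It only prints the first country in the list.

I just checked your countries_list, and it contains only Afghanistan. The loop is not the problem; the problem is [tr.td.string for tr in countries]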
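
The behaviour described in these comments can be reproduced offline. This is a minimal sketch, assuming a made-up fragment (`SAMPLE_HTML` is hypothetical, not the real page markup) in which a single `div.miscTxt` wraps the whole table. Under that assumption `find_all` returns one div, and `.td` on that div resolves to the first `<td>` inside it, so the comprehension produces a single name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: one wrapper div, many rows.
SAMPLE_HTML = """
<div class="miscTxt">
  <table>
    <tr><td>Afghanistan</td><td>afghani</td></tr>
    <tr><td>Algeria</td><td>dinar</td></tr>
    <tr><td>Andorra</td><td>euro</td></tr>
  </table>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
divs = soup.find_all("div", attrs={"class": "miscTxt"})

# Only one div matches, so the comprehension runs once;
# div.td is the first <td> found anywhere inside that div.
first_only = [div.td.string for div in divs]

# Iterating over the rows inside the div visits every country.
all_countries = [row.td.string for row in divs[0].find_all("tr")]

print(first_only)
print(all_countries)
```

The fix is therefore to loop over the `tr` elements inside the matched div rather than over the list of divs.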