Python 3.x: for loop iteration does not have the expected effect


python-3.x


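I have the code below, which scrapes a website and writes the results to a CSV file. The problem is that the for loop, for some reason, writes multiple copies of each iteration's output, when it should write each iteration only once. Can anyone point out what I'm missing here? Thanks, everyone.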

import requests
from bs4 import BeautifulSoup
import csv

url = 'https://online.computicket.com'
home_page = requests.get(url)

home_page.content

soup = BeautifulSoup(home_page.content, 'lxml')


links = soup.find_all('a', {'class':'info'})

next_link = []

for link in links:
    next_link.append(link.get("href"))


for i in range(0, len(next_link),1):    
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])   

    for link in (url + next_link[i]):
        new_url.content
        soup = BeautifulSoup(new_url.content, 'lxml')

        info_name = soup.find('div', {'class' : 'es-cost'}) 
        heading = soup.find('h1',{'class' : 'full'})

        with open('Don.csv', 'a') as csv_file:

            #csv_file.write(heading.get_text())
            for name in soup.find_all('div', {'class' : 'es-cost'}):
                csv_file.write(heading.get_text())
                csv_file.write(name.get_text())

                print(name.get_text())

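I think your program prints multiple copies because of the nested for loop. Note, however, that its link variable is not used anywhere in the loop body. Try removing the nested for statement, replacing this part of the code: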

for i in range(0, len(next_link),1):
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])

    for link in (url + next_link[i]):
        new_url.content
        soup = BeautifulSoup(new_url.content, 'lxml')

        info_name = soup.find('div', {'class' : 'es-cost'})
        heading = soup.find('h1',{'class' : 'full'})

        with open('Don.csv', 'a') as csv_file:

            #csv_file.write(heading.get_text())
            for name in soup.find_all('div', {'class' : 'es-cost'}):
                csv_file.write(heading.get_text())
                csv_file.write(name.get_text())

                print(name.get_text())

with this:

for i in range(0, len(next_link),1):
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])

    new_url.content
    soup = BeautifulSoup(new_url.content, 'lxml')

    info_name = soup.find('div', {'class' : 'es-cost'})
    heading = soup.find('h1',{'class' : 'full'})

    with open('Don.csv', 'a') as csv_file:

        #csv_file.write(heading.get_text())
        for name in soup.find_all('div', {'class' : 'es-cost'}):
            csv_file.write(heading.get_text())
            csv_file.write(name.get_text())

            print(name.get_text())
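
As a side note, the code above imports csv but writes with csv_file.write, so the heading and price run together with no delimiter or line break. Here is a minimal sketch of writing one row per price with csv.writer instead. The function name write_event_rows and the sample strings are hypothetical stand-ins for the values returned by heading.get_text() and name.get_text(), and an in-memory buffer stands in for the real file:

```python
import csv
import io


def write_event_rows(csv_file, heading, prices):
    """Write one CSV row per price, each paired with the event heading."""
    writer = csv.writer(csv_file)
    for price in prices:
        # strip() drops the surrounding whitespace that get_text() often returns
        writer.writerow([heading.strip(), price.strip()])


# In-memory stand-in for open('Don.csv', 'a', newline=''); the strings below
# are made-up placeholders, not data from the site.
buffer = io.StringIO()
write_event_rows(buffer, " Example Event ", ["R100 ", " R250"])
print(buffer.getvalue())
```

With a real file, passing newline='' to open() lets the csv module control line endings, which avoids blank rows on Windows.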

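I don't understand this line:

for link in (url + next_link[i]):

(url + next_link[i]) is a string, so this is like writing

for link in (some string):

So the loop body runs once for every character in (url + next_link[i]), and each pass writes to csv_file. Good luck.

Thanks for the reply. Much appreciated. Also, I'm new to Python, so I apologize if my question sounds silly. The purpose of the code you pointed out is this: I connect to the home page (eg), and from there I find all the extension links that contain the data I want, for example (/web/event/nba_africa_2018/1225212005/Show:Additional Info). Then I join the home page and the extension, connect to the new URL, and start scraping the data. The problem seems to be that it writes the correct data for all links, but multiple times.

Thank you very much for your help. After modifying my code as you suggested, I also found that the nested for loop was basically iterating over each character and writing the information to the file after each iteration. The code now produces the desired result. Once again, thank you very much.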
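
The behaviour discussed in this thread, a for loop over a string running once per character, can be checked in isolation. This is only an illustrative sketch; the path value is a hypothetical placeholder, not a real link from the site:

```python
url = 'https://online.computicket.com'
path = '/web/event/example'  # hypothetical path, for illustration only

full_url = url + path

# Iterating over a string yields its characters one at a time,
# so the loop body runs len(full_url) times.
iterations = 0
for link in full_url:
    iterations += 1

print(iterations == len(full_url))  # True
print(list(url[:5]))  # ['h', 't', 't', 'p', 's']
```

In the original code, this is why each page's data was written once per character of its URL rather than once per page.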