Why won't listfunction iterate through URLs?

python python-3.x

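I'm trying to write a program that opens a URL, finds a name on a given line, and saves it. It should then find the URL on the same line as the name, open it, and find the name and URL on the same line of that new page as on the previous page. It should do this 4 times.

I can't get it to iterate through the new url parameter. It keeps returning the same name and URL. What is going wrong here? Thanks.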

from bs4 import BeautifulSoup
from urllib.request import urlopen
import re
import ssl
linklist = list()
namelist = list()
linelist = list()
count = 0
listposition = int(input("Please enter list position: "))
goodnamelist = list(["Fikret"])
nexturl = "http://py4e-data.dr-chuck.net/known_by_Fikret.html"
def listfunction(url):
    ctx = ssl.create_default_context()
    #Allows reading of HTTPS pages
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    html = urlopen(url, context=ctx).read()
    soup = BeautifulSoup(html, "html.parser")
    linelist = soup('a')
    for line in linelist:
        #Creates list of lines in webpage:
        linklist.append(re.findall("(http://.+)\"", str(line)))
        #Creates list of names in line:
        namelist.append(re.findall(">(.+)</a>", str(line)))
    #Creates list of names in the designated user-input position:
    goodnamelist.append(namelist[listposition][0])
    nexturl = linklist[listposition][0]
    return nexturl
while (count < 4):
    nexturl = listfunction(nexturl)
    print(listfunction(nexturl))
    count += 1
    print(nexturl)
    continue
print(linelist)
print(linklist)
print(namelist)
print(nexturl)
print(goodnamelist)
print(listfunction(nexturl))

You aren't actually setting nexturl inside listfunction(), so the function just returns the same initial global variable every time.
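A minimal sketch of the behaviour the answer is asking for: each call works only from the page it was given and returns the URL found there, so the next call starts from a new page. To keep it runnable offline, the network and HTML parsing are replaced by a hypothetical FAKE_PAGES dictionary; the names follow_link and crawl are likewise illustrative and not from the original post.

```python
# Hypothetical stand-in for the network: each "page" lists the (name, url)
# pairs of the links it contains, in document order.
FAKE_PAGES = {
    "page_a": [("Ann", "page_b"), ("Bob", "page_c")],
    "page_b": [("Cat", "page_c"), ("Dan", "page_a")],
    "page_c": [("Eve", "page_a"), ("Fay", "page_b")],
}


def follow_link(url, position):
    """Return the (name, url) pair found at `position` on the page at `url`."""
    # Local list, built fresh from the current page on every call, so
    # `position` indexes into this page and not into earlier results.
    links = list(FAKE_PAGES[url])
    return links[position]


def crawl(start_url, position, hops):
    """Follow the link at `position` for `hops` pages."""
    names = []
    url = start_url
    for hop in range(hops):
        # Call the function once per hop and feed its result back in.
        name, url = follow_link(url, position)
        print(f"hop {hop + 1}: {name} -> {url}")
        names.append(name)
    return names, url


names, last_url = crawl("page_a", position=0, hops=3)
```

In the posted code the equivalent lists are module-level and grow on every call, so indexing them with a fixed position keeps selecting an entry from the first page. Building them inside the function avoids that.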

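I accidentally deleted that line, but it has the exact same problem.

It's the same with listposition and linklist. Neither of them actually changes; you just keep appending to linklist, which has no effect. Sorry, I'm not sure whether you intended to clear linklist on every run?

That's what I wasn't getting. Thanks a lot. I upvoted, but it says the vote won't be displayed publicly until I have 15 reputation.

@Arincanner You can accept the answer. That actually matters more than a vote.

You don't need re to get the data you want; bs4 gives you convenient access to it. line["href"] will give you the link and line.get_text() will give you the name. If you don't need to keep the full lists of names and links, you can go straight to the one you want with line = soup("a")[listposition].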
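The bs4 suggestion above can be sketched as follows. This is only an illustration: SAMPLE_HTML is made-up markup standing in for a downloaded page, and name_and_link is a hypothetical helper name. The calls soup("a"), tag["href"] and tag.get_text() are the ones mentioned in the comment.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for a downloaded page.
SAMPLE_HTML = """
<ul>
  <li><a href="http://example.com/known_by_Ann.html">Ann</a></li>
  <li><a href="http://example.com/known_by_Bob.html">Bob</a></li>
</ul>
"""


def name_and_link(html, position):
    """Return (name, url) for the anchor at `position`, without using re."""
    soup = BeautifulSoup(html, "html.parser")
    # soup("a") is shorthand for soup.find_all("a"); index it directly.
    anchor = soup("a")[position]
    return anchor.get_text(), anchor["href"]


print(name_and_link(SAMPLE_HTML, 1))
```

Because the anchor list is created inside the function, nothing carries over from one page to the next.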

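A better way to iterate a given number of times is a loop like for _ in range(number_of_loops): ... The underscore is a small convention for a throwaway variable, and this way you don't have to worry about incrementing a counter.

"I can't get it to iterate through the new url parameter. It keeps returning the same name and URL." What happens when you call the function? How is that different from what should happen?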
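For the loop suggestion, here is a small side-by-side sketch. The two function names and the "visit" placeholder are invented for illustration; both functions do the same work, but the second has no counter to initialise or increment.

```python
def visit_with_counter(times):
    """while-loop version: the counter is managed by hand."""
    visits = []
    count = 0
    while count < times:
        visits.append("visit")
        count += 1  # forgetting this line would loop forever
    return visits


def visit_with_range(times):
    """for-loop version: `_` marks the loop variable as unused."""
    visits = []
    for _ in range(times):
        visits.append("visit")
    return visits


print(visit_with_counter(4) == visit_with_range(4))
```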