How do I get the next-page link with Python BeautifulSoup?

python python-2.7

I have this link:

http://www.brothersoft.com/windows/categories.html

I am trying to get the links of the items inside the div. For example:

I tried the following code:

import urllib
from bs4 import BeautifulSoup

url = 'http://www.brothersoft.com/windows/categories.html'

pageHtml = urllib.urlopen(url).read()

soup = BeautifulSoup(pageHtml)

sAll = [div.find('a') for div in soup.findAll('div', attrs={'class':'brLeft'})]

for i in sAll:
    print "http://www.brothersoft.com"+i['href']

But I only get this output:

http://www.brothersoft.com/windows/mp3_audio/

How can I get the desired output?

The URL

http://www.brothersoft.com/windows/mp3_audio/midi_tools/

is not inside the tag your code searches (a div with class 'brLeft'), so the output

http://www.brothersoft.com/windows/mp3_audio/

is correct.
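To see why the selector decides which links come back, here is a small offline sketch. It is written for Python 3 (the question uses Python 2.7), and the HTML snippet is a made-up stand-in: it assumes category links sit in 'brLeft' divs and subcategory links in 'brRight' divs, which may not match the real page exactly. The helper name first_links is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an assumption about the page layout,
# not a copy of the real brothersoft.com HTML.
SAMPLE_HTML = """
<div class="brLeft"><a href="/windows/mp3_audio/">MP3 and Audio</a></div>
<div class="brRight">
  <a href="/windows/mp3_audio/midi_tools/">MIDI Tools</a>
</div>
"""

BASE = "http://www.brothersoft.com"


def first_links(html, css_class):
    """Return the absolute URL of the first <a> in each div with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for div in soup.find_all("div", attrs={"class": css_class}):
        anchor = div.find("a")
        # Skip divs that contain no usable link instead of raising an error.
        if anchor is not None and anchor.get("href"):
            links.append(BASE + anchor["href"])
    return links


left_links = first_links(SAMPLE_HTML, "brLeft")
right_links = first_links(SAMPLE_HTML, "brRight")
print(left_links)
print(right_links)
```

With this sample markup, the 'brLeft' selector can only ever return the top-level category link, whatever else is on the page.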

To get the URL you want, change

sAll = [div.find('a') for div in soup.findAll('div', attrs={'class':'brLeft'})]

to

sAll = [div.find('a') for div in soup.findAll('div', attrs={'class':'brRight'})]

Update:

An example of getting the information inside 'midi_tools':

import urllib
from bs4 import BeautifulSoup

url = 'http://www.brothersoft.com/windows/categories.html'
pageHtml = urllib.urlopen(url).read()
soup = BeautifulSoup(pageHtml)
# Select the 'brRight' divs instead of 'brLeft' to reach the subcategory links
sAll = [div.find('a') for div in soup.findAll('div', attrs={'class':'brRight'})]
for i in sAll:
    suburl = "http://www.brothersoft.com"+i['href']    # URL of a subcategory page such as 'midi_tools'

    content = urllib.urlopen(suburl).read()
    anosoup = BeautifulSoup(content)
    ablock = anosoup.find('table',{'id':'courseTab'})
    for atr in ablock.findAll('tr',{'class':'border_bot '}):
        print atr.find('dt').a.string      # application name
        print "http://www.brothersoft.com" + atr.find('a',{'class':'tabDownload'})['href']   # download link
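Joining the site prefix and the href with + only works while every href starts with '/'. A possible alternative is urljoin from the standard library, shown here for Python 3 (urllib.parse); in Python 2.7 the same function lives in the urlparse module. The hrefs in the list are illustrative values, not scraped data, and absolutize is a hypothetical helper name.

```python
from urllib.parse import urljoin

# The page the links were found on; relative hrefs are resolved against it.
BASE = "http://www.brothersoft.com/windows/categories.html"


def absolutize(base, hrefs):
    """Resolve each href against the page URL it was found on."""
    return [urljoin(base, href) for href in hrefs]


# Illustrative hrefs: root-relative, page-relative, and already absolute.
hrefs = [
    "/windows/mp3_audio/midi_tools/",
    "mp3_audio/",
    "http://www.brothersoft.com/windows/",
]
urls = absolutize(BASE, hrefs)
for u in urls:
    print(u)
```

urljoin leaves absolute URLs untouched and resolves the other two forms against the base, so the same code handles all three cases.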

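It works perfectly. What is the problem? What should the output be?

How do I get the application names and links inside midi_tools?

@wan mohd payed, it is similar to what you already did: get the content of the midi_tools page, find out which tag holds the information, then use BeautifulSoup to extract it.

@Davd.Zheng Do I need to use "join" or something?

@wan mohd payed, sorry, I don't understand what you mean by using "join". What for?

Sorry, I don't actually know how to code. But how can I get the download links in midi_tools and print some string information about the software in midi_tools?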
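The approach described in the comments (get the page content, find the tag that holds the data, extract it with BeautifulSoup) can be tried offline with a sketch like the one below. It targets Python 3, and the table markup, application names, and hrefs are invented for illustration; only the selectors ('courseTab', 'border_bot', 'tabDownload') come from the answer's code. The function name list_apps is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the selectors used in the answer;
# the real midi_tools page may be structured differently.
SAMPLE_PAGE = """
<table id="courseTab">
  <tr class="border_bot">
    <td><dl><dt><a href="/example-midi-player.html">Example MIDI Player</a></dt></dl></td>
    <td><a class="tabDownload" href="/example-midi-player-download.html">Download</a></td>
  </tr>
  <tr class="border_bot">
    <td><dl><dt><a href="/example-midi-editor.html">Example MIDI Editor</a></dt></dl></td>
    <td><a class="tabDownload" href="/example-midi-editor-download.html">Download</a></td>
  </tr>
</table>
"""

BASE = "http://www.brothersoft.com"


def list_apps(html):
    """Return (name, download_url) pairs for each row of the listing table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="courseTab")
    if table is None:
        # The page has no listing table, so there is nothing to report.
        return []
    apps = []
    for row in table.find_all("tr", class_="border_bot"):
        title = row.find("dt")
        download = row.find("a", class_="tabDownload")
        # Skip rows that lack either the name link or the download link.
        if title is None or title.a is None or download is None:
            continue
        apps.append((title.a.get_text(strip=True), BASE + download["href"]))
    return apps


apps = list_apps(SAMPLE_PAGE)
for name, link in apps:
    print(name, link)
```

Checking for None before indexing keeps the loop from crashing on pages or rows that do not match the expected layout.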
