Python: selecting only links inside a dl-dd tag structure

Tags: python, beautifulsoup

I want to save all the links from that section. We want to write some code that finds the section and then gets all of the links inside its elements:

from bs4 import BeautifulSoup
from urllib2 import urlopen

BASE_URL = "http://www.fashiontrends.pk"

def get_category_links(section_url):
    html = urlopen(section_url).read()
    soup = BeautifulSoup(html, "lxml")
    boccat = soup.find("dl", "boccat")
    category_links = [BASE_URL + dd.a["href"] for dd in boccat.findAll("dd")]
    return category_links
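For reference, the find/findAll approach above can be exercised against a small in-memory snippet instead of a live page (the dl.boccat markup below is invented for illustration; on Python 3, urllib2 is replaced by urllib.request, and find_all is the modern spelling of findAll):

```python
from bs4 import BeautifulSoup

BASE_URL = "http://www.fashiontrends.pk"

# Hypothetical markup mimicking the dl/dd structure the question assumes.
html = """
<dl class="boccat">
  <dd><a href="/women">Women</a></dd>
  <dd><a href="/men">Men</a></dd>
</dl>
"""

soup = BeautifulSoup(html, "html.parser")
# The second positional argument to find() filters on the class attribute.
boccat = soup.find("dl", "boccat")
category_links = [BASE_URL + dd.a["href"] for dd in boccat.find_all("dd")]
print(category_links)
# -> ['http://www.fashiontrends.pk/women', 'http://www.fashiontrends.pk/men']
```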
Restrict your search with the following:

links = soup.select('dl.boccat dd a[href]')
This will only find link (a) elements that have an href attribute and that sit under a dd tag, which is itself under a dl tag with the class boccat.
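A quick sketch of what that selector does and does not match (the snippet below is made up for illustration):

```python
from bs4 import BeautifulSoup

html = """
<dl class="boccat">
  <dd><a href="/match">inside a dd: matched</a></dd>
  <dd><a>no href attribute: skipped</a></dd>
  <dt><a href="/dt">inside a dt, not a dd: skipped</a></dt>
</dl>
<a href="/outside">outside the dl: skipped</a>
"""

soup = BeautifulSoup(html, "html.parser")
links = soup.select('dl.boccat dd a[href]')
# Only the a tag with an href, inside a dd, inside dl.boccat qualifies.
print([a["href"] for a in links])
# -> ['/match']
```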

If some of your URLs are relative, use urlparse.urljoin here:

from urlparse import urljoin

def get_category_links(section_url):
    response = urlopen(section_url)
    soup = BeautifulSoup(response, "lxml")
    return [urljoin(BASE_URL, link["href"])
            for link in soup.select('dl.boccat dd a[href]')]

There is no need to call .read() on the response object; BeautifulSoup will do that for you.

However, the specific URL you gave us contains no <dl> elements at all in the HTML it serves to a browser or to urllib2:

>>> from urllib2 import urlopen
>>> source = urlopen('http://www.fashiontrends.pk').read()
>>> '<dl' in source
False

There are no <dl> tags in that page, period. You have not shown us any code that indicates which page you are actually loading.

Comment: Yes, and what exactly do you need help with? There is no <dl> tag in that page.
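How urljoin resolves relative versus absolute hrefs can be sketched with the standard library alone (on Python 3 the function lives in urllib.parse rather than urlparse; the URLs below are illustrative):

```python
from urllib.parse import urljoin

BASE_URL = "http://www.fashiontrends.pk"

# A relative href is resolved against the base URL...
print(urljoin(BASE_URL, "/women/dresses"))
# -> http://www.fashiontrends.pk/women/dresses

# ...while an href that is already absolute is returned unchanged,
# which is why urljoin is safer than plain string concatenation.
print(urljoin(BASE_URL, "http://other.example/page"))
# -> http://other.example/page
```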