Extracting an attribute from HTML with Beautiful Soup in Python

python, html, beautifulsoup

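I am trying to extract jpg image names from an HTML page using the BeautifulSoup library in Python. In the page source, every srcset attribute is followed by a jpg file name, and I want to extract all of those jpg files. However, whenever I run the code below, it prints None. For example, srcset="https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg" can be found in the HTML.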

import urllib2 
html = urllib2.urlopen("https://www.shopstyle.com/p/prada-notch-lapel-fitted-blazer/645742403").read()

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

print soup.find(attrs= {"img":"srcset"})

Try this:

soup.find('img')['srcset']
'https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg'
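
The one-liner above assumes that the first img tag on the page has a srcset attribute. When it does not, indexing raises KeyError, and when the page has no img tag at all, find returns None. A minimal Python 3 sketch of a more defensive lookup follows; the sample markup and the example.com URL are made up for illustration and are not taken from the page in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration: the first <img> has no srcset attribute.
sample_html = """
<div>
  <img src="logo.png">
  <img srcset="https://img.example.com/a_best.jpg">
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
first_img = soup.find("img")  # None if the page has no <img> at all

# Tag.get returns None for a missing attribute, where first_img["srcset"]
# would raise KeyError.
first_srcset = first_img.get("srcset") if first_img is not None else None
print(first_srcset)

# Collect srcset only from the tags that actually carry the attribute.
values = [img["srcset"] for img in soup.find_all("img") if img.has_attr("srcset")]
print(values)
```

Using get and has_attr keeps the script running on pages where some images are declared with src only.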

To find all the URLs in srcset attributes, you can do the following:

import urllib2 
html = urllib2.urlopen("https://www.shopstyle.com/p/prada-notch-lapel-fitted-blazer/645742403").read()

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

for el in soup.findAll('img', attrs = {'srcset' : True}):
    print el['srcset']

Your query returns None because the attrs parameter expects a dictionary with attribute names as keys and filters as values. See the explanation in the Beautiful Soup documentation.
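
To make the difference concrete, here is a small Python 3 sketch that runs both forms of the query against an inline HTML string. The markup and the example.com URL are assumptions made for the example; only find, find_all and the attrs dictionary come from the thread.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical one-image document used only for this illustration.
sample_html = '<p><img srcset="https://img.example.com/a_best.jpg"></p>'
soup = BeautifulSoup(sample_html, "html.parser")

# Asks for any tag that has an attribute named "img" with the value "srcset".
# No such tag exists, so the result is None.
wrong = soup.find(attrs={"img": "srcset"})

# Tag name first, then attribute name as key; True means "any value".
right = soup.find("img", attrs={"srcset": True})

# The filter value can also be a compiled regular expression.
jpg_only = soup.find_all("img", attrs={"srcset": re.compile(r"\.jpg")})

print(wrong)
print(right["srcset"])
print(len(jpg_only))
```

The key of the attrs dictionary is always an attribute name, never a tag name, which is why the tag name is passed as the first positional argument.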

"I want to extract all the jpg files"

Output:

['https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg', 'https://img.shopstyle-cdn.com/pim/16/c3/16c3e46d3547d6404ba29b61b8f229fd_best.jpg', 'https://img.shopstyle-cdn.com/pim/65/e6/65e6d0e3c0160f0aca361934b999f0c9_best.jpg', 'https://img.shopstyle-cdn.com/sim/31/94/3194ec1ca5e3a56cb83f708533b9084d/prada-notch-lapel-fitted-blazer.jpg', 'https://img.shopstyle-cdn.com/sim/16/c3/16c3e46d3547d6404ba29b61b8f229fd/prada-notch-lapel-fitted-blazer.jpg', 'https://img.shopstyle-cdn.com/sim/65/e6/65e6d0e3c0160f0aca361934b999f0c9/prada-notch-lapel-fitted-blazer.jpg', 'https://img.shopstyle-cdn.com/pim/73/76/737689fa284d6640f7619e5f2f3558a5_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/2c/b0/2cb0acb147bd20df78bc482d66d7218b_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/5c/20/5c20824543749df684f3264c5e976e8c_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/48/b8/48b81f60d61e5c23cdfa343940e43ce9_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/ff/08/ff081818581b0363d4c0ec02c2cba5d4_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/86/0a/860ae7abdde0bf40046d53668abbe126_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/2f/5c/2f5c78d017052b14fd2db0d886a2a326_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/49/d5/49d5de5b62e6ddc0864afee987dd5e67_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/50/04/5004bf25e97ac0e4564d8a219a3b34b4_xlarge.jpg', 'https://img.shopstyle-cdn.com/pim/a8/76/a876ac6696e140f34e4cf82b5dbcaadf_xlarge.jpg']
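
In the output above every srcset value is a single URL, but a srcset attribute may in general hold several comma-separated candidates, each followed by a width or density descriptor. The Python 3 sketch below shows one possible way to reduce raw srcset strings to a de-duplicated list of jpg URLs using only the standard library. The helper names srcset_urls, is_jpg and jpg_urls are hypothetical, the example.com entries are invented, and the naive comma split would mis-handle URLs that themselves contain commas.

```python
from urllib.parse import urlparse


def srcset_urls(srcset_value):
    """Return the URLs in a srcset string, dropping descriptors such as 2x or 480w."""
    urls = []
    for candidate in srcset_value.split(","):
        parts = candidate.strip().split()
        if parts:
            urls.append(parts[0])
    return urls


def is_jpg(url):
    """Check the path component only, so query strings do not affect the result."""
    return urlparse(url).path.lower().endswith((".jpg", ".jpeg"))


def jpg_urls(srcset_values):
    """Flatten srcset strings into a list of unique jpg URLs, keeping first-seen order."""
    result = []
    for value in srcset_values:
        for url in srcset_urls(value):
            if is_jpg(url) and url not in result:
                result.append(url)
    return result


# The first entry is from the thread; the other two are invented for illustration.
example_values = [
    "https://img.shopstyle-cdn.com/pim/31/94/3194ec1ca5e3a56cb83f708533b9084d_best.jpg",
    "https://img.example.com/a.jpg 1x, https://img.example.com/a_2x.jpg 2x",
    "https://img.example.com/icon.png",
]
print(jpg_urls(example_values))
```

The strings passed to jpg_urls would typically be the el['srcset'] values gathered by the findAll loop shown earlier.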

Can you post the entire markup in which the image is found?

Hope this helps: soup.find('img', attrs={'srcset': True})