What does findAll("a") mean in BeautifulSoup? (web scraping, Python 3.6)


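I'm new to Python. Can someone explain what findAll("a") means in the code below? Can I replace "a" with other letters, such as g, h, or m? Does "a" mean finding the letter "a" in the article?

And does href=re.compile("^(/wiki/)((?!:).)*$") mean finding the links whose names start with wiki?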

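from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

html = urlopen("http://en.wikipedia.org/wiki/Kevin_Bacon")
# Name the parser explicitly so BeautifulSoup does not have to guess one
bsObj = BeautifulSoup(html, "html.parser")
# Inside the main content div, find every <a> tag whose href starts
# with /wiki/ and contains no colon
for link in bsObj.find("div", {"id": "bodyContent"}).findAll(
        "a", href=re.compile("^(/wiki/)((?!:).)*$")):
    if 'href' in link.attrs:
        print(link.attrs['href'])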

Can anyone also recommend some good books on web scraping with Python 3.6 that are easy for a beginner to follow?

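findAll("a") searches for all "a" (anchor) tags, that is, every <a> element, which is how HTML marks up links. The argument is a tag name, not a letter to look for in the article text.

Yes, you can pass any other valid HTML tag name in place of "a", for example "h1", "b", or "strong". Single letters such as g, h, or m are not HTML tag names, so on an ordinary HTML page they would typically match nothing.

Also, re.compile("^(/wiki/)((?!:).)*$") matches every href that begins with /wiki/ and contains no colon, so passing it as href=... keeps only the links that start with /wiki/ and skips namespaced pages such as File: or Category:.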
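As a rough illustration of what that regular expression accepts and rejects, here is a small sketch that uses only the standard re module. The href values in the list are made up for the example and are not taken from the actual Wikipedia page.

```python
import re

# Same pattern as in the question: starts with /wiki/, and no character
# up to the end of the string is a colon (negative lookahead (?!:)).
wiki_article = re.compile("^(/wiki/)((?!:).)*$")

# Hypothetical href values, chosen only for illustration
hrefs = [
    "/wiki/Kevin_Bacon",               # ordinary article link
    "/wiki/Category:American_actors",  # contains a colon
    "/wiki/File:Kevin_Bacon.jpg",      # contains a colon
    "https://example.com/wiki/Page",   # does not start with /wiki/
    "#cite_note-1",                    # in-page anchor
]

matches = [h for h in hrefs if wiki_article.match(h)]
print(matches)  # ['/wiki/Kevin_Bacon']
```

Only the first entry survives, because it is the only one that both starts with /wiki/ and has no colon anywhere after that prefix.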

But in this code, why do we use ("img") after findAll?

images = bsObj.findAll("img", {"src": re.compile("\.\.\/img\/gifts/img.*\.jpg")})
for image in images:
    print(image["src"])
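The first argument to findAll is always the name of the tag to look for, so "a" and "img" play the same role. The sketch below shows this on a tiny, made-up HTML snippet (the markup and file names are assumptions for illustration only). It uses the legacy spelling findAll to match the code in the question; find_all is the current name for the same method in bs4.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a real page
html = """
<div id="bodyContent">
  <a href="/wiki/Kevin_Bacon">Kevin Bacon</a>
  <a href="/wiki/File:Photo.jpg">Photo</a>
  <b>bold text</b>
  <img src="../img/gifts/img1.jpg">
  <img src="../img/logo.png">
</div>
"""
soup = BeautifulSoup(html, "html.parser")

anchors = soup.findAll("a")   # every <a> tag
bolds = soup.findAll("b")     # every <b> tag
nothing = soup.findAll("g")   # no <g> tags in this snippet, so the result is empty

# Tag name plus an attribute filter: only <img> tags whose src matches the regex
gift_images = soup.findAll("img", {"src": re.compile(r"\.\./img/gifts/img.*\.jpg")})

print(len(anchors), len(bolds), len(nothing))  # 2 1 0
print([img["src"] for img in gift_images])     # ['../img/gifts/img1.jpg']
```

In both calls the string names the tag, and the optional second part (href=... or the {"src": ...} dictionary) narrows the result by attribute value.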