Web scraping: Beautiful Soup replaces certain symbols in URLs with other symbols


web-scraping character-encoding


I am parsing a web page with Beautiful Soup and trying to retrieve all the links inside h3 tags:

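import requests
from bs4 import BeautifulSoup

page = requests.get("https://www....")
soup = BeautifulSoup(page.text, "html.parser")
links = []
# Collect the href of the first <a> inside each <h3>
for item in soup.find_all('h3'):
    links.append(item.a['href'])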

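However, the links it finds differ from the links in the page. For example, in the links Beautiful Soup returns, "?" is replaced with "%3F" and "=" is replaced with "%3D". Why is that?

Thanks.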

You can use urllib.parse:

from urllib import parse
parse.unquote(item.a['href'])
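
To see what unquote does on its own, here is a minimal sketch. The helper name decode_href and the example.com URL are made up for illustration; in the loop above you would pass item.a['href'] instead.

```python
from urllib.parse import unquote


def decode_href(href: str) -> str:
    """Undo percent-encoding in a single href value (hypothetical helper)."""
    return unquote(href)


# Made-up href standing in for a value such as item.a['href'].
escaped = "https://example.com/search%3Fq%3Dsoup"
print(decode_href(escaped))  # https://example.com/search?q=soup
```

A string with no percent-escapes passes through unchanged, so it should be safe to apply this to every link in the list.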

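This is URL escaping (percent-encoding). But I cannot reproduce the problem. Which version of Python are you using? I am using Python 3.5.3.

Thanks, but could you explain the root cause of this problem?

The cause may be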