Python: how do I find certain links on this page?
Please help me make this script shorter.
import urllib.request  # "import urllib" alone does not load the request submodule
import pprint
import requests
import bs4

def get_friend_links(url, userName, html):
    soup = bs4.BeautifulSoup(html)
    # Restrict the search to the friends block of the profile page
    links = soup.find('div', {'id': 'friends_overview'})
    links2 = links.findAll('a', {'class': 'ipsUserPhotoLink'})
    friendLinks = []
    for el in links2:
        friendLink = el['href']
        friendLinks.append(friendLink)
    pprint.pprint(friendLinks)
    return friendLinks  # without this the caller below receives None

url = 'http://forum.saransk.ru/user/20892-ujdyj/'
userName = url.split('/')[-2]
userName = userName.replace('-', '_')
html = urllib.request.urlopen(url).read().decode('utf-8')
friendLinks = get_friend_links(url, userName, html)
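It works, but the code feels too long, especially the loop that builds the list. That isn't good.

In theory, you could do it in a single statement: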
friendLinks = [el['href'] for el in bs4.BeautifulSoup(html).find('div', {'id': 'friends_overview'}).findAll('a', {'class': 'ipsUserPhotoLink'})]
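But that is a very long line that is hard to read. Basically, it is too much. In my opinion, a good middle ground is to replace the explicit for loop with a list comprehension: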
friendLinks = [el['href'] for el in links2]
Or even:
friendLinks = [el['href'] for el in links.findAll('a', {'class': 'ipsUserPhotoLink'})]
But in my opinion, anything beyond that is overdoing it and hurts readability.
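
For reference, here is a minimal, self-contained sketch of the middle-ground version. It is an illustration rather than the definitive solution. The SAMPLE_HTML markup and its forum.example URLs are made up so the snippet runs without network access. The explicit 'html.parser' argument and the None check are additions that are not in the original script.

```python
import bs4

# Hypothetical sample markup standing in for the real profile page.
SAMPLE_HTML = """
<div id="friends_overview">
  <a class="ipsUserPhotoLink" href="http://forum.example/user/1-alice/">Alice</a>
  <a class="ipsUserPhotoLink" href="http://forum.example/user/2-bob/">Bob</a>
  <a class="other" href="http://forum.example/help/">Help</a>
</div>
<a class="ipsUserPhotoLink" href="http://forum.example/user/3-carol/">Outside</a>
"""


def get_friend_links(html):
    """Return the href of every photo link inside the friends block."""
    # Assumption: the standard-library parser is good enough for this page.
    soup = bs4.BeautifulSoup(html, 'html.parser')
    friends = soup.find('div', {'id': 'friends_overview'})
    if friends is None:
        # Assumption: a page without a friends block should yield no links.
        return []
    # One list comprehension replaces the explicit for loop and append.
    return [el['href'] for el in friends.find_all('a', {'class': 'ipsUserPhotoLink'})]


friend_links = get_friend_links(SAMPLE_HTML)
print(friend_links)
```

With the sample markup, only the two links that carry the ipsUserPhotoLink class and sit inside the friends_overview div are collected. The link outside the div and the link with a different class are skipped.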