Python: unable to get href from 'a' with Beautiful Soup

Tags: python, selenium, web-scraping, beautifulsoup

I'm trying to get the links from a YouTube playlist with the following code:

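from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import pyperclip
import time

url = input('Please enter youtube playlist url: ')
driver = webdriver.Firefox()
driver.get(url)

# Scroll to the bottom twice so that more playlist items get loaded.
# find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name,
# which is no longer available in recent Selenium 4 releases.
elem = driver.find_element(By.TAG_NAME, 'html')
elem.send_keys(Keys.END)
time.sleep(3)
elem.send_keys(Keys.END)

innerHTML = driver.execute_script("return document.body.innerHTML")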

soup = bs(innerHTML, 'html.parser')
res = soup.select('div#content.style-scope.ytd-playlist-video-renderer a.yt-simple-endpoint.style-scope.ytd-playlist-video-renderer')

whole_list = ''

for i in res:
    print(i.get('href'))
    print(i['href'])
    print(i.attrs['href'])
#    whole_list = whole_list + "  '" + i.get('href') + "', \n"

print(whole_list)
pyperclip.copy(whole_list)

driver.close()

And this is how YouTube's playlist video component appears in Chrome developer tools:

<a class="yt-simple-endpoint style-scope ytd-playlist-video-renderer" href="/watch?v=QXeEoD0pB3E&amp;list=PLsyeobzWxl7poL9JTVyndKe62ieoN-MZ3&amp;index=2&amp;t=0s">
          <ytd-thumbnail id="thumbnail" height="68" width="120" class="style-scope ytd-playlist-video-renderer">


    <a id="thumbnail" class="yt-simple-endpoint inline-block style-scope ytd-thumbnail" aria-hidden="true" tabindex="-1" rel="null" href="/watch?v=QXeEoD0pB3E&amp;list=PLsyeobzWxl7poL9JTVyndKe62ieoN-MZ3&amp;index=2&amp;t=0s">
      <yt-img-shadow class="style-scope ytd-thumbnail no-transition" style="background-color: transparent;" loaded=""><img id="img" class="style-scope yt-img-shadow" alt="" width="120" src="https://i.ytimg.com/vi/QXeEoD0pB3E/hqdefault.jpg?sqp=-oaymwEZCPYBEIoBSFXyq4qpAwsIARUAAIhCGAFwAQ==&amp;rs=AOn4CLCsnnE_5VNrXFHejH29sP0T7NSSmw"></yt-img-shadow>

      <div id="overlays" class="style-scope ytd-thumbnail"><ytd-thumbnail-overlay-resume-playback-renderer class="style-scope ytd-thumbnail"><div id="progress" class="style-scope ytd-thumbnail-overlay-resume-playback-renderer" style="width: 100%;"></div></ytd-thumbnail-overlay-resume-playback-renderer><ytd-thumbnail-overlay-time-status-renderer class="style-scope ytd-thumbnail" overlay-style="DEFAULT"><span class="style-scope ytd-thumbnail-overlay-time-status-renderer" aria-label="66 seconds">
      1:06
    </span></ytd-thumbnail-overlay-time-status-renderer><ytd-thumbnail-overlay-now-playing-renderer class="style-scope ytd-thumbnail">

    <span class="style-scope ytd-thumbnail-overlay-now-playing-renderer">Now playing</span>
  </ytd-thumbnail-overlay-now-playing-renderer></div>
      <div id="mouseover-overlay" class="style-scope ytd-thumbnail"></div>
      <div id="hover-overlays" class="style-scope ytd-thumbnail"></div>
    </a>
  </ytd-thumbnail>
          <div id="meta" class="style-scope ytd-playlist-video-renderer">
            <h3 class="style-scope ytd-playlist-video-renderer">
              <ytd-badge-supported-renderer class="style-scope ytd-playlist-video-renderer">

    <dom-repeat id="repeat" as="badge" class="style-scope ytd-badge-supported-renderer"><template is="dom-repeat"></template></dom-repeat>
  </ytd-badge-supported-renderer>
              <span id="video-title" class="style-scope ytd-playlist-video-renderer" aria-label="#0 Python Tutorial | Python Programming Tutorial for Beginners | Course Introduction by Telusko 1 year ago 66 seconds 1,108,432 views" title="#0 Python Tutorial | Python Programming Tutorial for Beginners | Course Introduction">
                #0 Python Tutorial | Python Programming Tutorial for Beginners | Course Introduction
              </span>
            </h3>
            <ytd-video-meta-block class="playlist style-scope ytd-playlist-video-renderer">



    <div id="metadata" class="style-scope ytd-video-meta-block">
      <div id="byline-container" class="style-scope ytd-video-meta-block">
        <ytd-channel-name id="channel-name" class="style-scope ytd-video-meta-block">

    <div id="container" class="style-scope ytd-channel-name">
      <div id="text-container" class="style-scope ytd-channel-name">
        <yt-formatted-string id="text" class="style-scope ytd-channel-name complex-string" ellipsis-truncate="" title="Telusko" has-link-only_=""><a class="yt-simple-endpoint style-scope yt-formatted-string" spellcheck="false" href="/user/javaboynavin">Telusko</a></yt-formatted-string>
      </div>
    </div>
    <ytd-badge-supported-renderer class="style-scope ytd-channel-name" disable-upgrade="" hidden="">
    </ytd-badge-supported-renderer>
  </ytd-channel-name>
        <div id="separator" class="style-scope ytd-video-meta-block">•</div>
      </div>
      <div id="metadata-line" class="style-scope ytd-video-meta-block">
        <dom-repeat strip-whitespace="" class="style-scope ytd-video-meta-block"><template is="dom-repeat"></template></dom-repeat>
      </div>
    </div>
    <div id="additional-metadata-line" class="style-scope ytd-video-meta-block">
      <dom-repeat class="style-scope ytd-video-meta-block"><template is="dom-repeat"></template></dom-repeat>
    </div>

  </ytd-video-meta-block>
          </div>
          <ytd-badge-supported-renderer id="badges" class="style-scope ytd-playlist-video-renderer" disable-upgrade="" hidden="">
          </ytd-badge-supported-renderer>
          <yt-formatted-string id="contributor" class="style-scope ytd-playlist-video-renderer" hidden=""></yt-formatted-string>
        </a>

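As you can see, I'm trying all three suggestions I found online: i.get('href') gives me None, and the other two options raise errors. I've been working on this since yesterday and can't find what I'm doing wrong.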

Sometimes an <a> may not have an href, so I would use an if check to skip it:

for i in res:
    href = i.get('href')
    if href:
        whole_list = whole_list + "  '" + href + "', \n"
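
To see why the three calls in the question behave differently, here is a minimal sketch on a static HTML string. The markup and variable names are made up for illustration and are not taken from the live YouTube page. get('href') returns None when the attribute is missing, while i['href'] and i.attrs['href'] raise KeyError. The a[href] selector is an alternative that I am assuming fits this case: it only matches anchors that actually carry an href.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first anchor has no href, the second one does.
html = """
<a class="yt-simple-endpoint style-scope ytd-playlist-video-renderer"></a>
<a class="yt-simple-endpoint style-scope ytd-playlist-video-renderer"
   href="/watch?v=QXeEoD0pB3E">video</a>
"""

soup = BeautifulSoup(html, 'html.parser')
anchors = soup.select('a.yt-simple-endpoint.style-scope.ytd-playlist-video-renderer')

first = anchors[0]
missing_with_get = first.get('href')  # None, no exception

try:
    first['href']  # same dictionary lookup as first.attrs['href']
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Option 1: skip empty values, as in the loop above.
hrefs_filtered = [a.get('href') for a in anchors if a.get('href')]

# Option 2: let the CSS selector require the attribute.
hrefs_selected = [a['href'] for a in soup.select('a.yt-simple-endpoint[href]')]

print(missing_with_get, raised_key_error)
print(hrefs_filtered)
print(hrefs_selected)
```

Both options give the same list here. Filtering in Python keeps the selector identical to the one already in use, while the attribute selector moves the check into the query.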

The following script gives me all the hrefs for a playlist. Its printed output shows that the first i yields None as well, but I skip that value.

from bs4 import BeautifulSoup as BS
from selenium import webdriver
import pyperclip
import time

#url = input('Please enter youtube playlist url: ')
url = 'https://www.youtube.com/playlist?list=PLmNPvQr9Tf-a4MrEG5thq3qzlkrF5NFbC'

driver = webdriver.Firefox()
driver.get(url)

time.sleep(3)
html = driver.page_source
soup = BS(html, 'html.parser')
res = soup.select('a.yt-simple-endpoint.style-scope.ytd-playlist-video-renderer')

all_hrefs = []

for i in res:
    href = i.get('href')
    print(href)
    if href:
        all_hrefs.append(href)

text = ',\n'.join(["  '{}'".format(x) for x in all_hrefs])
print(text)

pyperclip.copy(text)

driver.close()
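
The collected values are site-relative paths such as /watch?v=..., as the markup in the question shows. If absolute links are wanted (an assumption on my part, since the script above copies them unchanged), the standard library can resolve them. BASE_URL, to_absolute and video_id are hypothetical names introduced only for this sketch.

```python
from urllib.parse import urljoin, urlparse, parse_qs

BASE_URL = 'https://www.youtube.com'  # assumed base; not part of the original script


def to_absolute(hrefs, base=BASE_URL):
    """Resolve relative hrefs against base, skipping None and empty strings."""
    return [urljoin(base, h) for h in hrefs if h]


def video_id(url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get('v')
    return values[0] if values else None


# Sample input: one missing href and one href copied from the question's markup.
sample = [None, '/watch?v=QXeEoD0pB3E&list=PLsyeobzWxl7poL9JTVyndKe62ieoN-MZ3&index=2&t=0s']
absolute = to_absolute(sample)
print(absolute)
print([video_id(u) for u in absolute])
```

In the script above this could be applied as to_absolute(all_hrefs) before building the text that goes to the clipboard.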

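What is the error? Did you check what is in i? Maybe you got an item that has no href? Did you use driver.page_source to get the HTML?

You can use the YouTube API.

@furas I tried printing the element I got back with print(res[0]), and it is definitely a tag, but I still can't get the href.

I tested the code on a few playlists and it gives None only for the first i; after that I get an href for every item in the playlist. So I would just use get('href') with an if check to skip None.

Thanks for the tip at the top... I used an if statement and it worked like a charm. Thanks for the code too. To think I couldn't come up with something so simple :( ... by the way, thanks again, man.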