Can't get the text inside an <a> tag when web scraping with BeautifulSoup
I'm trying to get the duration of a particular song using .text. The song title and the artist name are displayed, but the duration is not. Here is my code:
from bs4 import BeautifulSoup
import requests
import csv

source = requests.get("https://gaana.com/artist/arijit-singh/latest/asc").text
soup = BeautifulSoup(source, "lxml")

with open("arijit_singh_new_update.csv", "w") as arijit_csv_file:
    arijit_csv_file_writer = csv.writer(arijit_csv_file)
    arijit_csv_file_writer.writerow(["title", "artists", "duration"])

    title_tag = soup.find("div", class_="playlist_thumb_det")
    title = title_tag.a.text

    composer_tag = soup.find("li", class_="s_artist p_artist desktop")
    composer = composer_tag.a.text

    duration_tag = soup.find("li", class_="s_duration")
    duration = duration_tag.a.text
    print(duration)
To save the song title, artist, and duration to a CSV file, you can use the following example:
import csv
import requests
from bs4 import BeautifulSoup

url = "https://gaana.com/artist/arijit-singh/latest/asc"
soup = BeautifulSoup(requests.get(url).content, "lxml")

with open('arijit_singh_new_update.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["title", "artists", "duration"])
    # Pair up the i-th title, artist, and duration link on the page.
    for song, artist, duration in zip(
        soup.select('.s_title a[data-type="playSong"]'),
        soup.select('.s_artist [data-type="playSong"]'),
        soup.select('.s_duration [data-type="playSong"]'),
    ):
        writer.writerow([song.text, artist.text, duration.text])
This creates the CSV file arijit_singh_new_update.csv, which opens in LibreOffice as a sheet with the three columns title, artists, and duration.
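To check how the three select() calls line up row by row without hitting the network, here is a minimal sketch that runs against a small inline HTML fragment. The markup is a hypothetical approximation of the page: only the class names and the data-type="playSong" attribute come from the answer above, and the real structure may differ. The helper name extract_rows is made up for illustration, and html.parser is used instead of lxml only to avoid an extra dependency.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup: an assumption, not copied from the real page.
SAMPLE_HTML = """
<ul>
  <li class="s_title"><a data-type="playSong">Song A</a></li>
  <li class="s_artist"><a data-type="playSong">Artist A</a></li>
  <li class="s_duration"><a data-type="playSong">03:45</a></li>
</ul>
<ul>
  <li class="s_title"><a data-type="playSong">Song B</a></li>
  <li class="s_artist"><a data-type="playSong">Artist B</a></li>
  <li class="s_duration"><a data-type="playSong">04:10</a></li>
</ul>
"""


def extract_rows(html):
    """Return [title, artist, duration] rows (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    titles = soup.select('.s_title a[data-type="playSong"]')
    artists = soup.select('.s_artist [data-type="playSong"]')
    durations = soup.select('.s_duration [data-type="playSong"]')
    # zip() stops at the shortest list, so a missing link in one column
    # silently shortens the output instead of raising an error.
    return [
        [t.get_text(strip=True), a.get_text(strip=True), d.get_text(strip=True)]
        for t, a, d in zip(titles, artists, durations)
    ]


rows = extract_rows(SAMPLE_HTML)

# Write to an in-memory buffer instead of a file to inspect the CSV text.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["title", "artists", "duration"])
writer.writerows(rows)
print(buffer.getvalue())
```

Because zip() pairs items purely by position, this approach assumes every song on the page has exactly one matching link in each of the three columns.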
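Oh, okay, this solution works for me, but I still don't know what the mistake in my code was. I could print the song name and the singer name, but not the duration. I'd be glad if you could explain it. Anyway, thank you!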
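One plausible explanation, offered as a guess because the page's markup is not shown in this thread: soup.find() returns only the first matching element, and tag.a returns only the first <a> inside it. If the first link inside li.s_duration carries no text, .a.text comes back as an empty string even though the duration is present in a later link. The sketch below reproduces that situation on hypothetical markup (the icon link is invented for illustration) and compares it with the attribute selector used in the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, not copied from the real page: the first <a> is empty
# and the duration sits in a second <a> marked with data-type="playSong".
SAMPLE_HTML = """
<ul>
  <li class="s_duration"><a class="icon"></a><a data-type="playSong">03:45</a></li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The question's approach: first matching <li>, then its first <a>.
first_link_text = soup.find("li", class_="s_duration").a.text

# The answer's approach: select the link by its data-type attribute.
selected = soup.select_one('.s_duration [data-type="playSong"]')
selected_text = selected.text if selected is not None else None

print(repr(first_link_text))
print(repr(selected_text))
```

Printing the tag itself, for example print(duration_tag.prettify()) in the original script, is a quick way to confirm whether this is what happens on the real page.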