Python: how to scrape all the text or paragraphs with Beautiful Soup
I want to scrape all the text in the paragraphs. Below are the link and the code I wrote. Note: the code is currently meant to extract the first 10 pages.
import requests
from bs4 import BeautifulSoup
import numpy as np
import pandas as pd
import re
import time

# create the lists that will hold the scraped data
Title = []
Paragraph = []

# scrape the first 10 pages (the number of pages can be changed)
# np.arange excludes the stop value, so 11 is needed to reach page 10
pages = np.arange(1, 11)
for page in pages:
    response = requests.get("https://www.malaymail.com/news/money?page=" + str(page))
    soup = BeautifulSoup(response.text, 'html.parser')
    list_all = soup.find_all('div', class_='data')
    for h in list_all:
        title = h.find('h2').text
        # find() returns only the first matching <p>, as a Tag object (or None)
        parag = h.find('p', attrs={'class': 'summary d-none d-md-block'})
        Title.append(title if title else '')
        Paragraph.append(parag if parag else '')

# create a dataframe
pd.set_option('display.max_rows', None)
pd.set_option('display.max_colwidth', None)
df = pd.DataFrame({'Title': Title,
                   'Content': Paragraph})
df
Current output:
So the expected output I want is:
Use find_all('p'): it returns a list of all the paragraphs it finds, which you can then iterate over.
Thank you very much for your help :)
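The find_all('p') suggestion could look roughly like the sketch below. It runs on a small inline HTML sample rather than the live site, so the markup, the `SAMPLE_HTML` name and the `extract_items` helper are all assumptions made for illustration; the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one listing page (not the real site's HTML).
SAMPLE_HTML = """
<div class="data">
  <h2>First headline</h2>
  <p class="summary d-none d-md-block">First summary.</p>
  <p>Second paragraph of the first item.</p>
</div>
<div class="data">
  <h2>Second headline</h2>
</div>
"""


def extract_items(html):
    """Return (title, joined paragraph text) pairs for every div.data block."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="data"):
        heading = block.find("h2")
        title = heading.get_text(strip=True) if heading else ""
        # find_all('p') returns a list of every <p> tag inside the block;
        # get_text() turns each Tag into plain text.
        paragraphs = [p.get_text(strip=True) for p in block.find_all("p")]
        rows.append((title, " ".join(paragraphs)))
    return rows


rows = extract_items(SAMPLE_HTML)
for title, text in rows:
    print(title, "->", text)
```

Storing the result of get_text() instead of the Tag object itself keeps the DataFrame column as plain strings, and a block without any <p> simply yields an empty string.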