Python: how to extract text from all paragraphs on a website that contain a specific string
python, web-scraping, beautifulsoup

I ran into a problem with this. I want to extract my local language and its meaning in tabular form.
import requests
from bs4 import BeautifulSoup
res2 = requests.get('https://steemit.com/nigeria/@leopantro/50-yoruba-proverbs-and-idioms')
soup2 = BeautifulSoup(res2.content,'html')
Yoruba = []
English = []
for ol in soup2.findAll('ol'):
    proverb = ol.find('li')
    Yoruba.append(proverb.text)
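I successfully extracted my local language into a list. I also want to extract the meaning, that is, every sentence that starts with the string "Meaning:", into another list, for example: ["Your status in life determines your attitude towards your peers", "Behave maturely and avoid a bad reputation", etc.]

Just search all the paragraphs and check whether each paragraph's text starts with "Meaning:". Try this: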
import requests
from bs4 import BeautifulSoup
res2 = requests.get('https://steemit.com/nigeria/@leopantro/50-yoruba-proverbs-and-idioms')
soup2 = BeautifulSoup(res2.content,'html')
yoruba = []
english = []
# Each proverb is the first <li> of its own <ol>.
for ol in soup2.findAll('ol'):
    proverb = ol.find('li')
    yoruba.append(proverb.text)
# Keep only the paragraphs whose text starts with "Meaning:".
for paragraph in soup2.findAll('p'):
    if paragraph.text.startswith("Meaning:"):
        english.append(paragraph.text)
# Drop the "Meaning: " prefix from every collected paragraph.
english = [x.replace("Meaning: ", "") for x in english]
print(english)
Prints:
[' Necessity is mother of invention, creativity is often achieved after overcoming many difficulties.',
' Everyone looks the same on the outside but everyone has problems that are invisible to outsiders.',
...
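The strings in the output above still begin with a space, presumably because more whitespace follows the prefix than the replace call removes. A minimal sketch of a stricter cleanup, working on plain strings so it runs without network access: the helper name extract_meanings and the sample texts are hypothetical, and the sample list stands in for [p.text for p in soup2.findAll('p')].

```python
def extract_meanings(paragraph_texts, prefix="Meaning:"):
    """Return the text after `prefix` for every paragraph that starts with it."""
    meanings = []
    for text in paragraph_texts:
        text = text.strip()
        if text.startswith(prefix):
            # Slice off the prefix, then strip any whitespace that follows it
            # (regular spaces as well as non-breaking spaces).
            meanings.append(text[len(prefix):].strip())
    return meanings


# Hypothetical paragraph texts used only for illustration.
sample = [
    "Meaning:  Necessity is mother of invention.",
    "Translation: not a meaning paragraph",
    "Meaning:\xa0Everyone looks the same on the outside.",
]
print(extract_meanings(sample))
```

Slicing by the prefix length and then calling strip() avoids depending on exactly one space after the colon, which is what replace("Meaning: ", "") assumes.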
This script extracts the proverbs, translations, and meanings and creates a DataFrame from them. The list of meanings is in data["meaning"].
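A minimal sketch of the tabular idea, not the answerer's original script. It assumes the proverb and meaning lists are already extracted and line up one to one; the function name build_table, the column names "proverb" and "meaning", and the placeholder proverbs are all hypothetical.

```python
def build_table(proverbs, meanings):
    """Pair each proverb with its meaning as a column-oriented table."""
    if len(proverbs) != len(meanings):
        # Misaligned lists would silently pair the wrong rows, so fail early.
        raise ValueError(
            f"{len(proverbs)} proverbs but {len(meanings)} meanings"
        )
    return {"proverb": list(proverbs), "meaning": list(meanings)}


# Placeholder inputs standing in for the scraped `yoruba` and `english` lists.
data = build_table(
    ["<proverb 1>", "<proverb 2>"],
    [
        "Necessity is mother of invention.",
        "Everyone looks the same on the outside.",
    ],
)
print(data["meaning"])
```

If pandas is installed, passing this dictionary to pandas.DataFrame(data) gives a DataFrame with one column per key.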
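Can you be more specific about what the problem is? Also, variable and function names should follow the lower_case_with_underscores style.

After successfully extracting all of my local language into a list (the code above), extracting its meaning from the website above into a list is where I am stuck.

You could provide a small equivalent HTML snippet rather than referring to the site content itself. See how to create one.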