Python: How to extract text from all paragraphs on a website that contain a specific string (python, web-scraping, beautifulsoup)


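I have run into a problem with this. I want to extract the proverbs in my local language together with their meanings, in tabular form.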

import requests
from bs4 import BeautifulSoup

res2 = requests.get('https://steemit.com/nigeria/@leopantro/50-yoruba-proverbs-and-idioms')
soup2 = BeautifulSoup(res2.content,'html')

Yoruba = []
English = []
for ol in soup2.find_all('ol'):
    proverb = ol.find('li')
    Yoruba.append(proverb.text)

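I managed to extract the proverbs in my local language into a list. I also want to extract the meaning of each one, that is, every sentence that starts with the string "Meaning:", into another list, for example: ["Your status in life determines your attitude toward your peers", "Behave maturely to avoid a bad reputation", etc.]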

Simply search all paragraphs and check whether the paragraph text starts with "Meaning:".

Try this:

import requests
from bs4 import BeautifulSoup

res2 = requests.get('https://steemit.com/nigeria/@leopantro/50-yoruba-proverbs-and-idioms')
soup2 = BeautifulSoup(res2.content, 'html')

yoruba = []
english = []

# Take the first <li> of each <ol> as the proverb.
for ol in soup2.find_all('ol'):
    proverb = ol.find('li')
    yoruba.append(proverb.text)

# Keep only the paragraphs whose text starts with "Meaning:".
for paragraph in soup2.find_all('p'):
    if paragraph.text.startswith("Meaning:"):
        english.append(paragraph.text)

# Drop the "Meaning: " prefix from each collected paragraph.
english = [x.replace("Meaning: ", "") for x in english]
print(english)
This prints:

[' Necessity is mother of invention, creativity is often achieved after overcoming many difficulties.',
 ' Everyone looks the same on the outside but everyone has problems that are invisible to outsiders.',
...
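
Each printed string still starts with a space, apparently because only the literal "Meaning: " prefix is replaced and some whitespace is left behind. A minimal sketch of a variant that slices off the prefix and strips the surrounding whitespace follows. It runs against a small inline HTML fragment rather than the live page; the fragment, the `PREFIX` constant, and the helper name `extract_meanings` are assumptions made for illustration and do not reproduce the page's actual markup.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that mimics the structure the answer relies on:
# each proverb is the first <li> of an <ol>, followed by a "Meaning:" paragraph.
SAMPLE_HTML = """
<ol><li>Proverb one</li></ol>
<p>Translation: literal translation one</p>
<p>Meaning:  First meaning.</p>
<ol><li>Proverb two</li></ol>
<p>Meaning: Second meaning.</p>
<p>Unrelated paragraph</p>
"""

PREFIX = "Meaning:"


def extract_meanings(html):
    """Return the text of every <p> that starts with PREFIX, without the prefix."""
    soup = BeautifulSoup(html, "html.parser")
    meanings = []
    for paragraph in soup.find_all("p"):
        text = paragraph.get_text().strip()
        if text.startswith(PREFIX):
            # Slice off the prefix, then drop whitespace left around the sentence.
            meanings.append(text[len(PREFIX):].strip())
    return meanings


print(extract_meanings(SAMPLE_HTML))
```

Slicing by `len(PREFIX)` removes the prefix only at the start of the string, whereas `str.replace` would also remove "Meaning: " if it appeared again later in the same sentence.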

This script extracts the proverbs, translations, and meanings and builds a DataFrame from them. The list of meanings is in data["meaning"].
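
As a rough sketch of what such a script could look like (this is not the answerer's code), the example below builds a pandas DataFrame from an inline HTML fragment. The fragment, the "Translation:" prefix, the helper `paragraphs_with_prefix`, and the column names are all assumptions made for illustration; the real page may be laid out differently.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup: an <ol> holding the proverb, then "Translation:" and
# "Meaning:" paragraphs. The real page may be laid out differently.
SAMPLE_HTML = """
<ol><li>Proverb one</li></ol>
<p>Translation: Literal one</p>
<p>Meaning: Meaning one</p>
<ol><li>Proverb two</li></ol>
<p>Translation: Literal two</p>
<p>Meaning: Meaning two</p>
"""


def paragraphs_with_prefix(soup, prefix):
    """Collect the text of every <p> starting with prefix, minus the prefix."""
    return [
        p.get_text().strip()[len(prefix):].strip()
        for p in soup.find_all("p")
        if p.get_text().strip().startswith(prefix)
    ]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
proverbs = [ol.find("li").get_text().strip() for ol in soup.find_all("ol")]
translations = paragraphs_with_prefix(soup, "Translation:")
meanings = paragraphs_with_prefix(soup, "Meaning:")

# zip() pairs items by position, so the rows only line up if every proverb
# has exactly one translation and one meaning.
data = pd.DataFrame(
    list(zip(proverbs, translations, meanings)),
    columns=["proverb", "translation", "meaning"],
)
print(data)
```

With this layout, `data["meaning"].tolist()` gives the plain list of meanings the question asks for.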




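Can you be more specific about what the problem is? Also, variable and function names should follow the lowercase_with_underscores style.

After successfully extracting all of the proverbs in my local language into a list (the code above), extracting their meanings from the same website into another list is the problem I am stuck on.

You could provide a small equivalent HTML snippet instead of referring to the website content itself. See how to create one.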