Python 3.x: Using BeautifulSoup, how do I target specific items in a paragraph?


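I'm having some trouble getting the right information out of this page:

Ideally, I'd like to get the name of each school and the value of its gift.

For example: California Institute of Technology: from Gordon and Betty Moore and the Gordon and Betty Moore Foundation, $600-million, consisting of $300-million over 5 years and $300-million over 10 years; cash and stock; 2001*

The ideal output would be: California Institute of Technology, $600-million

(separated by a comma)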

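You can use BeautifulSoup together with regular expressions.

BeautifulSoup is a Python library for parsing HTML and XML data.

Regular expressions let you search a string for specific patterns.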
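As a small illustration of the regular-expression part on its own, the sketch below pulls the dollar figures out of the example sentence from the question. The `sample` string and the `find_amounts` helper name are assumptions made for illustration, and the pattern assumes every amount is written as `$` followed directly by digits.

```python
import re

# Sample text modelled on the example in the question (assumed wording).
sample = (
    "California Institute of Technology: from Gordon and Betty Moore and the "
    "Gordon and Betty Moore Foundation, $600-million, consisting of "
    "$300-million over 5 years and $300-million over 10 years; "
    "cash and stock; 2001*"
)


def find_amounts(text):
    """Return every run of digits that directly follows a '$' sign."""
    return re.findall(r'\$([0-9]+)', text)


amounts = find_amounts(sample)
print(amounts)  # ['600', '300', '300']
```

Note that the matches come back as strings and that the "-million" unit is not captured, so it has to be added back if the output should include it.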

from bs4 import BeautifulSoup
import re
import urllib.request

link = 'http://www.chronicle.com/article/Major-Private-Gifts-to-Higher/128264'
req = urllib.request.Request(link, headers={'User-Agent': 'Mozilla/5.0'})
sauce = urllib.request.urlopen(req).read()
soup = BeautifulSoup(sauce, 'html.parser')

university = {}

for x in soup.find_all('p'):
    name_tag = x.find('strong')
    if name_tag is not None:
        name = name_tag.text.strip()
        t = x.text
        # Capture the digits that directly follow each '$' sign.
        m = re.findall(r'\$([0-9]+)', t)
        if m:
            # A paragraph may list more than one gift value.
            # For example, Caltech yields three values: ['600', '300', '300'].
            # This can be handled in two ways:
            # either print the first value using m[0],
            # or take the largest with max(m, key=int), since the matches are strings.
            print(name + ', ' + m[0])
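If the largest value is preferred over the first one, the matches need to be compared as numbers rather than as strings. The following is a minimal sketch of that idea, not part of the original answer: the helper names `largest_amount` and `format_row` are hypothetical, and it assumes the amounts are plain digits after a `$` sign, as in the example paragraph.

```python
import re


def largest_amount(text):
    """Return the largest '$<digits>' figure in text as an int, or None."""
    values = [int(v) for v in re.findall(r'\$([0-9]+)', text)]
    return max(values) if values else None


def format_row(name, text):
    """Build a 'name, amount' line, or return None when no amount is found."""
    amount = largest_amount(text)
    if amount is None:
        return None
    # Drop surrounding whitespace and a trailing colon from the name, if any.
    return f"{name.strip().rstrip(':')}, {amount}"


# Strings compare character by character, so max() on raw matches can mislead.
print(max(['90', '600']))               # 90
print(largest_amount('$90 and $600'))   # 600
print(format_row('California Institute of Technology:',
                 '$600-million, including $300-million over 5 years'))
```

The last call prints `California Institute of Technology, 600`. The unit is still missing here, so appending something like "-million" is left to the caller once the page's wording has been checked.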

Thank you very much! This was easy to understand and follow through.