Python: creating printed values from a web-scraped dictionary of list values

python web-scraping

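I would like to know how to properly take certain string terms from a list, look up their values on a website, and handle them conveniently by passing them to a function that displays the values. I haven't yet built the function at the bottom of my code, in the searchText part. I'm not sure how to do this cleanly so that each value is saved and shown in the command window. I have used '' as a template for each value.

Let me know if I need to clarify anything. Thanks, everyone.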

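import requests
import getpass

# credential test
cred = str(input('Please enter: '))
username = input('')
password = getpass.getpass()

# URLs
url = '' + cred
second_url = ''

# login payload
load = {'user': username, 'pass': password}

# scrape from the url source
print('Please wait...')
with requests.Session() as session:
    post = session.post(second_url, data=load)
    s = session.get(url)

x = ['', '', '', '']
Dict = {}
a = s.text
search = a.split(x)[1]
result = search.split('>')[2]
result = result.split("

The example below uses Python 3.7 and the BeautifulSoup4 library and should give you some hints and ideas.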

It extracts the team names and their points totals and stores the data in a dictionary.
Code:
import requests
from bs4 import BeautifulSoup

URL = 'https://www.skysports.com/premier-league-table'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')

# navigate down to the table and then the table body:
table = soup.find("table", class_='standing-table__table')
body = table.find("tbody")

data = {}

for row in body.find_all("tr"):
    # team name is grabbed from the first <a> value:
    team = row.find("a").get_text()
    # the 10th <td> element (index 9) contains the points total
    points = row.find_all("td")[9].get_text()
    data[team] = points
    #print(team)
    #print(points)

print(data)

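But the key point is that once the data is in a Python dictionary (or whatever structure you prefer), it is straightforward to work with.
The main challenge here is understanding the HTML structure of the site you want to scrape, so that you can navigate the HTML tags effectively. Inspecting the page in your browser is a good starting point.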
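
To illustrate working with the dictionary once it is filled, here is a minimal sketch of printing only selected entries through a function. The helper names `select_values` and `format_selected` are hypothetical, and the team names and points are made-up sample data standing in for whatever the scrape returns.

```python
def select_values(data, wanted):
    """Return a dict holding only the wanted keys that exist in data."""
    return {key: data[key] for key in wanted if key in data}


def format_selected(data, wanted):
    """Build one 'key: value' line per wanted key, marking missing ones."""
    lines = []
    for key in wanted:
        value = data.get(key, "not found")
        lines.append(f"{key}: {value}")
    return lines


# Made-up sample data standing in for the scraped dictionary.
sample = {"Team A": "30", "Team B": "27", "Team C": "21"}

# Print only the entries we care about; "Team X" is absent on purpose.
for line in format_selected(sample, ["Team A", "Team C", "Team X"]):
    print(line)
```

Keeping the selection separate from the printing means the same dictionary can be reused for other output, such as writing to a file.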
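If anyone knows how to do this, it would be greatly appreciated.

Hello. Yes, I use the split function because it makes it easier to list the information I need. I don't know the exact line numbers and so on, so I am building a list. I don't want to print everything, only certain values from the list.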
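
If you want to stay with plain string handling, note that `str.split` accepts a single string separator, so passing a whole list of markers to it raises a `TypeError`. One possible approach is to loop over the markers and extract the value after each one. The sketch below assumes each value sits directly after its marker and ends at the next `<`. The helper names, the markers, and the page fragment are all hypothetical, since the real template strings are not shown in the question.

```python
def extract_after(text, marker, end="<"):
    """Return the text between marker and the next `end`, or None if absent."""
    _, found, rest = text.partition(marker)
    if not found:
        return None  # marker does not occur in the text
    return rest.partition(end)[0].strip()


def extract_all(text, markers):
    """Map every marker to the value that follows it in text."""
    return {marker: extract_after(text, marker) for marker in markers}


# Hypothetical page fragment; real markers depend on the site being scraped.
page = '<span id="name">Alice</span><span id="status">Active</span>'
markers = ['<span id="name">', '<span id="status">']

# Only the values for the listed markers are printed, not the whole page.
for marker, value in extract_all(page, markers).items():
    print(marker, "->", value)
```

In the question's code, `page` would correspond to `s.text` and `markers` to the list `x`.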