Python: Scraping a specific link from a web page with BeautifulSoup

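I'm new to scraping and am trying to scrape real estate agent data with Beautiful Soup from the following page: "https://www.realtor.com/realestateagents/New-Orleans_LA/pg-1"

I'm currently using selectors to return the name and phone number of each real estate agent on the page and storing them in a dictionary. I'd also like to return an href value pointing to each agent's personal page and store it in the dictionary.

It looks like there are multiple "a" tags under the jsx-1448471805 class, and I only need to return one href value per realtor.

The selector I'm currently looking at is: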

link_selectors = "#agent_list_wrapper > div.jsx-372421607.cardWrapper > ul > div:nth-child(1) > div > div > div.jsx-1448471805.agent-list-card-img-wrapper.col-lg-2.col-sm-3.col-xxs-4 > a"
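But I'm not having any luck with it.

I'd like to know how to find the right selector to extract just one href value per realtor, and how to add it to my existing dictionary, realtors_data.

Here is my current code: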

from bs4 import BeautifulSoup
import requests
import numpy as np
import pandas as pd

realtors_data = {}
# np.arange(1, 2, 1) yields only page 1; raise the stop value to scrape more pages
pages = np.arange(1, 2, 1)
print("PAGES: ", pages)
names_selector = "ul > div > div > div > div > div > a > div"
phone_selectors = "ul > div > div > div > div > div > div.jsx-1448471805.agent-phone.hidden-xs.hidden-xxs"
for page in pages:
    # Keep the response in its own variable instead of overwriting the loop counter
    response = requests.get("https://www.realtor.com/realestateagents/New-Orleans_LA/pg-" + str(page))
    soup = BeautifulSoup(response.text, 'html.parser')
    names = soup.select(names_selector)
    phones = soup.select(phone_selectors)

    # zip pairs names and phones by position, so a card without a phone
    # number would shift every later pairing
    realtors = zip(names, phones)
    for name, phone in realtors:
        realtors_data[name.get_text()] = phone.get_text()


# Printing data
print(realtors_data)

Thanks, everyone!

Looking at the HTML, it seems much simpler to navigate by class name than by a long positional selector.

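from bs4 import BeautifulSoup
import requests
url = "https://www.realtor.com/realestateagents/New-Orleans_LA/pg-1"
req = requests.get(url)
soup = BeautifulSoup(req.content, 'html.parser')
names = []
# Each agent sits in its own "agent-list-card" div, so look up every field per card
for m in soup.find_all("div", class_="agent-list-card"):
    names.append({"name":m.find("div", class_="agent-name").text,
                  "phone":m.find("div", class_="agent-phone").text,
                  # The name div is wrapped in the <a> tag that carries the profile href
                  "link":m.find("div", class_="agent-name").parent["href"]
                 })

# Display the result (notebook/REPL style; use print(names) in a script)
names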
Output:
[{'name': 'Cathy Nunez',
  'phone': '(504) 258-5410',
  'link': '/realestateagents/cathy-nunez___3736136_103289755'},
 {'name': 'Olivia Ford',
  'phone': '(504) 343-1837',
  'link': '/realestateagents/olivia-ford_new-orleans_la_1996916_140289755'},
 {'name': 'Michelle Pennino',
  'phone': '(985) 502-1787',
  'link': '/realestateagents/michelle-pennino_mandeville_la_589632_090714455'},
 {'name': 'Lana Hunt',
  'phone': '(225) 933-6459',
  'link': '/realestateagents/lana-hunt_new-orleans_la_2053719_682189755'},
 {'name': 'Nicole Schlaudecker',
  'phone': '(504) 455-0100',
  'link': '/realestateagents/nicole-schlaudecker_metairie_la_1793628_718289755'},
 {'name': 'Jason Minardi',
  'phone': '(985) 645-1275',
  'link': '/realestateagents/jason-minardi_slidell_la_1817940_385614455'},
 {'name': 'John P. Dixon III',
  'phone': '(504) 657-0820',
  'link': '/realestateagents/john-p.-dixon-iii___3088323_713979755'},
 {'name': 'LIZ ASHE',
  'phone': '(504) 401-4285',
  'link': '/realestateagents/liz-ashe_metairie_la_34409_054499755'},
 {'name': "Steven & Heidi Blount/Heidi's Homes, LLC",
  'phone': '(985) 373-6233',
  'link': "/realestateagents/steven-&-heidi-blount-heidi's-homes,-llc_mandeville_la_1369154_537614455"},
 {'name': 'Lisa Julien',
  'phone': '(504) 247-7306',
  'link': '/realestateagents/lisa-julien_new-orleans_la_2203901_038089755'},
 {'name': 'Bonnie Buras Team',
  'phone': '(504) 392-0022',
  'link': '/realestateagents/bonnie-buras-team_belle-chasse_la_18326_371699755'},
 {'name': 'Emily B. Hoskin',
  'phone': '(504) 392-0022',
  'link': '/realestateagents/emily-b.-hoskin_belle-chasse_la_1151586_725289755'},
 {'name': 'Emily Haynie',
  'phone': '(504) 430-6004',
  'link': '/realestateagents/emily-haynie___1055620_198489755'},
 {'name': 'Patrice Milton Poree',
  'phone': '(504) 372-1100',
  'link': '/realestateagents/patrice-milton-poree_new-orleans_la_786531_025589755'},
 {'name': 'Harry VarnadoreTeam',
  'phone': '(504) 450-6916',
  'link': '/realestateagents/harry-varnadore_new-orleans_la_992038_608489755'},
 {'name': 'Leslie Heindel',
  'phone': '(504) 975-4252',
  'link': '/realestateagents/leslie-heindel_new-orleans_la_2152401_967189755'},
 {'name': 'Heather Shields',
  'phone': '(504) 450-9672',
  'link': '/realestateagents/heather-shields_new-orleans_la_3033967_680089755'},
 {'name': 'Brittany Picolo-Ramos',
  'phone': '(504) 300-5179',
  'link': '/realestateagents/brittany-picolo-ramos_metairie_la_1949330_532289755'},
 {'name': 'Brenda Kiefer',
  'phone': '(504) 441-8171',
  'link': '/realestateagents/brenda-kiefer_covington_la_1985750_774389755'},
 {'name': 'Brenda Newfield',
  'phone': '(504) 228-6500',
  'link': '/realestateagents/brenda-newfield_st.-rose_la_1886770_176289755'}]
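
The links in the output above are relative, and the question asked for the result to live in a dictionary keyed by name. The following is a minimal sketch of one way to do both, not a tested scraper for the live site. It assumes the class names used above (agent-list-card, agent-name, agent-phone) and runs against a small made-up HTML snippet; SAMPLE_HTML, BASE_URL and parse_realtors are hypothetical names introduced only for this illustration. It also skips cards without a name and tolerates a missing phone or link instead of raising an error.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumed site root, used to turn relative hrefs into absolute URLs
BASE_URL = "https://www.realtor.com"

# Made-up markup that only mimics the class names used in the answer above
SAMPLE_HTML = """
<ul>
  <div class="agent-list-card">
    <div class="agent-list-card-img-wrapper"><a href="/realestateagents/jane-doe_1"></a></div>
    <a href="/realestateagents/jane-doe_1"><div class="agent-name">Jane Doe</div></a>
    <div class="agent-phone">(504) 555-0100</div>
  </div>
  <div class="agent-list-card">
    <a href="/realestateagents/john-roe_2"><div class="agent-name">John Roe</div></a>
  </div>
</ul>
"""


def parse_realtors(html, base_url=BASE_URL):
    """Return {name: {"phone": ..., "link": ...}} with one link per agent card."""
    soup = BeautifulSoup(html, "html.parser")
    realtors_data = {}
    for card in soup.find_all("div", class_="agent-list-card"):
        name_tag = card.find("div", class_="agent-name")
        if name_tag is None:
            continue  # no name means nothing to key the entry on
        phone_tag = card.find("div", class_="agent-phone")
        # Take only the <a> wrapping the name, ignoring the image link
        link_tag = name_tag.find_parent("a", href=True)
        realtors_data[name_tag.get_text(strip=True)] = {
            "phone": phone_tag.get_text(strip=True) if phone_tag else None,
            "link": urljoin(base_url, link_tag["href"]) if link_tag else None,
        }
    return realtors_data


if __name__ == "__main__":
    print(parse_realtors(SAMPLE_HTML))
```

To use this with a real page, pass the response body from requests.get(url) to parse_realtors in place of SAMPLE_HTML. Note that keying by name means two agents with the same name would overwrite each other, so a list of dictionaries, as in the answer above, may be the safer structure.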