How to scrape a phone number with Python when it is only shown after a click
I want to scrape a phone number, but the number is only displayed after a click. Is it possible to scrape the phone number directly with Python? My code does scrape the phone number, but it comes back masked with stars (***). Here is the link I want to scrape the phone number from: Please guide me! Here is my code:
import requests
from bs4 import BeautifulSoup

def get_page(url):
    response = requests.get(url)
    if not response.ok:
        print('server responded:', response.status_code)
    else:
        soup = BeautifulSoup(response.text, 'lxml')
        return soup

def get_detail_data(soup):
    try:
        title = (soup.find('h1', class_="sc-AykKI",id=False).text)
    except:
        title = 'Empty Title'
    print(title)
    try:
        contact_person = (soup.findAll('span', class_="Contact__Item-sc-1giw2l4-2 kBpGee",id=False)[0].text)
    except:
        contact_person = 'Empty Person'
    print(contact_person)
    try:
        location = (soup.findAll('span', class_="Contact__Item-sc-1giw2l4-2 kBpGee",id=False)[1].text)
    except:
        location = 'Empty location'
    print(location)
    try:
        cell = (soup.findAll('span', class_="Contact__Item-sc-1giw2l4-2 kBpGee",id=False)[2].text)
    except:
        cell = 'Empty Cell No'
    print(cell)
    try:
        phone = (soup.findAll('span', class_="Contact__Item-sc-1giw2l4-2 kBpGee",id=False)[3].text)
    except:
        phone = 'Empty Phone No'
    print(phone)
    try:
        Verify_ABN = (soup.find('p', class_="sc-AykKI").text)
    except:
        Verify_ABN = 'Empty Verify_ABN'
    print(Verify_ABN)
    try:
        ABN = (soup.find('div', class_="box__Box-sc-1u3aqjl-0").find('a'))
    except:
        ABN = 'Empty ABN'
    print(ABN)

def main():
    # get data of detail page
    url = "https://hipages.com.au/connect/abcelectricservicespl/service/126298"
    # get_page(url)
    get_detail_data(get_page(url))

if __name__ == '__main__':
    main()
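The phone number is already present in the page source. The source contains a script that starts with window.__INITIAL_STATE__, which holds an object with data for multiple providers. You can therefore pull the phone numbers of all providers from it, or simply load that object with json and look up the phone number for a given store, using the store as the key.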
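To make the "load this object into json" idea concrete, here is a minimal sketch that needs no network access. It assumes the state is assigned as a double-quoted string containing escaped JSON, which is what the backslash-quote patterns used in this answer suggest. The key names `sites` and `126298` and the helper `extract_initial_state` are hypothetical; the real object on the page may be nested differently.

```python
import json
import re

# Hypothetical page fragment: the real key names and nesting on the site are
# not shown in the post, so this layout is an assumption for illustration.
SAMPLE_HTML = r'''
<script>window.__INITIAL_STATE__ = "{\"sites\":{\"126298\":{\"phone\":\"1800 801 828\",\"mobile\":\"0408 600 950\"}}}";</script>
'''


def extract_initial_state(html):
    """Return the embedded state as a dict, or None if it is not found."""
    # Capture the whole double-quoted string literal after the assignment,
    # allowing backslash-escaped characters inside it.
    match = re.search(r'window\.__INITIAL_STATE__\s*=\s*("(?:[^"\\]|\\.)*")', html)
    if match is None:
        return None
    # The first loads() turns the quoted literal into a plain string,
    # the second loads() parses that string as a JSON object.
    return json.loads(json.loads(match.group(1)))


state = extract_initial_state(SAMPLE_HTML)
site = state["sites"]["126298"]  # look up one store by its (assumed) key
print(site["phone"], site["mobile"])
```

Once the state is a plain dict, every provider can be reached by iterating over it, which tends to be less fragile than one regular expression per field.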
import requests
from bs4 import BeautifulSoup
import re

def Main():
    r = requests.get(
        "https://hipages.com.au/connect/abcelectricservicespl/service/126298")
    soup = BeautifulSoup(r.text, 'html.parser')
    name = soup.find("h1", {'class': 'sc-AykKI'}).text
    print(name)
    person = soup.find(
        "span", {'class': 'Contact__Item-sc-1giw2l4-2 kBpGee'}).text.strip()
    print(person)
    addr = soup.findAll(
        "span", {'class': 'Contact__Item-sc-1giw2l4-2 kBpGee'})[1].text
    print(addr)
    # each \\\\" matches a literal backslash followed by a double quote
    print(re.search('phone\\\\":\\\\"(.*?)\\\\"', r.text).group(1))
    print(re.search('mobile\\\\":\\\\"(.*?)\\\\"', r.text).group(1))
    print(re.search('abn\\\\":\\\\"(.*?)\\\\"', r.text).group(1))
    print(re.search('website\\\\":\\\\"(.*?)\\\\"', r.text).group(1))

Main()
Output:
ABC Electric Services P/L
Mal
222 Henry Lawson DRV, Georges Hall NSW 2198
1800 801 828
0408 600 950
37137808989
www.abcelectricservices.com.au
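The patterns in the code above are heavy on backslashes because each quote in the embedded data is preceded by a literal backslash. As a small illustration, the sketch below shows that the same pattern can be written as a raw string with half as many backslashes. The sample `text` is an assumption about the shape of the data; the values are taken from the output above.

```python
import re

# Sample text in the shape the patterns expect: every quote in the embedded
# data is preceded by a literal backslash (assumed layout, values from the output).
text = r'{\"phone\":\"1800 801 828\",\"mobile\":\"0408 600 950\"}'

# Normal string: four backslashes in source become two in the pattern,
# which the regex engine reads as one literal backslash.
escaped = 'phone\\\\":\\\\"(.*?)\\\\"'
# Raw string: two backslashes in source reach the regex engine unchanged.
raw = r'phone\\":\\"(.*?)\\"'

same_pattern = (escaped == raw)  # both spellings produce the same pattern text
phone = re.search(raw, text).group(1)
print(same_pattern, phone)
```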
Or, if you want to parse the full script:
import requests
from bs4 import BeautifulSoup
import pyjsparser
import json
import re

def Main():
    r = requests.get(
        "https://hipages.com.au/connect/abcelectricservicespl/service/126298")
    soup = BeautifulSoup(r.text, 'html.parser')
    # the sixth <script> tag holds the window.__INITIAL_STATE__ assignment
    phone = soup.findAll("script")[5]
    tree = pyjsparser.parse(phone.text)
    print(json.loads(tree["body"][0]["expression"]["right"]["value"]))

Main()
Another version:
import requests
from bs4 import BeautifulSoup
import re
import json

def Main():
    r = requests.get(
        "https://hipages.com.au/connect/abcelectricservicespl/service/126298")
    soup = BeautifulSoup(r.text, 'html.parser')
    data = soup.findAll("script")[5].text
    source = re.search(r'__INITIAL_STATE__.*=\s*"({.*})"', data).group(1)
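    kuku = json.loads(re.sub(…

You can simplify the regular expressions by using the "raw" string specifier "r", e.g. r"string". By the way, this cuts down on the backslash-counting problems.

@Todd I would be glad to see how you would do it, because I still don't get the point.

@MuhammadAkram You're welcome, bro. If my answer meets your needs, please tick the check mark next to the answer :D

Hi bro! I need more help. I followed your code for the other things I want to scrape, such as the website link and the ABN number, but it shows an error. Please check the code and fix it: # ABN and Website ABN = soup.findall("div", {'class': 'box__Box-sc-1u3aqjl-0 kxddET'}).find('a') print(ABN) website = soup.findall("div", {'class': 'sc aykcc col sc-15n4ng3-0 hjfjkgg'}).find('a') print(website)

@MuhammadAkram I have tested the code on my side and it works fine. Are you running it on a different URL?

soup.findAll("script")[5].text, then re.findall(r'__INITIAL_STATE__=\s*({.*})', variable)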
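Regarding the ABN/website snippet in the comments: one likely cause of the error is that `findAll` (spelled `findall` there, which is not a Beautiful Soup method) returns a list-like ResultSet, and a ResultSet has no `find` method. A minimal sketch of two ways around that, using hypothetical markup; the class name is copied from the question and the link target is a placeholder:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: class name copied from the question, href is a placeholder.
html = '''
<div class="box__Box-sc-1u3aqjl-0 kxddET">
  <a href="https://example.com/">Website</a>
</div>
'''
soup = BeautifulSoup(html, "html.parser")

# find() returns a single Tag (or None), so a second find() can be chained.
box = soup.find("div", {"class": "box__Box-sc-1u3aqjl-0"})
link = box.find("a") if box is not None else None
href = link.get("href") if link is not None else None
print(href)

# find_all() returns a ResultSet, so loop over it (or index it) before
# searching inside each element.
hrefs = [a.get("href") for div in soup.find_all("div") for a in div.find_all("a")]
print(hrefs)
```

Checking for None before chaining avoids an AttributeError when the class names on the live page change, which is common with generated class names like these.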