Python: is it possible to scrape an attribute inside a span?
python, web-scraping, beautifulsoup
So I want to scrape some phone numbers from a website. The only problem is that they are hidden behind a click. I can't click every one of those buttons to make them scrapable, so I'd like to ask whether there is any way to get them from the "data-phone" attribute inside the span tag. I tried using data-phone, but that didn't work.
from bs4 import BeautifulSoup
import requests
import csv

source = requests.get('https://software-overzicht.nl/amersfoort?page=1').text
soup = BeautifulSoup(source, 'lxml')

csv_file = open('cms_scrape.csv', 'w')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['title', 'location'])

# The attempt in question: this loop prints nothing.
for number in soup.find_all('span', data_='data-phone'):
    print(number)

for info in soup.find_all('div', class_='company-info-top'):
    title = info.a.text
    location = info.p.text
    csv_writer.writerow([title, location])

csv_file.close()
Change
for number in soup.find_all('span', data_='data-phone'):
    print(number)
to
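The replacement code is missing at this point. Judging by the combined script further down, it probably looked up each span by its `phone` class and read the `data-phone` attribute like a dictionary key. A minimal sketch under that assumption, run against a made-up HTML fragment so it works offline (the `sample_html` markup and the `extract_phones` helper are illustrative and not taken from the site):

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the listing page; the real page may differ.
sample_html = """
<div class="company-info-top">
  <a href="#">Example BV</a>
  <p>Amersfoort</p>
  <span class="phone" data-phone="0334226800">Show number</span>
</div>
<div class="company-info-top">
  <a href="#">Other BV</a>
  <p>Amersfoort</p>
  <span class="phone" data-phone="033-4721544">Show number</span>
</div>
"""

def extract_phones(html):
    """Return the data-phone value of every <span class="phone">."""
    soup = BeautifulSoup(html, 'html.parser')
    # class_ filters on the CSS class; the attribute is read like a dict key.
    return [span['data-phone'] for span in soup.find_all('span', class_='phone')]

for number in extract_phones(sample_html):
    print(number)
```

The original attempt, `data_='data-phone'`, asks Beautiful Soup for spans with an attribute literally named `data_` whose value is `data-phone`, which is why it matched nothing.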
Output:
0334226800
0878739737
0334558584
0334798200
0334720311
0334677050
0334554948
0334535384
0337767840
0334560292
0626214363
0334559065
0334506506
0620423525
0334556166
0332012581
0334557485
0334946111
0334536200
0334545111
0334545430
0337851805
033-4721544
06-26662490
To incorporate this into the CSV, do the following:
from bs4 import BeautifulSoup
import requests
import csv

with open('C:/cms_scrape.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(['naambedrijf', 'adress', 'phone'])

    # Walk through the paginated listing (pages 1 to 21).
    for page in range(1, 22):
        url = 'https://software-overzicht.nl/amersfoort?page={}'.format(page)
        source = requests.get(url).text
        soup = BeautifulSoup(source, 'lxml')

        for search in soup.find_all('div', class_='company-info-top'):
            title = search.a.text.strip()
            adress = search.p.text.strip()

            # The number sits in the data-phone attribute of <span class="phone">.
            try:
                phone = search.find('span', {'class': 'phone'})['data-phone']
            except (TypeError, KeyError):
                # No phone span (find returned None) or no data-phone attribute.
                phone = 'N/A'

            print(title)
            csv_writer.writerow([title, adress, phone])
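The try/except in the script above guards against listings that have no phone span. One alternative, sketched here under the assumption that the number always lives in a `data-phone` attribute, is to match spans by the presence of that attribute and fall back explicitly. The `sample_html` fragment and the `phone_for` helper are made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second listing has no phone span at all.
sample_html = """
<div class="company-info-top">
  <a href="#">Example BV</a><p>Amersfoort</p>
  <span class="phone" data-phone="0334226800">Show number</span>
</div>
<div class="company-info-top">
  <a href="#">No Phone BV</a><p>Amersfoort</p>
</div>
"""

def phone_for(block, default='N/A'):
    """Return the block's data-phone value, or default if none is present."""
    # attrs={'data-phone': True} matches any span that has the attribute,
    # whatever its value.
    span = block.find('span', attrs={'data-phone': True})
    return span['data-phone'] if span is not None else default

soup = BeautifulSoup(sample_html, 'html.parser')
rows = [
    (block.a.text.strip(), block.p.text.strip(), phone_for(block))
    for block in soup.find_all('div', class_='company-info-top')
]
for row in rows:
    print(row)
```

Each tuple in `rows` has the same shape as the list passed to `writerow` in the script above, so it could be written to the CSV unchanged.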