Warning: file_get_contents(/data/phpspider/zhask/data//catemap/9/loops/2.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Python 如何创建一个循环,使用BeautifulSoup从源url中刮取多个页面?_Python_Loops_Web Scraping_Beautifulsoup - Fatal编程技术网

Python 如何创建一个循环,使用BeautifulSoup从源url中刮取多个页面?

Python 如何创建一个循环,使用BeautifulSoup从源url中刮取多个页面?,python,loops,web-scraping,beautifulsoup,Python,Loops,Web Scraping,Beautifulsoup,当前脚本只允许我抓取一个页面,但我想从源url抓取所有5个页面。如何循环/迭代剩余的4页? 输出: Page - 1 AECC Aviation Power Co Ltd SHG:600893 53.3 Airbus SE PAR:AIR 30.3 Aselsan Elektronik Sanayi ve Ticaret AS IST:ASELS 31.6 AVIC Aircraft Co., Ltd. SHE:000768 54.4 AVIC Shenyang Aircraft Co. Ltd

当前脚本只允许我抓取一个页面,但我想从源url抓取所有5个页面。如何循环/迭代剩余的4页?

输出:

Page - 1
AECC Aviation Power Co Ltd SHG:600893 53.3
Airbus SE PAR:AIR 30.3
Aselsan Elektronik Sanayi ve Ticaret AS IST:ASELS 31.6
AVIC Aircraft Co., Ltd. SHE:000768 54.4
AVIC Shenyang Aircraft Co. Ltd. SHG:600760 51.3
AviChina Industry & Technology Company Limited HKG:2357 45.2
BAE Systems PLC LON:BA 34.3
Bombardier Inc. TSE:BBD.B 30
BWX Technologies, Inc. NYS:BWXT 42.3
CAE Inc. TSE:CAE 32.4
------------------------------------------------------------------------------------------
Page - 2
China Avionics Systems Co.,Ltd. SHG:600372 54.8
Cobham PLC LON:COB 34.7
Curtiss-Wright Corp NYS:CW 39
Dassault Aviation S.A. PAR:AM 31.8
Embraer S.A. BSP:EMBR3 36.3
FACC AG WBO:FACC 37.9
General Dynamics Corp NYS:GD 37.5
Heico Corp NYS:HEI 39.3
Hexcel Corp NYS:HXL 31.6
Huntington Ingalls Industries, Inc. NYS:HII 41.3
------------------------------------------------------------------------------------------
Page - 3
Kongsberg Gruppen ASA OSL:KOG 29
Korea Aerospace Industries, Ltd. KRX:047810 49.9
L3Harris Technologies, Inc. NYS:LHX 38.8
Leonardo S.p.a. MIL:LDO 28.7
Lockheed Martin Corp NYS:LMT 30.6
Macquarie Infrastructure Corp NYS:MIC 44.7
Meggitt PLC LON:MGGT 32.7
MTU Aero Engines AG ETR:MTX 23.8
Northrop Grumman Corp. NYS:NOC 31.1
QinetiQ Group PLC LON:QQ 23
------------------------------------------------------------------------------------------
Page - 4
Raytheon Co NYS:RTN 32.9
Rheinmetall AG ETR:RHM 35.4
Rolls-Royce Holdings PLC LON:RR 28.6
Saab AB OME:SAAB.B 31.5
Safran SA PAR:SAF 30.7
Senior PLC LON:SNR 31.9
Signature Aviation Plc LON:SIG 35.4
Singapore Technologies Engineering Ltd. SES:S63 29.2
Spirit AeroSystems Holdings Inc NYS:SPR 36.8
Teledyne Technologies, Inc. NYS:TDY 37.5
------------------------------------------------------------------------------------------
Page - 5
Textron Inc. NYS:TXT 37.8
Thales SA PAR:HO 28.6
The Boeing Company NYS:BA 39
TransDigm Group Inc NYS:TDG 40.9
Ultra Electronics Holdings PLC LON:ULE 37.4
United Technologies Corp NYS:UTX 29.3
------------------------------------------------------------------------------------------
公司名称公司交易所公司风险

AECC航空动力有限公司SHG:600893 53.3

空客SE标准杆:空中30.3杆

Aselsan Elektronik Sanayi ve Ticaret AS IST:ASELS 31.6

中航飞机股份有限公司SHE:000768 54.4

中航沈阳飞机有限公司SHG:60076051.3

中航工业科技有限公司香港:2357 45.2

BAE系统有限公司LON:BA 34.3

庞巴迪公司TSE:BBD.B 30

BWX技术公司纽约证券交易所:BWXT 42.3


CAE Inc.TSE:CAE 32.4

为循环放置一个
,并使用循环不变量来构造url和文件名

#Import Libraries
from bs4 import BeautifulSoup
import requests
import csv

pages = 5
for i in range(1, pages+1):
    print(f"Page - {i}")
    source = requests.get(f'https://www.sustainalytics.com/esg-ratings/?industry=Aerospace%20&%20Defense&currentpage={i}').text
    soup = BeautifulSoup(source, 'lxml')

    #Start CSV
    csv_file = open(f'aerospacedata_{i}.csv', 'w')
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['company_name', 'company_exchange', 'company_risk'])

    #Scrape Data from Web and write to csv
    for company_info in soup.find_all(class_='company-row d-flex'):
        company_name = company_info.a.text
        company_exchange = company_info.find("small").text
        company_risk = company_info.find("div", class_="col-2").text
        print(company_name, company_exchange,company_risk)
        csv_writer.writerow([company_name, company_exchange, company_risk])
    csv_file.close()

    print("---" * 30)

输出:

Page - 1
AECC Aviation Power Co Ltd SHG:600893 53.3
Airbus SE PAR:AIR 30.3
Aselsan Elektronik Sanayi ve Ticaret AS IST:ASELS 31.6
AVIC Aircraft Co., Ltd. SHE:000768 54.4
AVIC Shenyang Aircraft Co. Ltd. SHG:600760 51.3
AviChina Industry & Technology Company Limited HKG:2357 45.2
BAE Systems PLC LON:BA 34.3
Bombardier Inc. TSE:BBD.B 30
BWX Technologies, Inc. NYS:BWXT 42.3
CAE Inc. TSE:CAE 32.4
------------------------------------------------------------------------------------------
Page - 2
China Avionics Systems Co.,Ltd. SHG:600372 54.8
Cobham PLC LON:COB 34.7
Curtiss-Wright Corp NYS:CW 39
Dassault Aviation S.A. PAR:AM 31.8
Embraer S.A. BSP:EMBR3 36.3
FACC AG WBO:FACC 37.9
General Dynamics Corp NYS:GD 37.5
Heico Corp NYS:HEI 39.3
Hexcel Corp NYS:HXL 31.6
Huntington Ingalls Industries, Inc. NYS:HII 41.3
------------------------------------------------------------------------------------------
Page - 3
Kongsberg Gruppen ASA OSL:KOG 29
Korea Aerospace Industries, Ltd. KRX:047810 49.9
L3Harris Technologies, Inc. NYS:LHX 38.8
Leonardo S.p.a. MIL:LDO 28.7
Lockheed Martin Corp NYS:LMT 30.6
Macquarie Infrastructure Corp NYS:MIC 44.7
Meggitt PLC LON:MGGT 32.7
MTU Aero Engines AG ETR:MTX 23.8
Northrop Grumman Corp. NYS:NOC 31.1
QinetiQ Group PLC LON:QQ 23
------------------------------------------------------------------------------------------
Page - 4
Raytheon Co NYS:RTN 32.9
Rheinmetall AG ETR:RHM 35.4
Rolls-Royce Holdings PLC LON:RR 28.6
Saab AB OME:SAAB.B 31.5
Safran SA PAR:SAF 30.7
Senior PLC LON:SNR 31.9
Signature Aviation Plc LON:SIG 35.4
Singapore Technologies Engineering Ltd. SES:S63 29.2
Spirit AeroSystems Holdings Inc NYS:SPR 36.8
Teledyne Technologies, Inc. NYS:TDY 37.5
------------------------------------------------------------------------------------------
Page - 5
Textron Inc. NYS:TXT 37.8
Thales SA PAR:HO 28.6
The Boeing Company NYS:BA 39
TransDigm Group Inc NYS:TDG 40.9
Ultra Electronics Holdings PLC LON:ULE 37.4
United Technologies Corp NYS:UTX 29.3
------------------------------------------------------------------------------------------

很好,非常感谢你的帮助!