How do I extract a table from the NHC website with Python?

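Here (https://www.nhc.noaa.gov/gis/), under the "Data & Products" section, there is a table. I want to extract the table and save it to a CSV file. I have written the following basic code:

from bs4 import BeautifulSoup
import requests
page = requests.get("https://www.nhc.noaa.gov/gis/")
soup = BeautifulSoup(page.content, 'html.parser')
print(soup)

I only know the basics of scraping. Please guide me from here. Thanks.

You can use pandas.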

It's hard to tell, but I think this is what you want:

import pandas as pd

url = 'https://www.nhc.noaa.gov/gis/'
df = pd.read_html(url)[0]

# create csv file
df.to_csv("mycsv.csv")
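
Note that `read_html` keeps only the text of each cell, so the download links in the table are dropped from the CSV. Below is a minimal sketch of one way to keep them, using only the standard library's `html.parser` and `csv`. The class name `TableLinkParser`, the helper `table_to_csv`, and the sample HTML are made up for illustration; the real page's markup may need extra handling (nested tables, cells without closing tags), so treat this as a starting point rather than a finished scraper.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class TableLinkParser(HTMLParser):
    """Collect table rows; each cell holds its text plus any link URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._text = None   # text fragments of the current cell
        self._links = None  # absolute URLs found in the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._text, self._links = [], []
        elif tag == "a" and self._text is not None:
            href = dict(attrs).get("href")
            if href:
                # urljoin resolves relative hrefs against the page URL
                self._links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._text is not None:
            text = " ".join("".join(self._text).split())
            # Cell value: visible text followed by the URLs it linked to
            self._row.append(" ".join([text] + self._links).strip())
            self._text = self._links = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = self._text = self._links = None


def table_to_csv(html, base_url):
    """Return the table rows found in html as CSV text."""
    parser = TableLinkParser(base_url)
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample markup; the real page is laid out differently
sample = """
<table>
  <tr><th>Product</th><th>Files</th></tr>
  <tr><td>Forecast cone</td>
      <td><a href="/gis/examples/al112017_5day_020.zip">shp</a>
          <a href="forecast/archive/al092020_5day_latest.zip">latest</a></td></tr>
</table>
"""
print(table_to_csv(sample, "https://www.nhc.noaa.gov/gis/"))
```

To run it on the live page, pass `requests.get(url).text` as `html`. If the installed pandas is 1.5 or newer, `pd.read_html(url, extract_links="body")` should be a shorter route, as far as I know: each body cell then comes back as a `(text, href)` tuple.
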
To print the links to the files:

from bs4 import BeautifulSoup
import requests

r = requests.get('https://www.nhc.noaa.gov/gis/')

soup = BeautifulSoup(r.content, 'html.parser')

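# Print every link whose last path segment looks like a file name,
# skipping web pages, absolute (http) links and mail addresses
for a in soup.find_all('a'):
    href = a.get('href')
    if href:
        if '.' in href.split('/')[-1]\
                and 'html' not in href\
                and '.php' not in href\
                and 'http' not in href\
                and 'mailto' not in href:
            # Plain concatenation: hrefs without a leading '/' lose the slash
            print('https://www.nhc.noaa.gov' + href)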

...and so on...
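
Some of the URLs this prints come out as `https://www.nhc.noaa.govforecast/...`. Presumably those `href` values are relative to the page, and plain string concatenation does not resolve them. Here is a small sketch of resolving them with `urllib.parse.urljoin` instead; the helper name `resolve` is hypothetical, and the two sample hrefs are inferred from the printed output.

```python
from urllib.parse import urljoin

PAGE_URL = "https://www.nhc.noaa.gov/gis/"


def resolve(hrefs, page_url=PAGE_URL):
    """Turn hrefs as they appear in the page into absolute URLs."""
    return [urljoin(page_url, href) for href in hrefs]


hrefs = [
    "/gis/examples/al112017_5day_020.zip",        # root-relative
    "forecast/archive/al092020_5day_latest.zip",  # relative to the page
]
for url in resolve(hrefs):
    print(url)
```

In the loop above, this would mean replacing `'https://www.nhc.noaa.gov' + a.get('href')` with `urljoin(PAGE_URL, a.get('href'))`.
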

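Do you want to extract the table itself, or the data that the table links to?
I want to extract the data inside the table. Thanks for the pointer!
OK, so you want to download the zip and the other data types and build a table from them?
Yes, I want to download the zip and the other files in the table.
Thanks for the answer, @dimay. My goal is to extract the data in the table, such as the zip, kmz and shp files and the other links. Please guide me on this.
If you want to save the table, I would like the output in tabular form, @dimay. However, what the table is missing is the links; it only has the text. Is there a way to get the links in the table to work?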
https://www.nhc.noaa.gov/gis/examples/al112017_5day_020.zip
https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_CONE.kmz
https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_TRACK.kmz
https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_WW.kmz
https://www.nhc.noaa.govforecast/archive/al092020_5day_latest.zip
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_CONE_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_TRACK_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_WW_latest.kmz
https://www.nhc.noaa.govforecast/archive/al102020_5day_latest.zip
https://www.nhc.noaa.gov/storm_graphics/api/AL102020_CONE_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL102020_TRACK_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL102020_WW_latest.kmz
https://www.nhc.noaa.gov/gis/examples/al112017_fcst_020.zip
https://www.nhc.noaa.gov/gis/examples/AL112017_initialradii_020adv.kmz
https://www.nhc.noaa.gov/gis/examples/AL112017_forecastradii_020adv.kmz
https://www.nhc.noaa.govforecast/archive/al092020_fcst_latest.zip
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_initialradii_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_forecastradii_latest.kmz
https://www.nhc.noaa.govforecast/archive/al102020_fcst_latest.zip
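
For the follow-up about downloading the zip and kmz files themselves, here is one possible sketch using only the standard library. The function name `download`, the `downloads` folder, and the choice of `urllib.request` (rather than `requests`, which would work just as well) are my assumptions, not something from the answers above.

```python
import shutil
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def download(url, dest_dir="downloads"):
    """Save the file at url into dest_dir and return the local path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last segment of the URL path
    target = dest / Path(urlparse(url).path).name
    with urlopen(url, timeout=30) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)  # stream to disk in chunks
    return target


# Example call (needs network access):
# download("https://www.nhc.noaa.gov/gis/examples/al112017_5day_020.zip")
```

Calling `download` once per URL from the list above would fetch each file. Adding a short pause between requests is a reasonable courtesy to the server.
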