Scraping HTML data with JavaScript or Python

Tags: javascript, python, web-scraping, beautifulsoup

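I have a URL, and I want to scrape data from its HTML and save it to a text file. Can you help me? I can try either JavaScript or Python, whichever is the better fit.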

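# Import libraries
import requests
import time
from bs4 import BeautifulSoup

# Set the URL to scrape from
url = 'https://www.theweathernetwork.com/ca/hourly-weather-forecast/ontario/london'
# Connect to the URL
response = requests.get(url)
# Parse the HTML and save it to a BeautifulSoup object
soup = BeautifulSoup(response.text, "html.parser")
# To download the whole data set, let's do a for loop through all a tags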

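You can get the data from the API (found by looking at DevTools -> XHR). If you are trying to get the HTML itself, however, you will need a tool such as Selenium so that the page can render first, and then grab the HTML source.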

I'm not sure this is exactly what you want, but the data is there and you can pull out whatever you need. Here is an example with the hourly data:

import requests
import pandas as pd

url = 'https://www.theweathernetwork.com/api/data/caon0383/hourly/cm/ci?ts=1012'

jsonData = requests.get(url).json()

df = pd.DataFrame(jsonData['hourly']['periods'])
Output:

print (df.to_string())
                   b cc_class                  cdate  cloud_coverage dayname_alt dewpt_unit   dn   f  fc feelsLikeNight_unit fu   hour       ic icon              ii                        it   ms   n pop_class  pp      r  rain_bar_height rain_unit_language rain_value  rr  ru      s         sd  showrainunit  showsnowunit sky_tenths  snow_bar_height snow_unit_language snow_value  sr  su   t  tc tmau tmu            tsg            tsl tu   w  wd  wg wgk   wgu  wk    wu   wx
0            default      cc9    Tuesday, November 5              90         Tue          C  Tue   0   0                   C  C   6 am    sunny    8       chart-sun             Mainly cloudy  Nov   1      pop3  30      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          9                0                NaN          -   0  cm   4   4    C   C  1572951600000  1572933600000  C  14   W  21  21  km/h  14  km/h  O-N
1            default      cc7    Tuesday, November 5              70         Tue          C  Tue   0   0                   C  C   7 am    sunny    3    chart-stormy   A mix of sun and clouds  Nov   2      pop3  30      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          7                0                NaN          -   0  cm   4   4    C   C  1572955200000  1572937200000  C  16   W  24  24  km/h  16  km/h    B
2            default      cc7    Tuesday, November 5              70         Tue          C  Tue   0   0                   C  C   8 am    sunny    3    chart-stormy   A mix of sun and clouds  Nov   3      pop3  30      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          7                0                NaN          -   0  cm   4   4    C   C  1572958800000  1572940800000  C  20   W  29  29  km/h  20  km/h    B
3            default      cc7    Tuesday, November 5              70         Tue          C  Tue   0   0                   C  C   9 am    sunny    3    chart-stormy   A mix of sun and clouds  Nov   4      pop3  30      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          7                0                NaN          -   0  cm   4   4    C   C  1572962400000  1572944400000  C  22   W  33  33  km/h  22  km/h    B
4            default      cc7    Tuesday, November 5              70         Tue          C  Tue   1   1                   C  C  10 am    sunny    3    chart-stormy   A mix of sun and clouds  Nov   5      pop2  20      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          7                0                NaN          -   0  cm   5   5    C   C  1572966000000  1572948000000  C  24   W  37  37  km/h  24  km/h    B
5            default      cc6    Tuesday, November 5              60         Tue          C  Tue   0   0                   C  C  11 am    sunny    3    chart-stormy   A mix of sun and clouds  Nov   6      pop2  20      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          6                0                NaN          -   0  cm   5   5    C   C  1572969600000  1572951600000  C  28   W  42  42  km/h  28  km/h    B
6            default      cc6    Tuesday, November 5              60         Tue          C  Tue   1   1                   C  C  12 pm    sunny    3    chart-stormy   A mix of sun and clouds  Nov   7      pop2  20      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          6                0                NaN          -   0  cm   6   6    C   C  1572973200000  1572955200000  C  29   W  44  44  km/h  29  km/h    B
7            default      cc6    Tuesday, November 5              60         Tue          C  Tue   3   3                   C  C   1 pm    sunny    3    chart-stormy   A mix of sun and clouds  Nov   8      pop2  20      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          6                0                NaN          -   0  cm   7   7    C   C  1572976800000  1572958800000  C  29   W  44  44  km/h  29  km/h    B
8            default      cc6    Tuesday, November 5              60         Tue          C  Tue   2   2                   C  C   2 pm    sunny    3    chart-stormy   A mix of sun and clouds  Nov   9      pop2  20      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          6                0                NaN          -   0  cm   6   6    C   C  1572980400000  1572962400000  C  27   W  41  41  km/h  27  km/h    B
9            default      cc7    Tuesday, November 5              70         Tue          C  Tue   2   2                   C  C   3 pm    sunny    3    chart-stormy   A mix of sun and clouds  Nov  10      pop3  30      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          7                0                NaN          -   0  cm   6   6    C   C  1572984000000  1572966000000  C  24   W  36  36  km/h  24  km/h    B
10           default      cc8    Tuesday, November 5              80         Tue          C  Tue   1   1                   C  C   4 pm    sunny    4  chart-overcast  Cloudy with sunny breaks  Nov  11      pop3  30      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          8                0                NaN          -   0  cm   5   5    C   C  1572987600000  1572969600000  C  23   W  35  35  km/h  23  km/h   B+
11           default      cc7    Tuesday, November 5              70         Tue          C  Tue   1   1                   C  C   5 pm    sunny   20  chart-overcast             Partly cloudy  Nov  12      pop3  30      -                0                NaN          -   0  mm      -  Tue Nov 5         False         False          7                0                NaN          -   0  cm   5   5    C   C  1572991200000  1572973200000  C  23   W  35  35  km/h  23  km/h   BN
12           ...
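
If pandas is not available, the same JSON can be written to a text file with the standard library alone. The following is only a sketch: it assumes the response keeps the `['hourly']['periods']` layout used above, and the function names `save_periods` and `fetch_hourly`, the chosen fields, and the output file name are hypothetical. The demonstration runs offline on two rows copied from the output above rather than calling the API.

```python
import csv
import tempfile
from pathlib import Path


def save_periods(data, path, fields=("hour", "it", "t")):
    """Write selected fields of each hourly period to a tab-separated text file."""
    periods = data["hourly"]["periods"]
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(fields)
        for period in periods:
            # A missing key becomes an empty string instead of raising KeyError.
            writer.writerow([period.get(name, "") for name in fields])
    return len(periods)


def fetch_hourly(url):
    """Download and decode the JSON document (needs the requests package)."""
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()


# Offline demonstration: two periods with values taken from the table above.
sample = {"hourly": {"periods": [
    {"hour": "6 am", "it": "Mainly cloudy", "t": "4"},
    {"hour": "7 am", "it": "A mix of sun and clouds", "t": "4"},
]}}
out_path = Path(tempfile.gettempdir()) / "hourly_sample.txt"
rows_written = save_periods(sample, out_path)
print(out_path.read_text(encoding="utf-8"))
```

With network access, `save_periods(fetch_hourly(url), "hourly.txt")` would do the same for the live data, where `url` is the API address from the answer.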

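Can you show what you have already tried?

I haven't tried anything yet. I'm happy to send you the URL and the data I want to extract.

First, research your problem. Then, if you don't find an answer, ask here along with some of the code you have already tried.

Hey @MaryMekhael, welcome to Stack Overflow! Please take a look. You should try to describe more precisely what you are trying to do.

I tried it, but it doesn't work. I also had trouble installing pandas.

What doesn't work? What error message did you get? What do you mean by "trouble installing pandas"? You need to provide details, otherwise it is hard to troubleshoot and offer a solution. If you are having trouble installing pandas, there may be a bigger underlying problem, since pandas is a widely used and well-supported package. You may not even need pandas, but your original question didn't say exactly what output you want. I just assumed you wanted a table of the data, but you could simply write the JSON to a file.

Sorry, I just replaced your URL with mine, but I got an error saying pandas is not installed. Could another language be used, such as JavaScript? What I need is the data in a text file, and in the end possibly VBS as well, but I can't find any solution.

For the website I'm working with, I ended up using a different approach and scripting language; I'm now using VBS. The only part I'm stuck on is Mtime1 = objIE.document.getElementsByClassName("wxperiod temp")(0).innerHTML. Apparently two class names are not allowed, and I'd like to know how to work around this, or whether it is my syntax. HTML: 2. I want to get the "2". That is the problem I'm facing now.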
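
For comparison, here is a Python sketch of picking out the text of an element that carries both class names, using only the standard library's `html.parser`. The thread does not show the real markup, so the sample HTML is a guess, and the class `ClassTextParser` is a hypothetical helper. It assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextParser(HTMLParser):
    """Collect the text of every element that has all of the wanted classes."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.depth = 0      # greater than 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested tag inside a match
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.wanted <= classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


# Hypothetical markup: only the first span has both classes.
html = (
    '<div class="wxperiod">'
    '<span class="wxperiod temp">2</span>'
    '<span class="temp">9</span>'
    '</div>'
)
parser = ClassTextParser(["wxperiod", "temp"])
parser.feed(html)
matches = parser.results
print(matches)
```

If BeautifulSoup is installed, the CSS selector `soup.select_one(".wxperiod.temp")` expresses the same "both classes" condition more compactly.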