How to extract data from a Met Office JSON download using Python

Tags: python, json, weather

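I am using Python 3.4.

I have started a project that downloads Met Office weather forecast data (in JSON format) and uses it as a weather compensator for my home heating system. I have succeeded in downloading the JSON data file from the Met Office, and now I would like to extract the information I need. I can do this by converting the file to a string and using `.find` and `int` to pick out the data, but that seems crude (albeit effective). Since JSON is supposedly a well-used data interchange format, there must be a better way of doing this. I have found things like `json.load` and `json.loads`, as well as `json.JSONDecoder.decode`, but I have had no success with them, and I don't really know what I am doing.

My code is: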

import urllib.request
import json

#Comment:  THIS IS THE CALL TO GET THE MET OFFICE FILE FROM THE INTERNET
#Comment:  **** = my personal met office API key, which I had better keep to myself

response = urllib.request.urlopen('http://datapoint.metoffice.gov.uk/public/data/val/wxfcs/all/json/354037?res=3hourly&key=****')

FCData    = response.read()
FCDataStr = str(FCData)

#Comment:   END OF THE CALL TO GET MET OFFICE FILE FROM THE INTERNET
#Comment:   Example of data extraction

ChPos = FCDataStr.find('"DV"')      #Find "DV"    
ChPos = FCDataStr.find('"dataDate"', ChPos, ChPos+50)      #Find "dataDate"

FileDataDate = FCDataStr[ChPos+12:ChPos+22]                #Extract the date of the file

#Comment:   And so on
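
When I use `json.loads(FCDataStr)`, I get the following error message:

"ValueError: Expecting value: line 1 column 1 (char 0)"

By removing the `b'` at the start and the `'` at the end, this error goes away (see below). Printing the JSON file in string form with `print(FCDataStr)` gives: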

b'{"SiteRep":{"Wx":{"Param":[{"name":"F","units":"C","$":"Feels Like Temperature"},{"name":"G","units":"mph","$":"Wind Gust"},{"name":"H","units":"%","$":"Screen Relative Humidity"},{"name":"T","units":"C","$":"Temperature"},{"name":"V","units":"","$":"Visibility"},{"name":"D","units":"compass","$":"Wind Direction"},{"name":"S","units":"mph","$":"Wind Speed"},{"name":"U","units":"","$":"Max UV Index"},{"name":"W","units":"","$":"Weather Type"},{"name":"Pp","units":"%","$":"Precipitation Probability"}]},"DV":{"dataDate":"2014-07-29T20:00:00Z","type":"Forecast","Location":{"i":"354037","lat":"51.7049","lon":"-2.9022","name":"USK","country":"WALES","continent":"EUROPE","elevation":"43.0","Period":[{"type":"Day","value":"2014-07-29Z","Rep":[{"D":"NNW","F":"22","G":"11","H":"51","Pp":"4","S":"9","T":"24","V":"VG","W":"7","U":"7","$":"900"},{"D":"NW","F":"19","G":"16","H":"61","Pp":"8","S":"11","T":"22","V":"EX","W":"8","U":"1","$":"1080"},{"D":"NW","F":"16","G":"20","H":"70","Pp":"1","S":"11","T":"18","V":"VG","W":"2","U":"0","$":"1260"}]},{"type":"Day","value":"2014-07-30Z","Rep":[{"D":"NW","F":"13","G":"16","H":"84","Pp":"0","S":"7","T":"14","V":"VG","W":"0","U":"0","$":"0"},{"D":"WNW","F":"12","G":"13","H":"90","Pp":"0","S":"7","T":"13","V":"VG","W":"0","U":"0","$":"180"},{"D":"WNW","F":"13","G":"11","H":"87","Pp":"0","S":"7","T":"14","V":"GO","W":"1","U":"1","$":"360"},{"D":"SW","F":"18","G":"9","H":"67","Pp":"0","S":"4","T":"19","V":"VG","W":"1","U":"2","$":"540"},{"D":"WNW","F":"21","G":"13","H":"56","Pp":"0","S":"9","T":"22","V":"VG","W":"3","U":"6","$":"720"},{"D":"W","F":"21","G":"20","H":"55","Pp":"0","S":"11","T":"23","V":"VG","W":"3","U":"6","$":"900"},{"D":"W","F":"18","G":"22","H":"57","Pp":"0","S":"11","T":"21","V":"VG","W":"1","U":"2","$":"1080"},{"D":"WSW","F":"16","G":"13","H":"80","Pp":"0","S":"7","T":"16","V":"VG","W":"0","U":"0","$":"1260"}]},{"type":"Day","value":"2014-07-31Z","Rep":[{"D":"SW","F":"14","G":"11","H":"91","Pp":"0","S":"4","T":"15","V"
:"GO","W":"0","U":"0","$":"0"},{"D":"SW","F":"14","G":"11","H":"92","Pp":"0","S":"4","T":"14","V":"GO","W":"0","U":"0","$":"180"},{"D":"SW","F":"15","G":"11","H":"89","Pp":"3","S":"7","T":"16","V":"GO","W":"3","U":"1","$":"360"},{"D":"WSW","F":"17","G":"20","H":"79","Pp":"28","S":"11","T":"18","V":"GO","W":"3","U":"2","$":"540"},{"D":"WSW","F":"18","G":"22","H":"72","Pp":"34","S":"11","T":"20","V":"GO","W":"10","U":"5","$":"720"},{"D":"WSW","F":"18","G":"22","H":"66","Pp":"13","S":"11","T":"20","V":"VG","W":"7","U":"5","$":"900"},{"D":"WSW","F":"17","G":"22","H":"69","Pp":"36","S":"11","T":"19","V":"VG","W":"10","U":"2","$":"1080"},{"D":"WSW","F":"16","G":"16","H":"84","Pp":"6","S":"9","T":"17","V":"GO","W":"2","U":"0","$":"1260"}]},{"type":"Day","value":"2014-08-01Z","Rep":[{"D":"SW","F":"16","G":"13","H":"91","Pp":"4","S":"7","T":"16","V":"GO","W":"7","U":"0","$":"0"},{"D":"SW","F":"15","G":"11","H":"93","Pp":"5","S":"7","T":"16","V":"GO","W":"7","U":"0","$":"180"},{"D":"SSW","F":"15","G":"11","H":"93","Pp":"7","S":"7","T":"16","V":"GO","W":"7","U":"1","$":"360"},{"D":"SSW","F":"17","G":"18","H":"79","Pp":"14","S":"9","T":"18","V":"GO","W":"7","U":"2","$":"540"},{"D":"SSW","F":"17","G":"22","H":"74","Pp":"43","S":"11","T":"19","V":"GO","W":"10","U":"5","$":"720"},{"D":"SW","F":"16","G":"22","H":"81","Pp":"48","S":"11","T":"18","V":"GO","W":"10","U":"5","$":"900"},{"D":"SW","F":"16","G":"18","H":"80","Pp":"55","S":"9","T":"17","V":"GO","W":"12","U":"1","$":"1080"},{"D":"SSW","F":"15","G":"16","H":"89","Pp":"38","S":"7","T":"16","V":"GO","W":"9","U":"0","$":"1260"}]},{"type":"Day","value":"2014-08-02Z","Rep":[{"D":"S","F":"14","G":"11","H":"94","Pp":"15","S":"7","T":"15","V":"GO","W":"7","U":"0","$":"0"},{"D":"SSE","F":"14","G":"11","H":"94","Pp":"16","S":"7","T":"15","V":"GO","W":"7","U":"0","$":"180"},{"D":"S","F":"14","G":"13","H":"93","Pp":"36","S":"7","T":"15","V":"GO","W":"10","U":"1","$":"360"},{"D":"S","F":"15","G":"20","H":"84","Pp":"62","S":"11","T":"17","
V":"GO","W":"14","U":"2","$":"540"},{"D":"SSW","F":"16","G":"22","H":"78","Pp":"63","S":"11","T":"18","V":"GO","W":"14","U":"5","$":"720"},{"D":"WSW","F":"16","G":"27","H":"66","Pp":"59","S":"13","T":"19","V":"VG","W":"14","U":"5","$":"900"},{"D":"WSW","F":"15","G":"25","H":"68","Pp":"39","S":"13","T":"18","V":"VG","W":"10","U":"2","$":"1080"},{"D":"SW","F":"14","G":"16","H":"80","Pp":"28","S":"9","T":"15","V":"VG","W":"0","U":"0","$":"1260"}]}]}}}}'
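
The result of using:

DecodedJSON = json.loads(FCDataStr)
print(DecodedJSON)

is very similar to the original FCDataStr.

How do I extract data from the file (such as the temperature, wind speed, etc. for each 3-hourly forecast)?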

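This is the problem:

FCDataStr = str(FCData)

When you call `str` on a `bytes` object, what you get is the string representation of the `bytes` object, with a `b` prefix, surrounding quotes, and backslash escapes for special characters.

If you want to decode the binary data into text, you have to use the `decode` method:

FCDataStr = FCData.decode('utf-8')

(I'm guessing UTF-8, because JSON is supposed to be in UTF-8 unless otherwise specified.)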
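
To see the difference concretely, here is a minimal sketch using a hard-coded `bytes` value in place of the real download (the `raw` literal and the variable names are made up for illustration). It shows that `str()` produces text starting with `b'`, which the JSON parser rejects, while `decode` produces the JSON text itself:

```python
import json

# Stand-in for response.read(); the real download is much longer.
raw = b'{"SiteRep": {"DV": {"dataDate": "2014-07-29T20:00:00Z"}}}'

as_repr = str(raw)             # the repr of the bytes object, starts with b'
as_text = raw.decode('utf-8')  # the JSON text itself, starts with {

try:
    json.loads(as_repr)
    repr_parsed = True
except ValueError:
    # The leading b' is not valid JSON, so parsing fails at character 0.
    repr_parsed = False

data = json.loads(as_text)
print(repr_parsed)
print(data['SiteRep']['DV']['dataDate'])
```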


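In more detail:

`urllib.request.urlopen` returns an `HTTPResponse`, which is a binary file-like object.

You can't pass that to `json.load` on Python 3.4, because it needs a text file-like object, one whose `read` method returns `str` rather than `bytes`. (From Python 3.6 onwards, `json.loads` also accepts `bytes`.) You could wrap your `HTTPResponse` in an `io.BufferedReader`, then wrap that in an `io.TextIOWrapper` (with `encoding='utf-8'`), and pass the result to `json.load`, but that is probably more work than you want to do.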
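
For what it is worth, the wrapping approach might look roughly like the sketch below. To keep it runnable offline, an `io.BytesIO` stands in for the `HTTPResponse`, and `load_json_from_binary` is a hypothetical helper name. Whether a real response needs the extra `io.BufferedReader` layer depends on the Python version, so treat this as an illustration of the idea rather than a drop-in solution:

```python
import io
import json

def load_json_from_binary(binary_file, encoding='utf-8'):
    """Parse JSON from a binary file-like object, decoding it on the fly."""
    # TextIOWrapper turns the stream of bytes into a stream of str.
    text_file = io.TextIOWrapper(binary_file, encoding=encoding)
    return json.load(text_file)

# io.BytesIO plays the part of the HTTPResponse in this sketch.
fake_response = io.BytesIO(b'{"SiteRep": {"DV": {"type": "Forecast"}}}')
data = load_json_from_binary(fake_response)
print(data['SiteRep']['DV']['type'])
```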

So the simplest thing is what you were already trying to do, but using `decode` instead of `str`:

data_bytes = response.read()
data_str = data_bytes.decode('utf-8')
data_dict = json.loads(data_str)
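
If you would rather not hard-code UTF-8, one option is to read the charset from the response's Content-Type header and fall back to UTF-8 when none is given. The sketch below assumes the headers object behaves like an `email.message.Message` (which is what `response.headers` is for a urllib response); `decode_body` is a hypothetical helper, and the headers here are built by hand so the example runs without a network connection:

```python
from email.message import Message

def decode_body(body, headers, default='utf-8'):
    """Decode bytes using the charset named in the headers, if there is one."""
    # get_content_charset() returns None when no charset parameter is present.
    charset = headers.get_content_charset() or default
    return body.decode(charset)

# Hand-built headers standing in for response.headers.
headers = Message()
headers['Content-Type'] = 'application/json; charset=utf-8'

text = decode_body(b'{"ok": true}', headers)
print(text)
```

With a real download this would be called as `decode_body(response.read(), response.headers)`.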


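Then, don't try to get at the data in `data_str`; that is just a string holding the JSON encoding of your data. `data_dict` is the actual data.

For example, to find the `dataDate` of the `DV` of the `SiteRep`, you just do this:

data_dict['SiteRep']['DV']['dataDate']

That will give you the string '2014-07-31T14:00:00Z'. You will probably still want to convert that to a `datetime.datetime` object (because JSON only understands a few basic types: strings, numbers, lists, and dicts). But it is still much better than picking it out of the data by `find`-ing or guessing at offsets.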
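
Converting the `dataDate` string could be done along these lines. This is a sketch that assumes the value always has the `YYYY-MM-DDTHH:MM:SSZ` shape seen in the sample data; `parse_data_date` is a hypothetical helper name:

```python
from datetime import datetime, timezone

def parse_data_date(value):
    """Convert a 'YYYY-MM-DDTHH:MM:SSZ' string into an aware UTC datetime."""
    naive = datetime.strptime(value, '%Y-%m-%dT%H:%M:%SZ')
    # The trailing Z means UTC, so attach that explicitly.
    return naive.replace(tzinfo=timezone.utc)

stamp = parse_data_date('2014-07-31T14:00:00Z')
print(stamp.isoformat())  # 2014-07-31T14:00:00+00:00
```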



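I'm guessing you found some sample code written for Python 2.x, where you can convert between byte strings and Unicode strings just by calling the appropriate constructor, without specifying an encoding. That falls back to `sys.getdefaultencoding()`, which in Python 2 is normally ASCII; that is enough for ASCII-only data like this, so even though it is wrong, it happens to work. If so, you may want to find some better sample code to learn from.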

For anyone else new to this who may wish to use the Met Office 3-hourly forecast data feed, here is the solution I am using:

import urllib.request
import json

###  THIS IS THE CALL TO GET THE MET OFFICE FILE FROM THE INTERNET
response = urllib.request.urlopen('http://datapoint.metoffice.gov.uk/public/data/val/wxfcs/all/json/**YourLocationID**?res=3hourly&key=**your_api_key**')
FCData = response.read()
FCDataStr = FCData.decode('utf-8')
###   END OF THE CALL TO GET MET OFFICE FILE FROM THE INTERNET

#Converts JSON data to a dictionary object
FCData_Dic = json.loads(FCDataStr)

#The following are examples of extracting data from the dictionary object.
#The JSON data is heavily nested.
#Each [] goes one level down, usually defined with {} in the JSON data.
dataDate = (FCData_Dic['SiteRep']['DV']['dataDate'])
print('dataDate =',dataDate)

#There are also [] in the JSON data, which are referenced with integers, 
# starting from [0]
#Here, the [0] refers to the first day's block of data defined with [].
DateDay0 = (FCData_Dic['SiteRep']['DV']['Location']['Period'][0]['value'])
print('DateDay0 =',DateDay0)

#The second [0] picks out each of the first day's forecast data, in this case the time, referenced by '$'
TimeOfFC = (FCData_Dic['SiteRep']['DV']['Location']['Period'][0]['Rep'][0]['$'])
print('TimeOfFC =',TimeOfFC)

#Ditto for the temperature.    
Temperature = int((FCData_Dic['SiteRep']['DV']['Location']['Period'][0]['Rep'][0]['T']))
print('Temperature =',Temperature)

#Ditto for the weather Type (a code number).
WeatherType = int((FCData_Dic['SiteRep']['DV']['Location']['Period'][0]['Rep'][0]['W']))
print('WeatherType =',WeatherType)

I hope this helps someone.
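
Rather than indexing one entry at a time, the same nested structure can be walked with two loops to get every 3-hourly forecast. The sketch below uses a cut-down copy of the sample data from the question so that it runs offline. It assumes that `'$'` holds minutes after midnight (which the values 0, 180, ... 1260 suggest) and that `'Rep'` is always a list; `iter_forecasts` is a hypothetical helper name:

```python
import json
from datetime import datetime, timedelta

# Trimmed copy of the sample response: only T (temperature) and S (wind speed).
SAMPLE = '''{"SiteRep": {"DV": {"dataDate": "2014-07-29T20:00:00Z", "Location": {"Period": [
  {"type": "Day", "value": "2014-07-29Z", "Rep": [
    {"S": "9", "T": "24", "$": "900"},
    {"S": "11", "T": "22", "$": "1080"}]},
  {"type": "Day", "value": "2014-07-30Z", "Rep": [
    {"S": "7", "T": "14", "$": "0"}]}
]}}}}'''

def iter_forecasts(data):
    """Yield (time, temperature in C, wind speed in mph) for each forecast step."""
    for period in data['SiteRep']['DV']['Location']['Period']:
        day = datetime.strptime(period['value'], '%Y-%m-%dZ')
        for rep in period['Rep']:
            # Assumption: '$' is the number of minutes after midnight.
            when = day + timedelta(minutes=int(rep['$']))
            yield when, int(rep['T']), int(rep['S'])

forecasts = list(iter_forecasts(json.loads(SAMPLE)))
for when, temperature, wind_speed in forecasts:
    print(when.isoformat(), temperature, wind_speed)
```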

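I have been parsing the Met Office DataPoint output.

Thanks to the answers above, I have something that works for me.

I am writing the data I am interested in to a CSV file: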

import sys
import os
import urllib.request
import json

###  THIS IS THE CALL TO GET THE MET OFFICE FILE FROM THE INTERNET
response = urllib.request.urlopen('http://datapoint.metoffice.gov.uk/public/data/val/wxobs/all/json/3351?res=hourly&key=<my key>')
FCData = response.read()
FCDataStr = FCData.decode('utf-8')
###   END OF THE CALL TO GET MET OFFICE FILE FROM THE INTERNET

#Converts JSON data to a dictionary object
FCData_Dic = json.loads(FCDataStr)

# Open output file for appending
fName = '<my filename>'  # placeholder: replace with the path to your CSV file
if not os.path.exists(fName):
    print(fName, 'does not exist')
    sys.exit()
fOut=open(fName, 'a')

# Loop through each day, will nearly always be 2 days,
# unless run at midnight. 
i = 0
j = 0
for k in range(24):
    # there will be 24 values altogether
    # find the first hour value for the first day
    DateZ = (FCData_Dic['SiteRep']['DV']['Location']['Period'][i]['value'])
    hhmm = (FCData_Dic['SiteRep']['DV']['Location']['Period'][i]['Rep'][j]['$'])
    Temperature = (FCData_Dic['SiteRep']['DV']['Location']['Period'][i]['Rep'][j]['T'])
    Humidity = (FCData_Dic['SiteRep']['DV']['Location']['Period'][i]['Rep'][j]['H'])
    DewPoint = (FCData_Dic['SiteRep']['DV']['Location']['Period'][i]['Rep'][j]['Dp'])
    recordStr = '{},{},{},{},{}\n'.format(DateZ,hhmm,Temperature,Humidity,DewPoint)
    fOut.write(recordStr)
    j = j + 1
    if (hhmm == '1380'):
        i = i + 1
        j = 0
fOut.close()
print('Records added to ',fName)
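
A variation on the loop above is to iterate over whatever periods and observations are present instead of counting to 24 and watching for '1380', and to let the `csv` module do the formatting. The sketch below writes to an in-memory buffer so that it runs on its own. The sample values are invented for illustration, `write_observations` is a hypothetical helper, and using `.get` for missing fields is my own assumption about how gaps should be handled:

```python
import csv
import io

def write_observations(data, out_file):
    """Write one CSV row per observation and return the number of rows."""
    writer = csv.writer(out_file)
    rows = 0
    for period in data['SiteRep']['DV']['Location']['Period']:
        for rep in period['Rep']:
            # .get() leaves a blank cell if a field is missing from a record.
            writer.writerow([period['value'], rep['$'],
                             rep.get('T', ''), rep.get('H', ''), rep.get('Dp', '')])
            rows += 1
    return rows

# Invented sample data in the same shape as the DataPoint response.
sample = {'SiteRep': {'DV': {'Location': {'Period': [
    {'value': '2014-07-29Z', 'Rep': [
        {'$': '1320', 'T': '17.1', 'H': '80.2', 'Dp': '13.6'},
        {'$': '1380', 'T': '16.5', 'H': '83.0', 'Dp': '13.6'}]},
    {'value': '2014-07-30Z', 'Rep': [
        {'$': '0', 'T': '16.0', 'H': '85.1'}]},
]}}}}

buffer = io.StringIO()
count = write_observations(sample, buffer)
print(count)
print(buffer.getvalue())
```

To append to a real file instead, something like `open(fName, 'a', newline='')` could be passed in place of the buffer.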
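Sorry, hit return. I tried Qwerty = json.loads(FCData) and got the error message "TypeError: the JSON object must be str, not 'bytes'". I also tried Qwerty = json.load(response), which gives the same error, TypeError: the JSON object must be str, not 'bytes', which seems odd, as I thought .load was meant for byte files. Thanks for the utf-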