Python: trouble getting data from FlightRadar24 using urllib2


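I'm trying to get data from FlightRadar24 using the script below, which includes cookie handling. When I type the URL into my browser right now, I get a long JSON document (a dictionary) that contains a list of lat/long/alt updates. But when I run the code below, I get the error message listed after it.

What do I need to do to successfully read the JSON into Python?

Note: the link may stop working in a week or two; they don't serve the data forever.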

import urllib2 
import cookielib

jar = cookielib.FileCookieJar("cookies")
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
url = "http://lhr.data.fr24.com/_external/planedata_json.1.3.php?f=72c5ef5"

response = opener.open(url)
print response.headers
print "Got page"
print "Currently have %d cookies" % len(jar)
print jar

Traceback (most recent call last):
  File "[mypath]/test v00.py", line 8, in <module>
    response = opener.open(url)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden

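I'm not sure what you need the cookies for, but the problem is that the web server is blocking the User-Agent that urllib sends in the request headers (something like 'Python-urllib/2.7').

You should add a valid browser User-Agent to the headers to get the correct data. Example: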

import urllib2
url = "http://lhr.data.fr24.com/_external/planedata_json.1.3.php?f=72c5ef5"
req = urllib2.Request(url, headers={"Connection":"keep-alive", "User-Agent":"Mozilla/5.0"})
response = urllib2.urlopen(req)
jsondata = response.read()
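
To see why the header matters, here is a minimal sketch that compares the User-Agent the library would send by default with the one set explicitly on the request. It is written for Python 3, where urllib2 became urllib.request; that port is an assumption and not part of the answer above. Nothing is sent over the network.

```python
import urllib.request

url = "http://lhr.data.fr24.com/_external/planedata_json.1.3.php?f=72c5ef5"

# The default opener identifies itself as "Python-urllib/<version>".
default_headers = dict(urllib.request.build_opener().addheaders)
default_agent = default_headers["User-agent"]

# A User-Agent set on the Request is sent instead of that default.
req = urllib.request.Request(
    url, headers={"Connection": "keep-alive", "User-Agent": "Mozilla/5.0"}
)
custom_agent = req.get_header("User-agent")  # header names are stored capitalized

print(default_agent)
print(custom_agent)
```

The first printed value is the kind of identifier the server was rejecting; the second is what the request carries once the header is overridden.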
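The first answer, from @AnandSKumar, is acceptable, but a few more lines are useful here, because jsondata = response.read() returns a string that still has to be parsed as JSON.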

import urllib2
import json
import numpy as np
import matplotlib.pyplot as plt

# FROM this question: https://stackoverflow.com/a/32163003
# and THIS ANSWER: https://stackoverflow.com/a/32163003/3904031
# and a little from here: https://stackoverflow.com/a/6826511

url        = "http://lhr.data.fr24.com/_external/planedata_json.1.3.php?f=72c5ef5"

req        = urllib2.Request(url, headers={"Connection":"keep-alive", "User-Agent":"Mozilla/5.0"})

response   = urllib2.urlopen(req)

the_dict   = json.loads(response.read())

trail      = the_dict['trail']

trailarray = np.array(trail)


s0, s1 = len(trailarray) // 3, 3  # integer division: number of complete (lat, lon, alt) triples

lat, lon, alt = trailarray[:s0*s1].reshape(s0,s1).T

alt *= 10.  # they drop the last zero


# plot raw data of the trail. Note there are gaps - no time information here

plt.figure()

plt.subplot(2,2,1)

plt.plot(lat)
# no plt.hold needed: consecutive plot() calls draw on the same axes
plt.plot(lon)
plt.title('raw lat lon')

plt.subplot(2,2,3)
plt.plot(alt)
plt.title('raw alt')

plt.subplot(1,2,2)
plt.plot(lon, lat)
plt.title('raw lat vs lon')
plt.text(-40, 46, "this segment is")
plt.text(-40, 45.5, "transatlantic")
plt.text(-40, 45, "gap in data")

plt.savefig('raw lat lon alt')
plt.show()
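
The reshape step can be tried without a network connection. The sketch below assumes, as the code above does, that the trail is a flat list of repeating latitude, longitude, altitude values; the sample numbers are made up for illustration and are not real FlightRadar24 data.

```python
import numpy as np

# Hypothetical flat trail: lat, lon, alt repeated; the final value is an incomplete record.
trail = [51.47, -0.45, 0.0, 51.60, -1.20, 1200.0, 51.90, -3.10, 3500.0, 52.10]

trailarray = np.array(trail, dtype=float)
n_points = len(trailarray) // 3  # number of complete (lat, lon, alt) triples

# Drop the incomplete tail, view the rest as rows of three, then transpose into columns.
lat, lon, alt = trailarray[:n_points * 3].reshape(n_points, 3).T
alt = alt * 10.0  # the feed is assumed to drop the last zero of the altitude

print(lat, lon, alt)
```

Slicing to a multiple of three before reshaping is what keeps a truncated final record from raising a ValueError.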
To convert the time and date information to human-readable form:

import datetime

def humanize(seconds_since_epoch):
    """ from https://stackoverflow.com/a/15953715/3904031 """
    return datetime.datetime.fromtimestamp(seconds_since_epoch).strftime('%Y-%m-%d %H:%M:%S')

humanize(the_dict['arrival'])
returns

'2015-08-20 17:43:50'
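
Note that datetime.datetime.fromtimestamp converts to the local time zone of the machine running the script, so the string above depends on where it was run. A minimal sketch of a UTC variant follows (Python 3; the name humanize_utc is hypothetical and not from the original answer).

```python
import datetime

def humanize_utc(seconds_since_epoch):
    """Format a Unix timestamp as a UTC date-time string."""
    moment = datetime.datetime.fromtimestamp(
        seconds_since_epoch, tz=datetime.timezone.utc
    )
    return moment.strftime('%Y-%m-%d %H:%M:%S')

print(humanize_utc(0))  # 1970-01-01 00:00:00
```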

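Great! This works, thanks a lot! Time to learn about user agents. I added the cookie handling when my first attempt produced a message complaining that cookies were not enabled. I wonder whether the user agent here simply accepts cookies automatically.
I don't think so. You may have done something else that caused that error, but I'm not sure.
That's quite likely. Thanks for the quick reply, @AnandSKumar!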