Python cannot pull data from the Census API because it rejects the call
I am trying to run this script to pull data from the US Census, but the Census API is rejecting my requests. I have done some work on it, but I am stumped. Any ideas on how to handle this?
import pandas as pd
import requests
from pandas.compat import StringIO
#Sourced from the following site https://github.com/mortada/fredapi
from fredapi import Fred
fred = Fred(api_key='xxxx')
import StringIO
import datetime
import sys
if sys.version_info[0] < 3:
    from StringIO import StringIO as stio
else:
    from io import StringIO as stio
year_list = '2013','2014','2015','2016','2017'
month_list = '01','02','03','04','05','06','07','08','09','10','11','12'
#############################################
#Get the total exports from the United States
#############################################
exports = pd.DataFrame()
for i in year_list:
    for s in month_list:
        try:
            link="https://api.census.gov/data/timeseries/intltrade/exports/hs?get=CTY_CODE,CTY_NAME,ALL_VAL_MO,ALL_VAL_YR&time="
            str1 = ''.join([i])
            txt = '-'
            str2 = ''.join([s])
            total_link=link+str1+txt+str2
            r = requests.get(total_link, headers = {'User-agent': 'your bot 0.1'})
            df = pd.read_csv(StringIO(r.text))
            ##################### change starts here #####################
            ##################### since it is a dataframe itself, so the method to create a dataframe from a list won't work ########################
            # Drop the total sales line
            df.drop(df.index[0])
            # Rename Column name
            df.columns=['CTY_CODE','CTY_NAME','EXPORT MTH','EXPORT YR','time','UN']
            # Change the ["1234" to 1234
            df['CTY_CODE']=df['CTY_CODE'].str[2:-1]
            # Change the 2017-01] to 2017-01
            df['time']=df['time'].str[:-1]
            ##################### change ends here #####################
            exports = exports.append(df, ignore_index=False)
        except:
            print i
            print s
Here you go:
import ast
import itertools

import pandas as pd
import requests

base = "https://api.census.gov/data/timeseries/intltrade/exports/hs?get=CTY_CODE,CTY_NAME,ALL_VAL_MO,ALL_VAL_YR&time="
year_list = ['2013','2014','2015','2016','2017']
month_list = ['01','02','03','04','05','06','07','08','09','10','11','12']
exports = []
rejects = []
for year, month in itertools.product(year_list, month_list):
    url = '%s%s-%s' % (base, year, month)
    r = requests.get(url, headers={'User-agent': 'your bot 0.1'})
    if r.text:
        r = ast.literal_eval(r.text)
        df = pd.DataFrame(r[2:], columns=r[0])
        exports.append(df)
    else:
        rejects.append((int(year), int(month)))
exports = pd.concat(exports).reset_index().drop('index', axis=1)
Your result looks like this:
CTY_CODE CTY_NAME ALL_VAL_MO ALL_VAL_YR time
0 1010 GREENLAND 233446 233446 2013-01
1 1220 CANADA 23170845914 23170845914 2013-01
2 2010 MEXICO 17902453702 17902453702 2013-01
3 2050 GUATEMALA 425978783 425978783 2013-01
4 2080 BELIZE 17795867 17795867 2013-01
5 2110 EL SALVADOR 207606613 207606613 2013-01
6 2150 HONDURAS 429806151 429806151 2013-01
7 2190 NICARAGUA 75752432 75752432 2013-01
8 2230 COSTA RICA 598484187 598484187 2013-01
9 2250 PANAMA 1046236431 1046236431 2013-01
10 2320 BERMUDA 47156737 47156737 2013-01
11 2360 BAHAMAS 256292297 256292297 2013-01
... ... ... ... ...
13883 0024 LAFTA 27790655209 193139639307 2017-07
13884 0025 EURO AREA 15994685459 121039479852 2017-07
13885 0026 APEC 76654291110 550552655105 2017-07
13886 0027 ASEAN 6030380132 44558200533 2017-07
13887 0028 CACM 2133048149 13333440411 2017-07
13888 1XXX NORTH AMERICA 41622877949 299981278306 2017-07
13889 2XXX CENTRAL AMERICA 4697852283 30756310800 2017-07
13890 3XXX SOUTH AMERICA 8117215081 55039567414 2017-07
13891 4XXX EUROPE 25201247938 189925038230 2017-07
13892 5XXX ASIA 38329181070 274304503490 2017-07
13893 6XXX AUSTRALIA AND OC... 2389798925 16656777753 2017-07
13894 7XXX AFRICA 1809443365 13022520158 2017-07
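Walkthrough:
- Iterate over the product of the (year, month) combinations with itertools.product, joining each one to your base url
- If the text of the response object is not empty (periods such as 2017-12 will be empty), create a DataFrame from the literally evaluated text, which is a list of lists. Use the first element as the columns and ignore the second element
- Otherwise, add the (year, month) combination to rejects, which is a list of tuples for the periods that were not found
- I used exports = [] because concatenating a list of DataFrames is more efficient than appending to an existing DataFrame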
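The steps above can be sketched offline as follows. This is a minimal variant, not the answer's exact code: it parses the body with json.loads instead of ast.literal_eval, on the assumption that the body is a JSON list of lists, and it replaces the live requests.get call with made-up sample bodies so that it runs without network access. The helper names build_urls, parse_payload and collect are hypothetical, and the sample values are invented placeholders.

```python
import itertools
import json

import pandas as pd

BASE = ("https://api.census.gov/data/timeseries/intltrade/exports/hs"
        "?get=CTY_CODE,CTY_NAME,ALL_VAL_MO,ALL_VAL_YR&time=")


def build_urls(years, months):
    """Map each (year, month) pair to its request URL."""
    return {(y, m): "%s%s-%s" % (BASE, y, m)
            for y, m in itertools.product(years, months)}


def parse_payload(text):
    """Turn a response body into a DataFrame, or None when the body is empty.

    Assumes the body is a JSON list of lists: header row first, then a
    grand-total row (skipped, as in the answer), then one row per country.
    """
    if not text:
        return None
    rows = json.loads(text)
    return pd.DataFrame(rows[2:], columns=rows[0])


def collect(bodies):
    """Combine response bodies keyed by (year, month) into one frame.

    Returns the combined frame and the list of periods that had no data.
    """
    frames, rejects = [], []
    for (year, month), text in bodies.items():
        df = parse_payload(text)
        if df is None:
            rejects.append((int(year), int(month)))
        else:
            frames.append(df)
    # One concat at the end instead of growing a DataFrame inside the loop.
    exports = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return exports, rejects


# Made-up sample bodies standing in for r.text, so this runs offline.
sample = {
    ("2013", "01"): json.dumps([
        ["CTY_CODE", "CTY_NAME", "ALL_VAL_MO", "ALL_VAL_YR", "time"],
        ["-", "TOTAL FOR ALL COUNTRIES", "300", "300", "2013-01"],
        ["1010", "GREENLAND", "100", "100", "2013-01"],
        ["1220", "CANADA", "200", "200", "2013-01"],
    ]),
    ("2017", "12"): "",  # an empty body, as for a period with no data
}
exports, rejects = collect(sample)
```

With live requests, each body would come from requests.get(url, headers=...).text for the URLs returned by build_urls. Collecting frames in a list also matters for compatibility: DataFrame.append was removed in pandas 2.0, so the append-in-a-loop pattern from the question no longer runs on current pandas versions.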