Python error: invalid literal for int()
I'm new to Python. I'm writing a script that pulls some data from a website and plots it. However, my code fails with an error saying the data type is wrong. Specifically, I have decimal values for "value" and dates for "year". I tried to convert them, but I think I put the conversion in the wrong place. Any help would be appreciated. The code is below.
import numpy as np
import pandas as pd
import json
import matplotlib.pyplot as mp
from IPython.display import HTML
import getpass
import requests


def frame(url, height=400, width=100):
    # Embed the given page in the notebook as an iframe.
    display_string = '<iframe src={url} width={w} height={h}></iframe>'.format(
        url=url, w=width, h=height)
    return HTML(display_string)


frame('https://data.bls.gov/registrationEngine/')
registration_key = getpass.getpass('Enter Registration Key: ')
series = 'MPU4900012'
frame('https://api.bls.gov/publicAPI/v1/timeseries/data/')


def capture_series(series, start, end, key=registration_key):
    # POST the series request to the API and return the decoded JSON response.
    url = 'https://api.bls.gov/publicAPI/v2/timeseries/data/'
    url += '?registrationkey={key}'.format(key=key)
    data = json.dumps({
        "seriesid": [series],
        "startyear": str(start),
        "endyear": str(end)
    })
    headers = {
        "Content-type": "application/json"
    }
    result = requests.post(url, data=data, headers=headers)
    return json.loads(result.text)


json_data = capture_series(series, 1987, 2016)
json_data

df_data = pd.DataFrame(json_data['Results']['series'][0]['data'])
print(df_data)

# Both columns arrive as strings, so go through float before int.
df_sub = df_data[['value', 'year']].astype(float).astype(int)
df_sub.set_index('year', inplace=True)
df_sub.sort_index(inplace=True)
df_sub

x = df_sub.index
y = df_sub['value']
mp.plot(x, y)
mp.title('Major Sector Multifactor Productivity')
mp.xlabel('years')
mp.ylabel('values')
mp.show()  # call show(); a bare mp.show does nothing
The error log shows this (using Jupyter with Python 3, for reference):
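ValueError                                Traceback (most recent call last)
<ipython-input-101-8ee6d83ca777> in <module>()
     41 print(df_data)
     42
---> 43 df_sub = df_data[['value', 'year']].astype(int)
     44 df_sub.set_index('year', inplace=True)
     45 df_sub.sort_index(inplace=True)
...
ValueError: invalid literal for int() with base 10: '86.244'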
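OK, I mocked up your example.
I think the value column is of type str, which means you need to call .astype(float) first.
Here (mock session shown further down):
Can you post the error log?
What does print(df_data[['value', 'year']]) show? The ValueError says you are trying to convert '86.244' to an integer; the conversion should go to float.
Yes, the DataFrame looks like it was built from JSON, since you are using requests. The decimal values in that JSON come back as str.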
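To make the comments above concrete, here is a minimal sketch in plain Python. The payload is a hypothetical single record shaped like the data in the question, not actual API output. It shows that a quoted decimal decodes to str, that int() rejects such a string, and that float() accepts it.

```python
import json

# Hypothetical record; the field names mirror the question's columns.
payload = '{"value": "86.244", "year": "1996"}'
record = json.loads(payload)

value_text = record["value"]
print(type(value_text).__name__)  # the quoted decimal decodes to a str

try:
    int(value_text)  # int() cannot parse a string with a decimal point
except ValueError as exc:
    message = str(exc)
    print(message)

as_float = float(value_text)  # float() parses the decimal string
as_int = int(as_float)        # int() on a float truncates toward zero
print(as_float, as_int)
```

Note that the final int() step drops the fractional part, which is the same thing .astype(float).astype(int) does to the whole column.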
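Hi @James Schinner, you are right. I didn't realize you could chain multiple astype() calls on a DataFrame, and adding .astype(float).astype(int) works. I have edited the code in the question to show the change. Thanks for your help!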
  footnotes period periodName   value  year
0      [{}]    A01     Annual  86.244  1996
1      [{}]    A01     Annual  84.713  1995
2      [{}]    A01     Annual  85.141  1994
3      [{}]    A01     Annual  84.688  1993
4      [{}]    A01     Annual  85.037  1992
5      [{}]    A01     Annual  82.280  1991
6      [{}]    A01     Annual  82.625  1990
7      [{}]    A01     Annual  81.965  1989
8      [{}]    A01     Annual  81.587  1988
9      [{}]    A01     Annual  80.816  1987
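The printout above looks numeric because pandas prints strings without quotes. One way to check what a column really holds is sketched below, using a hypothetical two-row sample rather than the real response.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical two-row sample with string values, as in the question.
df_data = pd.DataFrame({"value": ["86.244", "84.713"],
                        "year": ["1996", "1995"]})

print(df_data)  # prints 86.244 with no quotes, so it looks like a number

# Ask pandas whether the column has a numeric dtype.
value_is_numeric = is_numeric_dtype(df_data["value"])
print(value_is_numeric)

# Inspect one element directly to see its Python type.
first_value = df_data["value"].iloc[0]
print(type(first_value).__name__)
```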
ValueError Traceback (most recent call last)
<ipython-input-101-8ee6d83ca777> in <module>()
41 print(df_data)
42
---> 43 df_sub = df_data[['value', 'year']].astype(int)
44 df_sub.set_index('year', inplace=True)
45 df_sub.sort_index(inplace=True)
...
ValueError: invalid literal for int() with base 10: '86.244'
>>> data = {'value': {0: '84.713', 1: '85.141', 2: '84.688', 3: '85.037',
...                   4: '82.280', 5: '82.625', 6: '81.965', 7: '81.587', 8: '80.816'},
...         'year': {0: '1995', 1: '1994', 2: '1993', 3: '1992', 4: '1991',
...                  5: '1990', 6: '1989', 7: '1988', 8: '1987'}}
>>> df = pd.DataFrame(data)
>>> df
    value  year
0  84.713  1995
1  85.141  1994
2  84.688  1993
3  85.037  1992
4  82.280  1991
5  82.625  1990
6  81.965  1989
7  81.587  1988
8  80.816  1987
>>> df['value'].astype(int)  # <- replicating the error
Traceback (most recent call last):
ValueError: invalid literal for int() with base 10: '84.713'
>>> df['value'].astype(float).astype(int) # <= HERE
0 84
1 85
2 84
3 85
4 82
5 82
6 81
7 81
8 80
Name: value, dtype: int32
df[['value', 'year']].astype(float).astype(int)
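Chaining .astype(float).astype(int) over both columns also truncates value (86.244 becomes 86). If the decimals matter for the plot, one possible variation is to give each column its own dtype by passing a dict to astype. This is a sketch under that assumption, not the fix the asker adopted. The three sample rows are taken from the mock data above.

```python
import pandas as pd

# Three rows from the mock data, still stored as strings.
df = pd.DataFrame({
    "value": ["84.713", "85.141", "84.688"],
    "year": ["1995", "1994", "1993"],
})

# Convert each column separately: value keeps its decimals, year becomes an integer.
df_sub = df.astype({"value": float, "year": int})

# Index by year and sort so the plot runs from oldest to newest.
df_sub = df_sub.set_index("year").sort_index()
print(df_sub)
```

From here, x = df_sub.index and y = df_sub['value'] can be passed to mp.plot as in the question.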