Python: How to replace NaN values in columns with calculated CAGR values
I have a DataFrame with NaN values. I want to replace the NaN values with values projected from each column's CAGR.
val1 val2 val3 val4 val5
0 100 100 100 100 100
1 90 110 80 110 50
2 70 150 70 NaN NaN
3 NaN NaN NaN NaN NaN
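CAGR (compound annual growth rate)
= (end value / start value) ** (1 / number of years)
For example, the CAGR of val1 is -23%, so the last value of val1 is 53.9.
For column val4, the CAGR is 10%,
so the NaN in row 2 becomes 121 and the NaN in row 3 is replaced with 133.
How can I replace the NaNs automatically?
My questions are:
1) How do I calculate the CAGR for each column?
I used isnull(), so I can find which rows are empty, but I don't know how to calculate the CAGR.
2) How do I replace the NaNs with the calculated values?
Thanks, everyone.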
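As a quick check of the val4 arithmetic, here is a minimal sketch. `growth_factor` is a hypothetical helper name, not a library function, and it returns the per-year multiplier (about 1.10 for 10% growth) rather than a percentage.

```python
def growth_factor(first_value, last_value, n_years):
    """Per-year multiplier: (last / first) ** (1 / n_years)."""
    return (last_value / first_value) ** (1 / n_years)

# val4 goes from 100 (row 0) to 110 (row 1), i.e. one year elapsed
factor = growth_factor(100, 110, 1)

# each missing row is the previous row multiplied by the factor
row2 = 110 * factor
row3 = row2 * factor
print(round(factor, 4), round(row2, 2), round(row3, 2))
```

Subtracting 1 from the multiplier gives the growth rate as a fraction, which is the form usually quoted as a percentage.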
import numpy as np

# whitespace-delimited data
a = '''100 100 100 100 100
90 110 80 110 50
70 150 70 NaN NaN
NaN NaN NaN NaN NaN
'''

# parse into a float NumPy array; float('NaN') yields np.nan
data = np.array([[float(x) for x in line.split()] for line in a.splitlines()])

for col in range(data.shape[1]):
    # row index of the last non-NaN value, which is also the number of elapsed years
    # (assumes the column has at least one NaN and at least two leading values)
    n_years = np.isnan(data[:, col]).argmax() - 1
    end_value = data[n_years, col]
    # per-year growth factor: (end value / start value) ** (1 / number of years)
    cagr = (end_value / data[0, col]) ** (1 / n_years)
    print(n_years, end_value, cagr)
    # fill the NaN rows top to bottom, each from the row above it
    for year in np.flatnonzero(np.isnan(data[:, col])):
        data[year, col] = data[year - 1, col] * cagr

print(data)
I get:
[[ 100. 100. 100. 100. 100. ]
[ 90. 110. 80. 110. 50. ]
[ 70. 150. 70. 121. 25. ]
[ 58.56620186 183.71173071 58.56620186 133.1 12.5 ]]
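Since the question starts from a DataFrame, the same idea can also be sketched directly in pandas. This is an illustrative variant, not part of the original answer: `fill_with_cagr` is a hypothetical helper, the frame is rebuilt by hand from the sample table, and it assumes the NaNs in each column are trailing (any NaN before the last observed value is left untouched).

```python
import numpy as np
import pandas as pd

# sample data from the question, rebuilt by hand
df = pd.DataFrame({
    "val1": [100, 90, 70, np.nan],
    "val2": [100, 110, 150, np.nan],
    "val3": [100, 80, 70, np.nan],
    "val4": [100, 110, np.nan, np.nan],
    "val5": [100, 50, np.nan, np.nan],
})

def fill_with_cagr(col):
    """Extend a column past its last observed value using its own growth factor."""
    values = col.to_numpy(dtype=float, copy=True)
    observed = np.flatnonzero(~np.isnan(values))
    if len(observed) < 2:
        return col  # not enough data to estimate a growth factor
    first, last = observed[0], observed[-1]
    # per-year multiplier: (end value / start value) ** (1 / number of years)
    factor = (values[last] / values[first]) ** (1 / (last - first))
    # k rows after the last observation: last value * factor ** k
    steps = np.arange(1, len(values) - last)
    values[last + 1:] = values[last] * factor ** steps
    return pd.Series(values, index=col.index)

filled = df.apply(fill_with_cagr)
print(filled.round(2))
```

Computing each filled row as `factor ** k` from the last observed value avoids the row-by-row loop, while giving the same projection as multiplying the previous row by the factor.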