Python: calculating an index using a base year
Given the DataFrame df below, how can I calculate a yearly index column for each fruit that restarts after each true index value? The base year is given by the rows where index_value == 100.
fruit   year  price  index_value  Boolean  index
apple   1960  11
apple   1961  12     100          True
apple   1962  13
apple   1963  13     100          True
banana  1960  11
banana  1961  12
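I tried:

df['index'] = df.groupby('fruit')['price'].apply(lambda x: (x/x.iloc[0] * 100).round(0))

Expected output:

fruit   year  price  index_value  Boolean  index
apple   1960  11
apple   1961  12     100          True     100
apple   1962  13                           108
apple   1963  13     100          True     100
apple   1964  11                           84
banana  1961  12

I took the liberty of adding a row for apple 1964 11 to your input data so that it matches your output example. The Boolean column is redundant, so it is dropped here.

To get the desired output, first create subgroups for the values that follow each given index value. The adjusted input: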
fruit year price index_value
0 apple 1960 11 NaN
1 apple 1961 12 100.0
2 apple 1962 13 NaN
3 apple 1963 13 100.0
4 apple 1964 11 NaN
5 banana 1960 11 NaN
6 banana 1961 12 NaN
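To run the steps that follow, the table printed above can be rebuilt as a DataFrame. This is only a minimal sketch: the dtypes are assumed from the printout (integers for year and price, floats with NaN for index_value).

```python
import numpy as np
import pandas as pd

# Rebuild the adjusted input table; NaN marks years without a given index value.
df = pd.DataFrame({
    "fruit": ["apple"] * 5 + ["banana"] * 2,
    "year": [1960, 1961, 1962, 1963, 1964, 1960, 1961],
    "price": [11, 12, 13, 13, 11, 11, 12],
    "index_value": [np.nan, 100, np.nan, 100, np.nan, np.nan, np.nan],
})
print(df)
```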
To do this, count the non-null index_value rows cumulatively within each fruit. Every row from one base year up to the next then shares a group number, and rows before the first base year get 0. The percentage change from the index value can then be calculated within each group.

df['groups'] = df.index_value.notna().groupby(df.fruit).cumsum().astype('int')
print(df)

Output:
fruit year price index_value groups
0 apple 1960 11 NaN 0
1 apple 1961 12 100.0 1
2 apple 1962 13 NaN 1
3 apple 1963 13 100.0 2
4 apple 1964 11 NaN 2
5 banana 1960 11 NaN 0
6 banana 1961 12 NaN 0
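The group numbering relies on a cumulative sum over a boolean mask. A small sketch on a made-up mask for a single fruit shows the effect; the names is_base and group_id are illustrative and do not come from the answer.

```python
import pandas as pd

# Hypothetical toy mask: True marks a row that carries a base-year index value.
is_base = pd.Series([False, True, False, True, False])

# True counts as 1 and False as 0, so the running total steps up at each base row.
group_id = is_base.cumsum()
print(group_id.tolist())  # [0, 1, 1, 2, 2]
```

Rows labelled 0 come before any base year, which is why they are excluded from the index calculation.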
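Then calculate each price relative to the first price of its group, skipping group 0:

import numpy as np

df['index_change'] = (
    df[df.groups.ne(0)]
    .groupby(['fruit', 'groups'])['price']
    # transform returns a result aligned with the original row index,
    # so the assignment lines up and the skipped rows stay NaN
    .transform(lambda x: np.floor(x / x.iloc[0] * 100))
)
print(df)

Output: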
fruit year price index_value groups index_change
0 apple 1960 11 NaN 0 NaN
1 apple 1961 12 100.0 1 100.0
2 apple 1962 13 NaN 1 108.0
3 apple 1963 13 100.0 2 100.0
4 apple 1964 11 NaN 2 84.0
5 banana 1960 11 NaN 0 NaN
6 banana 1961 12 NaN 0 NaN
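As a possible alternative that avoids the helper groups column, the base-year price can be forward-filled within each fruit and used as the divisor. This is a sketch rather than the answer's method: base_price is an illustrative name, and the code assumes every given index_value is 100, as in the example data.

```python
import numpy as np
import pandas as pd

# Same adjusted input as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({
    "fruit": ["apple"] * 5 + ["banana"] * 2,
    "year": [1960, 1961, 1962, 1963, 1964, 1960, 1961],
    "price": [11, 12, 13, 13, 11, 11, 12],
    "index_value": [np.nan, 100, np.nan, 100, np.nan, np.nan, np.nan],
})

# Keep the price only on base-year rows, then carry it forward within each fruit.
base_price = (
    df["price"]
    .where(df["index_value"].notna())
    .groupby(df["fruit"])
    .ffill()
)

# Rows before a fruit's first base year have no base price and stay NaN.
df["index_change"] = np.floor(df["price"] / base_price * 100)
print(df)
```

On the example data this should give the same index_change column as the grouped version, without storing an intermediate group label.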