Insert a new row at the end of each group in Python (with a datetime index)
I'm very new to Python, and I currently have a table like the following:
**YearMonth** language Rate
2018-01 en 0.093
2018-02 en 0.084
2018-03 en 0.088
...
2018-12 en 0.079
2019-01 en 0.088
2018-01 fr 0.094
2018-02 fr 0.078
2018-03 fr 0.087
...
2018-12 fr 0.084
2019-01 fr 0.079
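Now I want to insert some rows at the end of each language, based on a condition, e.g.:
2019-02, en, some value (if the 2018-02 value > 0.9, then mean(value of the previous 3 months / 3), otherwise mean(value of the previous 3 months / 4))
2019-02, fr, some value (if the 2018-02 value > 0.9, then mean(value of the previous 3 months / 4), otherwise mean(value of the previous 3 months / 5))
How should I approach this? Thanks.
One approach, if the mean of the last 3 rows of each group is what you need: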
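import numpy as np
# df is assumed to hold the table above, with YearMonth as a regular column
# (call df.reset_index() first if YearMonth is the index)
# divisors per language: dTrue when the condition holds, dFalse otherwise
dTrue = {'en': 3, 'fr': 4}
dFalse = {'en': 4, 'fr': 5}
# mean of the last 3 values of each group
s = df.groupby('language')['Rate'].apply(lambda x: x.iloc[-3:].mean())
print(s)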
language
en 0.085000
fr 0.083333
Name: Rate, dtype: float64
# take the 2018-02 row of each language and relabel it as the new 2019-02 row
df1 = df[df['YearMonth'] == '2018-02'].assign(YearMonth='2019-02')
print(df1)
YearMonth language Rate
1 2019-02 en 0.084
6 2019-02 fr 0.078
# pick the divisor per row: dTrue if the 2018-02 Rate exceeds 0.9, else dFalse
div = np.where(df1['Rate'] > 0.9, df1['language'].map(dTrue), df1['language'].map(dFalse))
print(div)
[4 5]
# look up each language's mean in s and divide it by the chosen divisor
df1['Rate'] = df1['language'].map(s) / div
print(df1)
YearMonth language Rate
1 2019-02 en 0.021250
6 2019-02 fr 0.016667
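The steps above produce the new rows in df1 but do not yet put them back into the original table. A minimal sketch of that last step follows, assuming YearMonth is a plain string column in YYYY-MM form (so it sorts chronologically as text) and using a sample frame built only from the rows visible in the question; the months elided by "..." are left out, and the variable name out is chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data: only the rows shown in the question (elided months omitted).
df = pd.DataFrame({
    'YearMonth': ['2018-01', '2018-02', '2018-03', '2018-12', '2019-01'] * 2,
    'language': ['en'] * 5 + ['fr'] * 5,
    'Rate': [0.093, 0.084, 0.088, 0.079, 0.088,
             0.094, 0.078, 0.087, 0.084, 0.079],
})

# Divisors per language, as in the answer above.
dTrue = {'en': 3, 'fr': 4}
dFalse = {'en': 4, 'fr': 5}

# Mean of the last 3 rates per language.
s = df.groupby('language')['Rate'].apply(lambda x: x.iloc[-3:].mean())

# Build the new 2019-02 rows from the 2018-02 rows.
df1 = df[df['YearMonth'] == '2018-02'].assign(YearMonth='2019-02')
div = np.where(df1['Rate'] > 0.9,
               df1['language'].map(dTrue),
               df1['language'].map(dFalse))
df1['Rate'] = df1['language'].map(s) / div

# Append the new rows, then sort so that each new row ends up
# at the end of its language group.
out = (pd.concat([df, df1], ignore_index=True)
         .sort_values(['language', 'YearMonth'])
         .reset_index(drop=True))
print(out)
```

Sorting by language and then YearMonth is one way to place the appended rows; if YearMonth is actually the index, calling df.reset_index() before these steps and set_index('YearMonth') afterwards should give the same layout.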