Python: mapping values onto a DataFrame by two columns
Suppose I am trying to map the total number of cases onto each case, where the DataFrames are:
results_grouped_age = results_grouped[['Make', 'age', 'Test Result', 'Number of Cases']].copy()
cases_by_age = results_grouped_age[['Make','age','Test Result','Number of Cases']].groupby(['Make','age','Test Result']).sum().reset_index()
total_cases_age = cases_by_age.groupby(['Make','age'])['Number of Cases'].sum()
However, whereas I would normally do:
cases_by_age['Total Cases'] = cases_by_age['age'].map(total_cases_age)
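the index of total_cases_age is actually the combination of Make and age, and that combination is what I want to map on. To make my question easier to understand, suppose I have the table cases_by_age.
The final result should look like this: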
   Make        age  Test Result  Number of Cases  Total Cases by Age
0  ALFA ROMEO  0-3  ABA                        1                 335
1  ALFA ROMEO  0-3  ABR                      NaN                 335
2  ALFA ROMEO  0-3  F                         45                 335
3  ALFA ROMEO  0-3  P                        268                 335
4  ALFA ROMEO  0-3  PRS                       21                 335
5  ALFA ROMEO  3-5  ABA                      NaN                 879
6  ALFA ROMEO  3-5  ABR                      NaN                 879
7  ALFA ROMEO  3-5  F                        159                 879
8  ALFA ROMEO  3-5  P                        720                 879
and so on for every Make and age.
Any help would be greatly appreciated.

You can do a groupby-sum, followed by a left merge:
pd.merge(
    df,
    df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
        columns={'Number of Cases': 'Total Cases by Age'}),
    how='left')
Example
Suppose you start with
import pandas as pd

df = pd.DataFrame({
    'Make': ['ALPHA ROMEO'] * 3,
    'age': ['0-3', '0-3', '3-5'],
    'Number of Cases': [1, 10, 2]
})
>>> df
          Make  Number of Cases  age
0  ALPHA ROMEO                1  0-3
1  ALPHA ROMEO               10  0-3
2  ALPHA ROMEO                2  3-5
Then the groupby-sum gives:
>>> df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
        columns={'Number of Cases': 'Total Cases by Age'})
   age  Total Cases by Age
0  0-3                  11
1  3-5                   2
and the merge gives:
>>> pd.merge(
        df,
        df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
            columns={'Number of Cases': 'Total Cases by Age'}),
        how='left')
          Make  Number of Cases  age  Total Cases by Age
0  ALPHA ROMEO                1  0-3                  11
1  ALPHA ROMEO               10  0-3                  11
2  ALPHA ROMEO                2  3-5                   2
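Thanks for your answer, but the idea here is to sum by both the vehicle's Make and its age. In your example, the groupby-sum would have to find the total for all vehicles of a given Make and age, and then map that value next to every row with that Make and age (regardless of Test Result). Actually, never mind: I tried grouping by both columns and it works. Thanks!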
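For readers who want to see the two-column version spelled out, here is a minimal sketch. It assumes the sample rows from the table in the question, with NaN meaning no cases were recorded. It is not necessarily the exact code the asker ended up with. It shows two possible routes: a groupby-sum joined back on both keys with `DataFrame.join`, and `groupby(...).transform('sum')`, which skips the separate merge step.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the table in the question (NaN = no cases recorded).
cases_by_age = pd.DataFrame({
    'Make': ['ALFA ROMEO'] * 9,
    'age': ['0-3'] * 5 + ['3-5'] * 4,
    'Test Result': ['ABA', 'ABR', 'F', 'P', 'PRS', 'ABA', 'ABR', 'F', 'P'],
    'Number of Cases': [1, np.nan, 45, 268, 21, np.nan, np.nan, 159, 720],
})

# Option 1: groupby-sum over both keys, then left-join on those keys.
# The summed Series has a (Make, age) MultiIndex, which join() matches
# against the two columns named in `on`.
total_cases_age = (
    cases_by_age.groupby(['Make', 'age'])['Number of Cases']
    .sum()
    .rename('Total Cases by Age')
)
joined = cases_by_age.join(total_cases_age, on=['Make', 'age'])

# Option 2: transform broadcasts each group's sum back onto its own rows,
# so no merge or join is needed.
cases_by_age['Total Cases by Age'] = (
    cases_by_age.groupby(['Make', 'age'])['Number of Cases'].transform('sum')
)

print(joined)
print(cases_by_age)
```

Both routes ignore NaN when summing, so every row of a (Make, age) group receives the same total even where its own Number of Cases is missing.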