Python pandas: subtract values in two DataFrames with the same columns and create a new DataFrame to store the result
I am trying to create a new DataFrame, new_df, with a new column that holds the difference between the values of the same column in two separate DataFrames, df1 and df2.
I tried the code new_df.loc['difference'] = df1.loc['s_values'] - df2.loc['s_values'], but it does not give me the result I want.
Here df1 =
             stats          s_values
gender year
women  2007  height         40
       2007  cigarette use  31
and df2 =
             stats          s_values
gender year
Men    2007  height         10
       2007  cigarette use  11
The expected output is as follows (I do not want to include the gender index):
new_df =
      stats          difference
year
2007  height         30
2007  cigarette use  20
You can try the following (full example):
Input:
import pandas as pd
df1 = pd.DataFrame({'gender': {0: 'woman', 1: 'woman'},
                    'year': {0: 2007, 1: 2007},
                    'stats': {0: 'height', 1: 'cigarette use'},
                    's_values': {0: 40, 1: 31}})
df2 = pd.DataFrame({'gender': {0: 'men', 1: 'men'},
                    'year': {0: 2007, 1: 2007},
                    'stats': {0: 'height', 1: 'cigarette use'},
                    's_values': {0: 10, 1: 11}})
Code:
# Stack both frames, then take the difference within each (year, stats) pair.
df = pd.concat([df1, df2], ignore_index=True)
# diff() yields the later row minus the earlier one (df2 - df1); abs() discards the sign.
df['s_values'] = df.groupby(['year', 'stats'])['s_values'].diff().abs()
# The first row of each group has no predecessor (NaN), so drop it along with 'gender'.
df.dropna(subset=['s_values']).drop('gender', axis=1)
Output:
   year          stats  s_values
2  2007         height      30.0
3  2007  cigarette use      20.0
Note:
If both DataFrames have exactly the same structure (the same rows in the same order), this can be even shorter:
df1.drop('gender', axis=1).assign(s_values=df1['s_values'] - df2['s_values'])
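The one-liner above pairs rows by their position in the index, so it only works when both frames list their rows in the same order. As a hedged alternative (not part of the original answer), here is a minimal sketch that pairs rows on the 'year' and 'stats' keys with merge instead. It keeps the sign of df1 minus df2. The suffixes '_1' and '_2' and the deliberately shuffled row order of df2 are assumptions made for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'gender': ['woman', 'woman'],
                    'year': [2007, 2007],
                    'stats': ['height', 'cigarette use'],
                    's_values': [40, 31]})
# df2 is deliberately listed in a different row order than df1 (illustrative assumption).
df2 = pd.DataFrame({'gender': ['men', 'men'],
                    'year': [2007, 2007],
                    'stats': ['cigarette use', 'height'],
                    's_values': [11, 10]})

# Pair rows by their keys rather than by position.
merged = df1.merge(df2, on=['year', 'stats'], suffixes=('_1', '_2'))

# Signed difference: df1 minus df2.
merged['difference'] = merged['s_values_1'] - merged['s_values_2']

# Keep only the key columns and the result; 'gender' is left out.
new_df = merged[['year', 'stats', 'difference']]
print(new_df)
```

Because the merge is keyed, a row that exists in only one of the frames is dropped by the default inner join rather than silently paired with the wrong partner.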
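Hi, when I try new_df["year"] = df1["year"] I get a KeyError. I think this is because "year" is an index of df1 rather than a column?
@Morello Could you please provide df1 and df2 in a form that allows them to be reproduced?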
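Until reproducible frames are posted, here is a hedged sketch of the case the comment describes. It assumes that 'gender' and 'year' are index levels of df1 and df2, as the tables in the question suggest, and that each (year, stats) pair occurs only once per frame. The helper names left and right are made up for illustration.

```python
import pandas as pd

# Assumed layout: 'gender' and 'year' are index levels, 'stats' and 's_values' are columns.
df1 = pd.DataFrame({'gender': ['women', 'women'],
                    'year': [2007, 2007],
                    'stats': ['height', 'cigarette use'],
                    's_values': [40, 31]}).set_index(['gender', 'year'])
df2 = pd.DataFrame({'gender': ['Men', 'Men'],
                    'year': [2007, 2007],
                    'stats': ['height', 'cigarette use'],
                    's_values': [10, 11]}).set_index(['gender', 'year'])

# 'year' is an index level here, so df1['year'] raises KeyError.
# Drop the 'gender' level and index both frames by (year, stats) instead.
left = df1.droplevel('gender').set_index('stats', append=True)['s_values']
right = df2.droplevel('gender').set_index('stats', append=True)['s_values']

# The subtraction aligns on the shared (year, stats) index.
new_df = (left - right).rename('difference').reset_index('stats')
print(new_df)
```

With this layout the result keeps 'year' as its index and has the columns 'stats' and 'difference', which matches the shape requested in the question.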