Python: how to compare columns from different DataFrames
Tags: python, pandas
I have these two DataFrames. The first:
{'Category': {0: 'BASE2_TREE_FILTER vs RETAIL 100', 1: 'LR_TREE_FILTER vs RETAIL 100'}, 'Mean': {0: 4.859101849501094, 1: 3.349513603073975}, 'Absolute Mean': {0: 6.917727336706257, 1: 5.352618468237218}, 'Increase': {0: 13, 1: 13}, '%change(Increase)': {0: 9.059099374005655, 1: 6.693947747162456}, 'Decrease': {0: 7, 1: 7}, '%change(Decrease)': {0: -2.940893553150234, 1: -2.861578378804634}, 'unchanged': {0: 0, 1: 0}}
The second:
{'Category': {0: 'BASE2_TREE_FILTER vs RETAIL 100', 1: 'LR_TREE_FILTER vs RETAIL 100'}, 'Mean': {0: 4.947988913441173, 1: 4.494044038470856}, 'Absolute Mean': {0: 6.972378375288884, 1: 6.366948207708872}, 'Increase': {0: 26, 1: 26}, '%change(Increase)': {0: 8.252561969120809, 1: 7.519148478124428}, 'Decrease': {0: 9, 1: 9}, '%change(Decrease)': {0: -4.04877892369542, 1: -3.745808338476033}, 'unchanged': {0: 1, 1: 1}}
I need to compare the Absolute Mean of the two and return the values from whichever DataFrame has the lower Absolute Mean. How can I do this?
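Edit:
The number of rows may change in the future, so I am looking for a general solution.

You can use np.where, with a condition that checks which DataFrame has the smaller Absolute Mean.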
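For example, a solution could be:
import numpy as np
import pandas as pd
data1 = {'Category': {0: 'BASE2_TREE_FILTER vs RETAIL 100', 1: 'LR_TREE_FILTER vs RETAIL 100'}, 'Mean': {0: 4.859101849501094, 1: 3.349513603073975}, 'Absolute Mean': {0: 6.917727336706257, 1: 5.352618468237218}, 'Increase': {0: 13, 1: 13}, '%change(Increase)': {0: 9.059099374005655, 1: 6.693947747162456}, 'Decrease': {0: 7, 1: 7}, '%change(Decrease)': {0: -2.940893553150234, 1: -2.861578378804634}, 'unchanged': {0: 0, 1: 0}}
df1 = pd.DataFrame(data1)
data2 = {'Category': {0: 'BASE2_TREE_FILTER vs RETAIL 100', 1: 'LR_TREE_FILTER vs RETAIL 100'}, 'Mean': {0: 4.947988913441173, 1: 4.494044038470856}, 'Absolute Mean': {0: 6.972378375288884, 1: 6.366948207708872}, 'Increase': {0: 26, 1: 26}, '%change(Increase)': {0: 8.252561969120809, 1: 7.519148478124428}, 'Decrease': {0: 9, 1: 9}, '%change(Decrease)': {0: -4.04877892369542, 1: -3.745808338476033}, 'unchanged': {0: 1, 1: 1}}
df2 = pd.DataFrame(data2)
result = pd.DataFrame()
result['Category'] = df1['Category']
# The source is cut off after "np.where(df1['Absolute Mean']"; everything from the "<" onward,
# including the name of the third column, is reconstructed from the output shown with this answer.
smaller_in_df1 = df1['Absolute Mean'] < df2['Absolute Mean']
result['Data from'] = np.where(smaller_in_df1, 'df1', 'df2')
result['Min Absolute Mean'] = np.where(smaller_in_df1, df1['Absolute Mean'], df2['Absolute Mean'])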
                          Category  Data from  Min Absolute Mean
0  BASE2_TREE_FILTER vs RETAIL 100        df1           6.917727
1     LR_TREE_FILTER vs RETAIL 100        df1           5.352618
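
Because the question's edit notes that the number of rows may change, a comparison by row position could misalign if the two DataFrames ever list different categories. The following is a minimal sketch of one way to guard against that, assuming rows should be matched on Category, that Category values are unique, and that only categories present in both DataFrames matter (none of which the question states). The trimmed two-column frames and the Min Absolute Mean column name are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Trimmed copies of the question's data: only the columns needed here.
df1 = pd.DataFrame({
    'Category': ['BASE2_TREE_FILTER vs RETAIL 100', 'LR_TREE_FILTER vs RETAIL 100'],
    'Absolute Mean': [6.917727336706257, 5.352618468237218],
})
df2 = pd.DataFrame({
    'Category': ['BASE2_TREE_FILTER vs RETAIL 100', 'LR_TREE_FILTER vs RETAIL 100'],
    'Absolute Mean': [6.972378375288884, 6.366948207708872],
})

# Align rows by Category instead of by position (assumption: Category is unique).
merged = df1.merge(df2, on='Category', how='inner', suffixes=('_df1', '_df2'))

left = merged['Absolute Mean_df1']
right = merged['Absolute Mean_df2']

result = pd.DataFrame({
    'Category': merged['Category'],
    # Ties are labelled 'df2' because the comparison is strict.
    'Data from': np.where(left < right, 'df1', 'df2'),
    # Element-wise minimum of the two aligned columns.
    'Min Absolute Mean': np.minimum(left, right),
})
print(result)
```

With how='inner', categories that appear in only one DataFrame are dropped; how='outer' would keep them, but the comparison would then have to handle missing values.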
Inspired by the other answer:
import pandas as pd
df1 = pd.DataFrame({'Category': {0: 'BASE2_TREE_FILTER vs RETAIL 100', 1: 'LR_TREE_FILTER vs RETAIL 100'},
                    'Mean': {0: 4.859101849501094, 1: 3.349513603073975},
                    'Absolute Mean': {0: 6.917727336706257, 1: 5.352618468237218},
                    'Increase': {0: 13, 1: 13},
                    '%change(Increase)': {0: 9.059099374005655, 1: 6.693947747162456},
                    'Decrease': {0: 7, 1: 7},
                    '%change(Decrease)': {0: -2.940893553150234, 1: -2.861578378804634},
                    'unchanged': {0: 0, 1: 0}})
df2 = pd.DataFrame({'Category': {0: 'BASE2_TREE_FILTER vs RETAIL 100', 1: 'LR_TREE_FILTER vs RETAIL 100'},
                    'Mean': {0: 4.947988913441173, 1: 4.494044038470856},
                    'Absolute Mean': {0: 6.972378375288884, 1: 6.366948207708872},
                    'Increase': {0: 26, 1: 26},
                    '%change(Increase)': {0: 8.252561969120809, 1: 7.519148478124428},
                    'Decrease': {0: 9, 1: 9},
                    '%change(Decrease)': {0: -4.04877892369542, 1: -3.745808338476033},
                    'unchanged': {0: 1, 1: 1}})
# The next line is cut off in the source after the column lookup; the rest of the call is not recoverable.
df3 = df1.where(df1['Absolute Mean']
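
The DataFrame.where call in this answer is truncated, so its actual arguments are unknown. The sketch below shows one way such a call could be completed. It is not the answerer's code, and it assumes both DataFrames share the same index and columns. The names mask and cond and the trimmed two-column frames are introduced here for illustration only.

```python
import pandas as pd

# Trimmed copies of the question's data: only the columns needed here.
df1 = pd.DataFrame({
    'Category': ['BASE2_TREE_FILTER vs RETAIL 100', 'LR_TREE_FILTER vs RETAIL 100'],
    'Absolute Mean': [6.917727336706257, 5.352618468237218],
})
df2 = pd.DataFrame({
    'Category': ['BASE2_TREE_FILTER vs RETAIL 100', 'LR_TREE_FILTER vs RETAIL 100'],
    'Absolute Mean': [6.972378375288884, 6.366948207708872],
})

# True for each row where df1 has the strictly smaller Absolute Mean.
mask = df1['Absolute Mean'] < df2['Absolute Mean']

# Repeat the row mask across every column so the condition matches df1's shape.
cond = pd.DataFrame({col: mask for col in df1.columns})

# Keep df1's values where cond is True, otherwise take the values from df2.
df3 = df1.where(cond, df2)
print(df3)
```

Building the condition as a full boolean DataFrame avoids relying on how a one-dimensional condition is broadcast against a two-dimensional frame.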
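Comments:
Where are the DataFrames? These are dictionaries.
These are DataFrames. I posted them this way to make the example reproducible.
Are you looking for a comparison based on the whole Absolute Mean column, or a comparison per row?
The whole column, @GilPinsky.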
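
The comment thread indicates that the comparison wanted is on the whole column, not per row. A minimal sketch of that reading follows, assuming "lower" means the smaller average of the Absolute Mean column (the question does not say whether to use the average, the sum, or something else). The helper frame_with_lowest and the trimmed two-column frames are hypothetical and introduced here for illustration.

```python
import pandas as pd

# Trimmed copies of the question's data: only the columns needed here.
df1 = pd.DataFrame({
    'Category': ['BASE2_TREE_FILTER vs RETAIL 100', 'LR_TREE_FILTER vs RETAIL 100'],
    'Absolute Mean': [6.917727336706257, 5.352618468237218],
})
df2 = pd.DataFrame({
    'Category': ['BASE2_TREE_FILTER vs RETAIL 100', 'LR_TREE_FILTER vs RETAIL 100'],
    'Absolute Mean': [6.972378375288884, 6.366948207708872],
})


def frame_with_lowest(frames, column='Absolute Mean'):
    """Return (name, frame) for the frame whose column average is smallest.

    `frames` maps a label to a DataFrame. Averaging the column is an
    assumption about what "lower" means for a whole-column comparison.
    """
    name = min(frames, key=lambda key: frames[key][column].mean())
    return name, frames[name]


name, best = frame_with_lowest({'df1': df1, 'df2': df2})
print(name)
print(best)
```

Because each column is reduced to a single number before comparing, this works even when the two DataFrames have different numbers of rows. With the question's data it selects df1.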
DataFrame 1:
                          Category      Mean  Absolute Mean  Increase  %change(Increase)  Decrease  %change(Decrease)  unchanged
0  BASE2_TREE_FILTER vs RETAIL 100  4.859102       6.917727        13           9.059099         7          -2.940894          0
1     LR_TREE_FILTER vs RETAIL 100  3.349514       5.352618        13           6.693948         7          -2.861578          0
DataFrame 2:
                          Category      Mean  Absolute Mean  Increase  %change(Increase)  Decrease  %change(Decrease)  unchanged
0  BASE2_TREE_FILTER vs RETAIL 100  4.947989       6.972378        26           8.252562         9          -4.048779          1
1     LR_TREE_FILTER vs RETAIL 100  4.494044       6.366948        26           7.519148         9          -3.745808          1