Python: averaging specific columns across several DataFrames

python pandas dataframe

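I'm new to pandas. I have several DataFrames (dfs). Column 0 holds the IDs and columns 1-10 hold probabilities. I want to take the column-wise average of columns 1-10 across the dfs. The rows may be in a different order in each df.
Is there a better way than sorting each df by ID and then using the add/divide df functions? Thanks for your help.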
Thank you very much for your comments. To clarify, I need to average the two dfs element-wise (my example showed only one row of each df).
It looks like you need concat and mean.
I'm not sure which result you need, so I added both:
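dfs = [df1, df2]  # list of all DataFrames
# mean per row (.ix was removed from pandas, so .iloc is used instead)
print(pd.concat(dfs, ignore_index=True).iloc[:, 1:].mean(axis=1))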
0    1.000000
1    1.666667
2    1.333333
3    1.000000
4    0.666667
5    0.333333
dtype: float64

# mean per column
print(pd.concat(dfs, ignore_index=True).iloc[:, 1:].mean())
1    1.333333
2    0.666667
3    1.000000
dtype: float64


Or perhaps you need something different:
# use the ID column (0) as the index, then join the frames side by side
dfs = [df1.set_index(0), df2.set_index(0)]
print(pd.concat(dfs, ignore_index=True, axis=1))
       0  1  2  3  4  5
0                      
14254  1  1  1  1  1  1
25445  2  1  2  0  0  2
34555  3  1  0  1  0  0

# mean per row, i.e. per ID across all value columns of both frames
print(pd.concat(dfs, ignore_index=True, axis=1).mean(axis=1))
0
14254    1.000000
25445    1.166667
34555    0.833333
dtype: float64

# mean per column
print(pd.concat(dfs, ignore_index=True, axis=1).mean())
0    2.000000
1    1.000000
2    1.000000
3    0.666667
4    0.333333
5    1.000000
dtype: float64

Pandas aligns on the index for many operations (add, divide, etc.). If you set the ID as the index, you don't need to sort.
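The index-alignment point can be sketched as follows. The sample frames are rebuilt from the printed output in this thread, with the rows of df2 deliberately shuffled to mimic the "rows may be in a different order" case. That shuffling and the names a, b and elementwise_mean are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample frames rebuilt from the output shown in this thread.
# Column 0 is the ID; columns 1-3 stand in for the probability columns.
df1 = pd.DataFrame([[14254, 1, 1, 1],
                    [25445, 2, 1, 2],
                    [34555, 3, 1, 0]])
# Assumption: df2's rows are shuffled here to simulate a different row order.
df2 = pd.DataFrame([[34555, 1, 0, 0],
                    [14254, 1, 1, 1],
                    [25445, 0, 0, 2]])

# With the ID as the index, arithmetic matches rows by ID, not by position.
a = df1.set_index(0)
b = df2.set_index(0)
elementwise_mean = a.add(b).div(2)
print(elementwise_mean)
```

An ID that appears in only one frame would come out as NaN with this approach; add accepts a fill_value argument if that case matters.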
#list of all DataFrames
dfs = [df1, df2]
print (pd.concat(dfs, ignore_index=True))
       0  1  2  3
0  14254  1  1  1
1  25445  2  1  2
2  34555  3  1  0
3  14254  1  1  1
4  25445  0  0  2
5  34555  1  0  0

# select all columns except the first (.ix was removed from pandas; use .iloc)
print(pd.concat(dfs, ignore_index=True).iloc[:, 1:])
   1  2  3
0  1  1  1
1  2  1  2
2  3  1  0
3  1  1  1
4  0  0  2
5  1  0  0
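
If the goal is the element-wise average described in the question's clarification, one possible approach is to stack the frames and average the rows that share an ID. This is a sketch rather than part of the original answer: the sample frames are rebuilt from the concatenated output above, and stacked and mean_by_id are names chosen here.

```python
import pandas as pd

# Sample frames rebuilt from the concatenated output shown above.
df1 = pd.DataFrame([[14254, 1, 1, 1],
                    [25445, 2, 1, 2],
                    [34555, 3, 1, 0]])
df2 = pd.DataFrame([[14254, 1, 1, 1],
                    [25445, 0, 0, 2],
                    [34555, 1, 0, 0]])
dfs = [df1, df2]  # the same pattern works for any number of frames

# Stack all frames with the ID (column 0) as the index,
# then average every group of rows that shares an ID.
stacked = pd.concat([df.set_index(0) for df in dfs])
mean_by_id = stacked.groupby(level=0).mean()
print(mean_by_id)
```

Because the grouping is done on the ID values, the row order inside each frame does not matter and no explicit sort is needed.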
