Python: computing the median of a pd.DataFrame() index based on the values in its columns


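Let's look at a pd.DataFrame() object that stores the number of people who have had a stroke in the past, broken down by age and gender. To make this more concrete:

positive_by_gender.tail()
gives us:

gender  Female  Male
age
78           9    12
79          13     4
80          10     7
81           8     6
82           4     5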
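
For anyone who wants to follow along, the five rows printed above can be rebuilt by hand. This is only a sketch: the full frame is not shown in the post, so the data below covers just the tail() output, and the column labels Female and Male are taken from the answer's output.

```python
import pandas as pd

# Only the five rows visible in tail(); the rest of the frame is not shown in the post.
positive_by_gender = pd.DataFrame(
    {"Female": [9, 13, 10, 8, 4], "Male": [12, 4, 7, 6, 5]},
    index=pd.Index([78, 79, 80, 81, 82], name="age"),
)
# Label the column axis so the printed header matches the table above.
positive_by_gender.columns.name = "gender"

print(positive_by_gender.tail())
```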
To build the array the way you had in mind and get the median from it:

In [235]: df
Out[235]: 
     Female  Male
age              
78      9.0  12.0
79     13.0   4.0
80     10.0   7.0
81      8.0   6.0
82      4.0   5.0

In [236]: df = df.astype(int)

In [237]: df
Out[237]: 
     Female  Male
age              
78        9    12
79       13     4
80       10     7
81        8     6
82        4     5

In [238]: df = df.reset_index('age')

In [240]: df = df.melt(id_vars='age', var_name='gender', value_name='count')

In [241]: df
Out[241]: 
   age  gender  count
0   78  Female      9
1   79  Female     13
2   80  Female     10
3   81  Female      8
4   82  Female      4
5   78    Male     12
6   79    Male      4
7   80    Male      7
8   81    Male      6
9   82    Male      5

In [242]: df['age'] = df.apply(lambda s: [s['age']] * s['count'], axis=1)

In [243]: df
Out[243]: 
                                                 age  gender  count
0               [78, 78, 78, 78, 78, 78, 78, 78, 78]  Female      9
1  [79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 79, 7...  Female     13
2           [80, 80, 80, 80, 80, 80, 80, 80, 80, 80]  Female     10
3                   [81, 81, 81, 81, 81, 81, 81, 81]  Female      8
4                                   [82, 82, 82, 82]  Female      4
5   [78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78]    Male     12
6                                   [79, 79, 79, 79]    Male      4
7                       [80, 80, 80, 80, 80, 80, 80]    Male      7
8                           [81, 81, 81, 81, 81, 81]    Male      6
9                               [82, 82, 82, 82, 82]    Male      5

In [245]: df = df.explode('age')
In [249]: df['age'] = df['age'].astype(int)

In [251]: df
Out[251]: 
    age  gender  count
0    78  Female      9
0    78  Female      9
0    78  Female      9
0    78  Female      9
0    78  Female      9
..  ...     ...    ...
9    82    Male      5
9    82    Male      5
9    82    Male      5
9    82    Male      5
9    82    Male      5

[78 rows x 3 columns]

In [250]: df.groupby('gender')['age'].median()
Out[250]: 
gender
Female    79.5
Male      80.0
Name: age, dtype: float64
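
The same medians can probably be reached with fewer steps. The sketch below is not part of the answer above; it shows two alternatives on the same five rows. The helper names median_age_by_repeat and median_age_by_cumsum are made up for illustration, and both assume the counts are non-negative integers with at least one non-zero count per column (cast with astype(int) first if the frame holds floats).

```python
import numpy as np
import pandas as pd

# Same five rows as in the session above, already cast to int.
df = pd.DataFrame(
    {"Female": [9, 13, 10, 8, 4], "Male": [12, 4, 7, 6, 5]},
    index=pd.Index([78, 79, 80, 81, 82], name="age"),
)

def median_age_by_repeat(counts: pd.Series) -> float:
    """Repeat each age `count` times, then take the ordinary median."""
    ages = np.repeat(counts.index.to_numpy(), counts.to_numpy())
    return float(np.median(ages))

def median_age_by_cumsum(counts: pd.Series) -> float:
    """Find the middle position(s) in the cumulative counts without expanding."""
    counts = counts.sort_index()
    cum = counts.cumsum().to_numpy()
    n = int(cum[-1])
    # 0-based positions of the middle element(s) in the expanded, sorted data;
    # they coincide when n is odd.
    middle = [(n - 1) // 2, n // 2]
    # side="right" returns the first age group whose cumulative count exceeds the position.
    idx = np.searchsorted(cum, middle, side="right")
    return float(counts.index.to_numpy()[idx].mean())

# DataFrame.apply passes each column (indexed by age) to the helper.
medians_repeat = df.apply(median_age_by_repeat)
medians_cumsum = df.apply(median_age_by_cumsum)
print(medians_repeat)
print(medians_cumsum)
```

The np.repeat version does the same expansion as apply plus explode, just in one call. The cumulative-sum version avoids materialising one row per person, which may matter when the counts are large.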

Does this answer your question?
Just wanted to see what this is. Yes, it is helpful, thanks a lot, just like the full solution below.