Python: How do I count how many males/females there are for each Title? (tags: python, pandas, dataframe, pandas-groupby)

I'm new to data science, and I'd like to count how many females and males there are for each Title.

I tried the following code:

'''

What I get is:

  Title   Age     Sex
0    Mr  22.0    male
1   Mrs  38.0  female
2  Miss  26.0  female
3   Mrs  35.0  female
4    Mr  35.0    male
Then I tried to add #Male and #Female columns:

df = pd.DataFrame()
df = newdf[['Age','Title']].groupby('Title').mean().sort_values(by='Age',ascending=False)
df['#People'] = newdf['Title'].value_counts()
df['Male'] = newdf['Title'].sum(newdf['Sex']=='male')
df['Female'] = newdf['Title'].sum(newdf['Sex']=='female')
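The error message I get: TypeError: 'Series' objects are mutable, thus they cannot be hashed

I'd expect these columns: Title, Age (mean), #People, #Male, #Female. In other words, I'd like to know how many of the people under each Title are male and how many are female.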

P.S. Without these lines:

df['Male'] = newdf['Title'].sum(newdf['Sex']=='male')
df['Female'] = newdf['Title'].sum(newdf['Sex']=='female')

everything works fine, and I get:

             Age  #People
Title
Capt   70.000000        1
Col    54.000000        4
Sir    49.000000        1
Major  48.500000        2
Lady   48.000000        1
Dr     43.571429        7
....

but without the #Male and #Female columns.

To aggregate mean and size, use agg on the groupby; for the new columns, build the counts with pd.crosstab and add them with join:

# mean Age and number of rows per Title, sorted by mean Age
df1 = (df.groupby('Title')['Age']
         .agg([('Age','mean'),('#People','size')])
         .sort_values(by='Age',ascending=False))

# number of rows per Title and Sex; the '_avg' suffix is only a column label, the values are counts
df2 = pd.crosstab(df['Title'], df['Sex']).add_suffix('_avg')

df = df1.join(df2)
print(df)
        Age  #People  female_avg  male_avg
Title
Mrs    36.5        2           2         0
Mr     28.5        2           0         2
Miss   26.0        1           1         0

Note: this line is unnecessary:

df = pd.DataFrame()

And this:

newdf[['Age','Title']].groupby('Title')

should be rewritten as:

newdf.groupby('Title')['Age']

Comments:

I didn't get one thing: why is the ['Age'] there, and what does it do in df.groupby('Title')['Age']?

@Doctror It looks odd, but it gives the same result:

newdf[['Age','Title']].groupby('Title').mean()

vs.

newdf.groupby('Title')['Age'].mean()
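
The point raised in the comments can be checked directly on the five sample rows from the question. The sketch below first compares the two groupby spellings, then shows one possible alternative to crosstab plus join: named aggregation over 0/1 helper columns. That alternative is an assumption of this example, not the answer's method, and the names same_means and summary are made up for illustration.

```python
import pandas as pd

# The five sample rows shown in the question.
newdf = pd.DataFrame({
    "Title": ["Mr", "Mrs", "Miss", "Mrs", "Mr"],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Sex": ["male", "female", "female", "female", "male"],
})

# Both spellings compute the mean Age per Title. The first returns a
# one-column DataFrame, the second returns a Series.
as_frame = newdf[["Age", "Title"]].groupby("Title").mean()
as_series = newdf.groupby("Title")["Age"].mean()
same_means = as_frame["Age"].equals(as_series)

# Hypothetical alternative: 0/1 helper columns summed per Title.
summary = (
    newdf.assign(
        Male=newdf["Sex"].eq("male").astype(int),
        Female=newdf["Sex"].eq("female").astype(int),
    )
    .groupby("Title")
    .agg(**{
        "Age": ("Age", "mean"),        # mean age per Title
        "#People": ("Sex", "size"),    # number of rows per Title
        "Male": ("Male", "sum"),       # count of males per Title
        "Female": ("Female", "sum"),   # count of females per Title
    })
    .sort_values(by="Age", ascending=False)
)

print(same_means)
print(summary)
```

The dictionary unpacking is only needed because #People is not a valid Python keyword argument name; with plain column names the pairs could be passed as ordinary keyword arguments.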