Python: counting multiple values in a groupby object

python pandas dataframe

I want to count multiple values on a GroupBy object, where each cell holds a list of values.

I have the following dataframe:


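|   | Record the respondent’s sex | 7. What do you use the phone for?           |
|---|-----------------------------|---------------------------------------------|
| 0 | Male                        | sending texts;calls;receiving sending texts |
| 1 | Female                      | sending texts;calls;WhatsApp;Facebook       |
| 2 | Male                        | sending texts;calls;receiving texts         |
| 3 | Female                      | sending texts;calls                         |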

I want to count each value in the column `7. What do you use the phone for?` after grouping on `Record the respondent’s sex`.

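When each cell holds only a single value, I can do it like this:

grouped = df.groupby(['Record the respondent’s sex'], sort=True)
question_counts = grouped['2. Are you a teacher, caregiver, or young person?'].value_counts(normalize=False, sort=True)
question_data = [
    {'2. Are you a teacher, caregiver, or young person?': question, 'Record the respondent’s sex': group, 'count': count * 100}
    for (group, question), count in dict(question_counts).items()]
df_question = pd.DataFrame(question_data)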

That gives me a table exactly like this:

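| 7. What do you use the phone for? | Record the respondent’s sex | count |
|-----------------------------------|-----------------------------|-------|
| sending texts                     | Male                        | 2     |
| calls                             | Male                        | 2     |
| receiving texts                   | Male                        | 2     |
| sending texts                     | Female                      | 2     |
| calls                             | Female                      | 2     |
| WhatsApp                          | Female                      | 1     |
| Facebook                          | Female                      | 1     |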

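It would be great if I could get this to work with multiple values per cell. `value_counts()` does not work on lists with multiple values; it throws `TypeError: unhashable type: 'list'`. A related question shows how to handle this in various ways, but I can't seem to get any of them to work on a GroupBy object.

I didn't know about pd.melt(), and it seems to do the job very well. Thanks!

In hindsight, simply creating an extra row for each of the multiple values (which is what the question MaxU marked this as a duplicate of does) is easier and probably faster.

Copying/exploding the multiple values into rows seems to be the simplest and fastest way to do this, although the accepted answer below shows it can also be done without that step.
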
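The row-duplication approach mentioned in the comments can be sketched roughly as follows. This assumes pandas 0.25 or later, where `Series.explode` is available, and the short column names `sex` and `use` are hypothetical stand-ins for the survey's full column labels.

```python
import pandas as pd

# Hypothetical short column names stand in for the survey's long labels.
df = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female"],
    "use": [
        "sending texts;calls;receiving sending texts",
        "sending texts;calls;WhatsApp;Facebook",
        "sending texts;calls;receiving texts",
        "sending texts;calls",
    ],
})

# Turn each ';'-separated string into a list, then give every list item its own row.
exploded = df.assign(use=df["use"].str.split(";")).explode("use")

# Each cell now holds a single hashable string, so an ordinary grouped count works.
counts = exploded.groupby(["sex", "use"]).size().rename("count").reset_index()
print(counts)
```

Because every row holds one plain string after the explode step, the `unhashable type: 'list'` error no longer arises.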
import pandas as pd

# Initialize sample data.
df = pd.DataFrame({'Record the respondent’s sex': ['Male', 'Female'] * 2, 
                   '7. What do you use the phone for?': [
                       "sending texts;calls;receiving sending texts",
                       "sending texts;calls;WhatsApp;Facebook",
                       "sending texts;calls;receiving texts",
                       "sending texts;calls"
                   ]})

# Split each cell on ';' and spread the items across columns (shorter rows are
# padded with NaN), then melt back to one item per row.
df2 = pd.melt(
    pd.concat([df['Record the respondent’s sex'],
               df.loc[:, "7. What do you use the phone for?"].apply(
                   lambda cell: cell.split(';')).apply(pd.Series)], axis=1),
    id_vars='Record the respondent’s sex')[['Record the respondent’s sex', 'value']]

# Group on gender, count the values, and rename columns.
# Naming the counts 'count' first avoids a clash with the 'value' index level
# when calling reset_index().
result = df2.groupby('Record the respondent’s sex')['value'].value_counts().rename('count').reset_index()
result.columns = ['Record the respondent’s sex', '7. What do you use the phone for?', 'count']

# Reorder columns.
>>> result[['7. What do you use the phone for?', 'Record the respondent’s sex', 'count']]
  7. What do you use the phone for? Record the respondent’s sex  count
0                             calls                      Female      2
1                     sending texts                      Female      2
2                          Facebook                      Female      1
3                          WhatsApp                      Female      1
4                             calls                        Male      2
5                     sending texts                        Male      2
6           receiving sending texts                        Male      1
7                   receiving texts                        Male      1
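
One detail of the split-and-melt approach is easier to see in isolation: rows with fewer items are padded with missing values when the items are spread across columns, and `value_counts()` skips missing values by default, so the padding never reaches the counts. A minimal sketch, using hypothetical short column names and `str.split(..., expand=True)` in place of `apply(pd.Series)`:

```python
import pandas as pd

# Hypothetical two-row sample; 'sex' and 'use' are placeholder column names.
df = pd.DataFrame({
    "sex": ["Male", "Female"],
    "use": ["sending texts;calls;receiving texts", "sending texts;calls"],
})

# Spread the ';'-separated items across columns 0, 1, 2; short rows are padded.
wide = pd.concat([df["sex"], df["use"].str.split(";", expand=True)], axis=1)

# Melt back to one item per row; the padding becomes missing 'value' entries.
long_df = pd.melt(wide, id_vars="sex")[["sex", "value"]]
n_padding = int(long_df["value"].isna().sum())

# value_counts() ignores missing values by default (dropna=True).
counts = long_df.groupby("sex")["value"].value_counts()
print(n_padding, counts.sum())
```

If the padding should be removed explicitly rather than relying on the default, calling `long_df.dropna(subset=["value"])` before grouping has the same effect.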