Python: Can't find column names when using the groupby function on a DataFrame

python pandas dataframe

I have a pandas DataFrame that looks like this:

   Country    City        POI       Type
0   NL       Amsterdam    KFC       restaurant
1   NL       Amsterdam    KFC       cafe
2   NL       Arnhem     McDonalds   fast food
3   NL       Arnhem     McDonalds   ice cream

I need to group it on the Type column so that there are no duplicates in any of the other columns. In other words, I need output like this:

   Country    City        POI       Type
0   NL       Amsterdam    KFC       restaurant, cafe
1   NL       Arnhem     McDonalds   fast food, ice cream

I tried using the groupby function, but all the column names disappear and shape reports 0 columns. Maybe there is a better way to group these values.

Here is some sample code:

import pandas as pd
import numpy as np
data = np.array([['','Country','City', 'POI', 'Type'],
            [0,"NL","Amsterdam", 'KFC', 'cafe'],
            [1,"NL","Amsterdam", 'KFC', 'restaurant'],
            [2,"NL","Arnhem", 'McDonalds', 'fast-food'],
            [3,"NL","Arnhem", 'McDonalds', 'ice cream']]
           )

initial_df = pd.DataFrame(data=data[1:,1:],
              index=data[1:,0],
              columns=data[0,1:])

final_df = initial_df .groupby( [ "Country", "City", "POI", "Type"] ).count()

print(list(final_df.columns.values))
print(final_df.shape)

You can group by the other columns and combine the Type values with str.join:

res = df.groupby(['Country', 'City', 'POI'])['Type'].apply(', '.join).reset_index()

print(res)

  Country       City        POI                Type
0      NL  Amsterdam        KFC    restaurant, cafe
1      NL     Arnhem  McDonalds  fastfood, icecream
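For a self-contained check, here is a minimal sketch that rebuilds the sample rows from the question as a plain dict (an assumption made for brevity, not the asker's NumPy construction) and uses agg in place of apply, which should behave the same way for this case. The names initial_df and res follow the snippets above.

```python
import pandas as pd

# Same rows as the question's sample code, built directly from a dict.
initial_df = pd.DataFrame({
    "Country": ["NL", "NL", "NL", "NL"],
    "City": ["Amsterdam", "Amsterdam", "Arnhem", "Arnhem"],
    "POI": ["KFC", "KFC", "McDonalds", "McDonalds"],
    "Type": ["cafe", "restaurant", "fast-food", "ice cream"],
})

# Group by every column except "Type", then join the Type strings of
# each group into one comma-separated string.
res = (
    initial_df
    .groupby(["Country", "City", "POI"])["Type"]
    .agg(", ".join)
    .reset_index()  # turn the group keys back into ordinary columns
)

print(res)
```

Because "Type" is left out of the grouping keys, it is the one column that remains to be aggregated, and reset_index() restores Country, City and POI as regular columns.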

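Your final_df has no columns because you asked pandas to group by all of the columns: every column becomes part of the index, so nothing is left for count() to count. If you only want to group by the column "Type", do this: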

grouped = initial_df.groupby(["Type"])

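Then apply the count() function to the grouped DataFrame. This counts the non-NaN elements in each column for each group. What you actually want, though, is to access each group. You can do that like this: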

for name, group in grouped:
   print(name) # this prints the Type of your group
   print(group) # this prints the dataframe corresponding to your Type
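To see why the column names vanish, the following sketch (assuming the same sample rows, rebuilt from a dict for brevity) compares grouping by every column with grouping by fewer keys. The variable names all_keys, by_type and sizes are illustrative only.

```python
import pandas as pd

# Same rows as the question's sample code, built directly from a dict.
initial_df = pd.DataFrame({
    "Country": ["NL", "NL", "NL", "NL"],
    "City": ["Amsterdam", "Amsterdam", "Arnhem", "Arnhem"],
    "POI": ["KFC", "KFC", "McDonalds", "McDonalds"],
    "Type": ["cafe", "restaurant", "fast-food", "ice cream"],
})

# Grouping by every column moves all four columns into the row index,
# so count() has no remaining columns to report on.
all_keys = initial_df.groupby(["Country", "City", "POI", "Type"]).count()
print(list(all_keys.columns))      # no column names left
print(list(all_keys.index.names))  # the columns now live in the index

# Grouping by "Type" alone leaves Country, City and POI to be counted.
by_type = initial_df.groupby("Type").count()
print(by_type)

# size() counts rows per group and does not depend on leftover columns.
sizes = initial_df.groupby(["Country", "City", "POI"]).size()
print(sizes)
```

If all you need is the number of rows per group, size() is usually the simpler choice, since it works regardless of how many columns are used as keys.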

Hope this helps.

Thanks jpp, this works.