Python: create aggregate columns from another DataFrame
I have a DataFrame that looks like this:
df_detail =
car_brand car_type
0 Toyota Sedan
1 Toyota Truck
2 Honda Truck
3 Mazda Sedan
4 Mazda Convertible
I want to create a summary DataFrame like this:
df_summary=
ID car_brand count_Sedan count_Truck count_Convertible
0 Toyota 1 1 0
1 Honda 0 1 0
2 Mazda 1 0 1
Is there a way to create the count columns with pandas?
I tried the following:
import pandas as pd
d = {'car_brand':['Toyota','Toyota','Honda','Mazda','Mazda'],'car_type':['Sedan','Truck','Truck','Sedan','Convertible']}
df_detail = pd.DataFrame(data=d)
df_summary = pd.DataFrame({'car_brand':[]})
df_summary['car_brand'] = df_detail['car_brand'].unique()
df_summary['count_Sedan']=df_detail[((df_detail['car_brand']==df_summary['car_brand']) &
(df_detail['car_type']=='Sedan'))].count()
I get an error:
ValueError: Can only compare identically-labeled Series objects
Answer:
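For context, the ValueError in the question comes from comparing two Series whose indexes differ: df_detail['car_brand'] has labels 0 to 4, while df_summary['car_brand'] has labels 0 to 2. The sketch below reproduces that with the question's data; the variable names error_message and toyota_sedans are only illustrative.

```python
import pandas as pd

df_detail = pd.DataFrame({
    "car_brand": ["Toyota", "Toyota", "Honda", "Mazda", "Mazda"],
    "car_type": ["Sedan", "Truck", "Truck", "Sedan", "Convertible"],
})
# Three unique brands, so this Series is labelled 0..2 instead of 0..4
summary_brands = pd.Series(df_detail["car_brand"].unique())

try:
    # Element-wise == between two Series requires identical index labels
    df_detail["car_brand"] == summary_brands
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Comparing against a scalar broadcasts over the rows, so this works
toyota_sedans = (
    (df_detail["car_brand"] == "Toyota") & (df_detail["car_type"] == "Sedan")
).sum()
```

Counting one brand and one type at a time this way would work, but it needs a loop over every combination, which is why a reshaping approach is shorter.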
df_detail.set_index('car_brand')['car_type'].str.get_dummies().groupby(level=0, sort=False).sum().add_prefix('count_').reset_index()
Try the following, using .str.get_dummies:
df_detail.set_index('car_brand')['car_type'].str.get_dummies()\
.groupby(level=0, sort=False).sum().add_prefix('count_')
Output:
count_Convertible count_Sedan count_Truck
car_brand
Toyota 0 1 1
Honda 0 0 1
Mazda 1 1 0
Then add .reset_index() to get an integer index:
df_detail.set_index('car_brand')['car_type'].str.get_dummies()\
.groupby(level=0, sort=False).sum().add_prefix('count_').reset_index()
Output:
car_brand count_Convertible count_Sedan count_Truck
0 Toyota 0 1 1
1 Honda 0 0 1
2 Mazda 1 1 0
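As an alternative sketch (not part of the original answer), pd.crosstab can build the same counts directly. The reindex step is an assumption about the desired layout: it restores the brand and type order in which they first appear in df_detail, which matches the column order asked for in the question.

```python
import pandas as pd

df_detail = pd.DataFrame({
    "car_brand": ["Toyota", "Toyota", "Honda", "Mazda", "Mazda"],
    "car_type": ["Sedan", "Truck", "Truck", "Sedan", "Convertible"],
})

# Count every (brand, type) pair; combinations that never occur become 0
counts = pd.crosstab(df_detail["car_brand"], df_detail["car_type"])

# crosstab sorts both axes alphabetically, so restore first-appearance order
counts = counts.reindex(
    index=df_detail["car_brand"].unique(),
    columns=df_detail["car_type"].unique(),
)

df_summary = (
    counts
    .rename_axis(index="car_brand", columns=None)  # name the rows, drop the "car_type" label
    .add_prefix("count_")
    .reset_index()  # move car_brand into a column, leaving an integer index
)
print(df_summary)
```

Dropping the reindex call leaves both axes in alphabetical order, which matches the column order of the get_dummies output above.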
Thanks. Is there a way to have the index as integers instead of the car brand?