Python Matplotlib: How do I create a stacked bar chart from a DataFrame?
Starting from the following:
import pandas as pd

df = pd.DataFrame({'Item': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
                   'Name': ['Tom', 'John', 'Paul', 'Tom', 'Frank', 'Tom', 'John', 'Richard', 'James'],
                   'Total': [3, 3, 3, 2, 2, 4, 4, 4, 4]})
print(df)
  Item     Name  Total
0    A      Tom      3
1    A     John      3
2    A     Paul      3
3    B      Tom      2
4    B    Frank      2
5    C      Tom      4
6    C     John      4
7    C  Richard      4
8    C    James      4
# self-merge (M:N) on column Item: pairs every name with every name sharing the same item
df1 = pd.merge(df, df, on=['Item'])
# drop self-pairs, i.e. rows where Name_x == Name_y
df1 = df1[df1['Name_x'] != df1['Name_y']]
# print(df1)
# collect the partner names of each person into a list
df1 = df1.groupby('Name_x')['Name_y'].apply(lambda x: x.tolist()).reset_index()
print(df1)
    Name_x                                      Name_y
0    Frank                                       [Tom]
1    James                       [Tom, John, Richard]
2     John          [Tom, Paul, Tom, Richard, James]
3     Paul                                [Tom, John]
4  Richard                         [Tom, John, James]
5      Tom  [John, Paul, Frank, John, Richard, James]
I have a DataFrame that looks like this:
print(df)
Name People Times
0 Frank [Tom] [1]
1 James [John, Richard, Tom] [1, 1, 1]
2 John [James, Paul, Richard, Tom] [1, 1, 1, 2]
3 Paul [John, Tom] [1, 1]
4 Richard [James, John, Tom] [1, 1, 1]
5 Tom [Frank, James, John, Paul, Richard] [1, 1, 2, 1, 1]
I want to create a stacked bar chart for each Name, with People as the bars and Times as the values.
I tried something like this:
sub_df = df.groupby(['Name','People'])['Times'].sum().unstack()
sub_df.plot(kind='bar',stacked=True)
But it returns:
TypeError: unhashable type: 'numpy.ndarray'
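For context, groupby keys have to be hashable, and the People column holds a list (or array) in every row. The sketch below uses only plain Python and a hypothetical helper named is_hashable to show the difference between a list and a tuple. It illustrates the general rule and is not taken from pandas internals.

```python
# A list of names, like one cell of the People column.
names = ['John', 'Richard', 'Tom']


def is_hashable(value):
    """Return True if hash(value) succeeds (hypothetical helper for illustration)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


# Lists are mutable, so Python refuses to hash them.
list_ok = is_hashable(names)

# A tuple of strings is immutable and hashable.
tuple_ok = is_hashable(tuple(names))

# Because the tuple is hashable it can serve as a dictionary key,
# which is the same property a grouping key needs.
counts = {tuple(names): 3}
```

This is why converting each list to a tuple lets the column act as a grouping key.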
After groupby you have to use apply, the flexible counterpart of agg:
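# turn each list into a tuple so it is hashable and can be used as a groupby key
df1['People'] = df1['Name_y'].apply(tuple)
# count how often each distinct name occurs in the list
df1['Times'] = df1['Name_y'].apply(lambda x: [x.count(name) for name in set(x)])
# one entry per (Name_x, People) pair; the value is the total of the counts
s = df1.groupby(['Name_x', 'People']).apply(lambda x: sum(x.iloc[0]['Times']))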
Then you get the following result:
Name_x People
Frank (Tom,) 1
James (Tom, John, Richard) 3
John (Tom, Paul, Tom, Richard, James) 5
Paul (Tom, John) 2
Richard (Tom, John, James) 3
Tom (John, Paul, Frank, John, Richard, James) 6
dtype: int64
You can then plot it however you like:
s.plot(kind='bar', stacked=True)
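Note that s holds a single total per name, so the plot above draws one segment per bar. If the goal is one coloured segment per person, a possible alternative (a sketch, not the method from the answer) is to reshape the list columns into a wide table first. The frame below is retyped from the question, with the third column assumed to be named Times. The names long_df, wide and plot_stacked are hypothetical.

```python
import pandas as pd

# The DataFrame from the question, retyped by hand (column name 'Times' is assumed).
df = pd.DataFrame({
    'Name': ['Frank', 'James', 'John', 'Paul', 'Richard', 'Tom'],
    'People': [['Tom'],
               ['John', 'Richard', 'Tom'],
               ['James', 'Paul', 'Richard', 'Tom'],
               ['John', 'Tom'],
               ['James', 'John', 'Tom'],
               ['Frank', 'James', 'John', 'Paul', 'Richard']],
    'Times': [[1], [1, 1, 1], [1, 1, 1, 2], [1, 1], [1, 1, 1], [1, 1, 2, 1, 1]],
})

# Long form: one row per (Name, Person) pair, so no cell holds a list any more.
long_df = pd.DataFrame(
    [(name, person, times)
     for name, people, counts in zip(df['Name'], df['People'], df['Times'])
     for person, times in zip(people, counts)],
    columns=['Name', 'Person', 'Times'],
)

# Wide form: rows are names, columns are people, missing pairs become 0.
wide = long_df.groupby(['Name', 'Person'])['Times'].sum().unstack(fill_value=0)


def plot_stacked(table):
    """Draw one stacked bar per row of `table` (requires matplotlib)."""
    return table.plot(kind='bar', stacked=True)
```

Calling plot_stacked(wide) followed by matplotlib.pyplot.show() should then draw one bar per Name with one segment per person. The groupby and unstack step is the same one the question attempted. It works here because every grouping key is now a plain string.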
Can you rewrite your example so that it is easy to execute?
In what way?
@emax Write it so that we can copy/paste your code into Python and generate your DataFrame df, so that we don't have to create it by hand.
@Suever I changed the description. Is that OK?
@roadrunner66 Is that OK?