Python: adding new rows to subsets of a DataFrame with Pandas
I have the following DataFrame:
Customer ProductID Count
John 1 25
John 6 50
Mary 2 15
Mary 3 35
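To make the question reproducible, the table above could be built as follows. This is a minimal sketch; the variable name `df` matches the code later in the question, and the integer dtypes are an assumption.

```python
import pandas as pd

# Sample data copied from the table above
df = pd.DataFrame({
    'Customer': ['John', 'John', 'Mary', 'Mary'],
    'ProductID': [1, 6, 2, 3],
    'Count': [25, 50, 15, 35],
})
print(df)
```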
I would like my output to look like this:
Customer ProductID Count
John 1 25
John 2 0
John 3 0
John 6 50
Mary 1 0
Mary 2 15
Mary 3 35
Mary 6 0
What I tried was to identify the unique ProductID values in the DataFrame:
unique_ID = df['ProductID'].unique()
print (unique_ID)
[1 6 2 3]
Since ProductIDs 2 and 3 do not exist for customer John, I split the DataFrame by customer name:
df1 = df[df['Customer']=='John']
df2 = df[df['Customer']=='Mary']
Printing df1:
Customer ProductID Count
John 1 25
John 6 50
Printing df2:
Customer ProductID Count
Mary 2 15
Mary 3 35
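I want to add ProductIDs 2 and 3 for John and ProductIDs 1 and 6 for Mary, and set Count to 0 for these ProductIDs, as shown above.
I think you can reshape df to a wide layout, which produces NaN values for the missing Customer/ProductID pairs; these are then filled with 0, and finally df has to be brought back to its original long shape.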
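One possible sketch of this reshape-and-fill idea is shown below. The text does not name the methods, so the choice of `pivot`, `fillna`, `stack` and `reset_index` is an assumption, as are the names `wide` and `result`.

```python
import pandas as pd

df = pd.DataFrame({
    'Customer': ['John', 'John', 'Mary', 'Mary'],
    'ProductID': [1, 6, 2, 3],
    'Count': [25, 50, 15, 35],
})

# Wide layout: one row per customer, one column per ProductID.
# Customer/ProductID pairs that are absent in df show up as NaN.
wide = df.pivot(index='Customer', columns='ProductID', values='Count')

# Fill the gaps with 0, restore the integer dtype, then return to long form.
result = (wide.fillna(0)
              .astype(int)
              .stack()
              .reset_index(name='Count'))
print(result)
```

This relies on every Customer/ProductID pair being unique in df; `pivot` raises an error on duplicate pairs.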
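Another solution: first get the unique values of both columns (sorting the ProductID values), then build a MultiIndex from their product with MultiIndex.from_product and reindex df by this MultiIndex: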
a = df.Customer.unique()
b = df.ProductID.sort_values().unique()
print (a)
['John' 'Mary']
print (b)
[1 2 3 6]
m = pd.MultiIndex.from_product([a,b])
print (m)
MultiIndex(levels=[['John', 'Mary'], [1, 2, 3, 6]],
labels=[[0, 0, 0, 0, 1, 1, 1, 1], [0, 1, 2, 3, 0, 1, 2, 3]])
df1 = df.set_index(['Customer','ProductID']).reindex(m, fill_value=0).reset_index()
df1.columns = ['Customer','ProductID','Count']
print (df1)
Customer ProductID Count
0 John 1 25
1 John 2 0
2 John 3 0
3 John 6 50
4 Mary 1 0
5 Mary 2 15
6 Mary 3 35
7 Mary 6 0
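As a small variation on the code above, the index levels can be named when the MultiIndex is created, so that `reset_index` restores the column names and the manual `df1.columns = ...` assignment is not needed. This is a sketch under that assumption; `full_index` and `out` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Customer': ['John', 'John', 'Mary', 'Mary'],
    'ProductID': [1, 6, 2, 3],
    'Count': [25, 50, 15, 35],
})

customers = df['Customer'].unique()
products = df['ProductID'].sort_values().unique()

# Every Customer/ProductID combination, with named levels
full_index = pd.MultiIndex.from_product(
    [customers, products], names=['Customer', 'ProductID'])

# Rows missing from df are created by reindex and get Count = 0
out = (df.set_index(['Customer', 'ProductID'])
         .reindex(full_index, fill_value=0)
         .reset_index())
print(out)
```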