Split a row into multiple rows in pandas (Python)

I have a DataFrame in the following (simplified) format:

a  b  43
a  c  22

I would like to split it up as follows:

a  b  20
a  b  20
a  b  1
a  b  1
a  b  1
a  c  20
a  c  1
a  c  1

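That is, each original row becomes as many rows of 20 as the number divides by 20 (the quotient), followed by as many rows of 1 as the remainder. I have a solution that basically iterates over the rows, fills a dictionary, and then converts the dictionary back to a DataFrame, but I was wondering whether there is a better solution.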

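You can first use floor division together with modulo, then create a new DataFrame via the constructor with numpy.repeat. Finally, column C needs numpy.concatenate with a list comprehension: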
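import numpy as np
import pandas as pd

# quotient (number of 20s) and remainder (number of 1s) for each row
a,b = df.C // 20, df.C % 20
#print (a, b)

cols = ['A','B']
# repeat each row's A and B values a + b times
df = pd.DataFrame({x: np.repeat(df[x], a + b) for x in cols})
# build C: a twenties followed by b ones for each original row
df['C'] = np.concatenate([[20] * x + [1] * y for x,y in zip(a,b)])
print (df)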
   A  B   C
0  a  b  20
0  a  b  20
0  a  b   1
0  a  b   1
0  a  b   1
1  a  c  20
1  a  c   1
1  a  c   1
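
For readers who want to run the idea end to end, here is a minimal self-contained sketch. The helper name `split_rows` and its `col` and `unit` parameters are hypothetical, introduced only for illustration. It repeats rows with `Index.repeat` instead of rebuilding the frame column by column, so it is a variation on the answer above rather than its exact code. It assumes non-negative integers in the column and a unique index.

```python
import numpy as np
import pandas as pd


def split_rows(df, col="C", unit=20):
    """Expand each row into (value // unit) rows of `unit` plus (value % unit) rows of 1.

    Hypothetical helper for illustration; assumes non-negative integers
    in `col` and a unique index on `df`.
    """
    q = (df[col] // unit).to_numpy()   # how many `unit` rows per original row
    r = (df[col] % unit).to_numpy()    # how many rows of 1 per original row
    # Repeat every original row q + r times, keeping the original index labels.
    out = df.loc[df.index.repeat(q + r)].copy()
    # New values per original row: `unit` repeated q times, then 1 repeated r times.
    pieces = [np.repeat([unit, 1], [x, y]) for x, y in zip(q, r)]
    out[col] = np.concatenate(pieces) if pieces else np.array([], dtype=int)
    return out


df = pd.DataFrame({"A": ["a", "a"], "B": ["b", "c"], "C": [43, 22]})
result = split_rows(df)
print(result)
```

A quick sanity check is that the new values, summed per original index label, give back the original numbers (43 and 22 here).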

Setup

Consider the DataFrame df:

import numpy as np
import pandas as pd
df = pd.DataFrame(dict(A=['a', 'a'], B=['b', 'c'], C=[43, 22]))
df

   A  B   C
0  a  b  43
1  a  c  22


Use np.divmod and repeat:
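m = np.array([20, 1])
# (quotient, remainder) of C divided by 20, one pair per row
dm = list(zip(*np.divmod(df.C.values, m[0])))
# [(2, 3), (1, 2)]

# total number of output rows for each original row
rep = [sum(x) for x in dm]
# for each row: 20 repeated quotient times, then 1 repeated remainder times
new = np.concatenate([m.repeat(x) for x in dm])

# repeat the original rows and overwrite C with the new values
df.loc[df.index.repeat(rep)].assign(C=new)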

   A  B   C
0  a  b  20
0  a  b  20
0  a  b   1
0  a  b   1
0  a  b   1
1  a  c  20
1  a  c   1
1  a  c   1
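
To make the intermediate values easier to follow, the sketch below walks through the same steps on the sample numbers using plain NumPy arrays. The variable names `values`, `quotients`, and `remainders` are introduced here for illustration only.

```python
import numpy as np

values = np.array([43, 22])   # the C column from the sample frame
m = np.array([20, 1])

# np.divmod returns the floor quotients and the remainders as two arrays.
quotients, remainders = np.divmod(values, m[0])
dm = list(zip(quotients.tolist(), remainders.tolist()))

rep = [sum(x) for x in dm]                        # rows needed per original row
new = np.concatenate([m.repeat(x) for x in dm])   # 20s first, then 1s, per row

print(dm)            # [(2, 3), (1, 2)]
print(rep)           # [5, 3]
print(new.tolist())  # [20, 20, 1, 1, 1, 20, 1, 1]
```

Each pair in `dm` is used twice: its sum gives the number of times the original row is repeated, and the pair itself gives the repeat counts for the values 20 and 1.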

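I get a ValueError when trying this, at the np.repeat line: operands could not be broadcast together with shape (2,) (3810,)

Is the problem with the sample data, or with the real data? With the real data, did you change only the data in the solution?

The problem is with the sample data. When I try your solution, I get the error above.

Are you using df = pd.DataFrame(dict(A=['a', 'a'], B=['b', 'c'], C=[43, 22]))?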
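
The thread does not settle what caused the error. One plausible cause, stated here purely as an assumption, is that the repeat counts were computed from a different frame than the one being repeated, for example counts from the 2-row sample applied to a 3810-row column. The sketch below reproduces that kind of length mismatch with plain NumPy arrays. The names `data` and `counts` are hypothetical stand-ins, not the asker's actual variables.

```python
import numpy as np

data = np.arange(3810)      # stand-in for a real column (length taken from the error message)
counts = np.array([5, 3])   # repeat counts computed from a 2-row sample

try:
    # One count per element is required, so mismatched lengths raise ValueError.
    np.repeat(data, counts)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Counts derived from the same array always have a matching length.
matching_counts = data // 20 + data % 20
ok = np.repeat(data, matching_counts)
print(len(ok) == matching_counts.sum())
```

If this is what happened, recomputing the quotient and remainder from the same DataFrame that is being repeated should avoid the mismatch.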