The most elegant way to add rows in Python (Python, pandas)

python pandas

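What is the most elegant way to add rows to a DataFrame, turning a frame like this:

     a    b   time
 0  nan  nan   8
 1  nan  nan   5
 2  nan  nan   3

into one that also has a row for every missing `time` value?

I tried building a function `missing_times` that returns a new DataFrame of the missing times, but I ran into problems when combining the two DataFrames. What is the most efficient way to solve this kind of problem?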
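
For reference, the approach described in the question could look roughly like the sketch below. The question does not show the body of `missing_times`, so this version is a hypothetical reconstruction, and the sample values in `a` and `b` are made up for illustration. It assumes `time` holds non-negative integers and that the rows to add are every integer from 0 to the maximum that is not already present.

```python
import numpy as np
import pandas as pd


def missing_times(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a frame with the time values absent from df."""
    full_range = np.arange(df["time"].max() + 1)
    absent = np.setdiff1d(full_range, df["time"].to_numpy())
    return pd.DataFrame({"time": absent})


# Sample data (assumed for illustration)
df = pd.DataFrame({"a": [4, 2, 1], "b": [5, 8, 2], "time": [8, 5, 3]})

# Stack the original rows and the missing ones; the columns that the
# second frame lacks (a and b) are filled with NaN by concat.
result = (
    pd.concat([df, missing_times(df)], ignore_index=True)
    .sort_values("time", ascending=False)
    .reset_index(drop=True)
)
print(result)
```

`pd.concat` aligns on column names, which avoids the problem of trying to zip two frames of different shapes together.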

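You can use the `DataFrame` constructor and a column reindex:

If the original df contains values, use `set_index` + `reindex` + `reset_index`:

A solution for duplicates in the `time` column, using `merge`: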

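Assuming what you want is to join the tables so that no row is duplicated between the two (which I'll call df1 and df2), you can use:

import pandas
df3 = pandas.merge(df1, df2, how='outer')
df3 = df3.sort_values(by='time', ascending=False)  # sort_values returns a new frame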

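Here is my approach, in 4 steps:

Set time as the index
Use reindex to create the missing entries
Reverse the order so that max(time) is at the top
Reset the index

Code: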

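I really like your answer! But you are building a new DataFrame, so if a or b contain any values they will be replaced with NaN. In that case, would you keep using this kind of code or build something else?
If the original DataFrame contains values, I'll add to my first solution.
Thank you. Sorry to bother you, but I just found out that my DataFrame contains duplicates, so reindex won't work. Do you have any other suggestions? Thanks a lot.
Would it be possible to remove the duplicates as part of the solution?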
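
On the last comment: one possible way to deduplicate before reindexing is sketched below. It assumes that keeping only the first row for each `time` value is acceptable, which the thread does not confirm; the sample data is illustrative. `reindex` raises an error on an index with duplicate labels, so the duplicates are dropped first.

```python
import numpy as np
import pandas as pd

# Sample data with a duplicated time value (assumed for illustration)
df = pd.DataFrame({"a": [4, 2, 1], "b": [5, 3, 2], "time": [8, 8, 3]})

# Assumption: only the first row per time value needs to be kept
deduped = df.drop_duplicates(subset="time", keep="first")

result = (
    deduped.set_index("time")
    .reindex(np.arange(deduped["time"].max() + 1)[::-1])  # max ... 0
    .reset_index()
    .reindex(columns=df.columns)  # restore the original column order
)
print(result)
```

If every duplicated row has to survive, the `merge`-based solution is the better fit, since it never requires a unique index.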

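import numpy as np
import pandas as pd

# Build a frame with every time value from max down to 0, then restore the original columns.
# reindex_axis was removed in pandas 1.0; reindex(columns=...) is the equivalent.
df = (pd.DataFrame({'time': np.arange(df['time'].max() + 1)[::-1]})
        .reindex(columns=df.columns))
print(df)
    a   b  time
0 NaN NaN     8
1 NaN NaN     7
2 NaN NaN     6
3 NaN NaN     5
4 NaN NaN     4
5 NaN NaN     3
6 NaN NaN     2
7 NaN NaN     1
8 NaN NaN     0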

print(df)
   a  b  time
0  4  5     8
1  2  8     5
2  1  2     3

# Index by time, add the missing time values, then put time back as the last column
df = (df.set_index('time')
        .reindex(np.arange(df['time'].max() + 1)[::-1])
        .reset_index()
        .reindex(columns=df.columns))
print(df)
     a    b  time
0  4.0  5.0     8
1  NaN  NaN     7
2  NaN  NaN     6
3  2.0  8.0     5
4  NaN  NaN     4
5  1.0  2.0     3
6  NaN  NaN     2
7  NaN  NaN     1
8  NaN  NaN     0

print(df)
   a  b  time
0  4  5     8
1  2  3     8
2  1  2     3

# time contains duplicates here, so merge with the full range of times instead of reindexing
df1 = pd.DataFrame({'time': np.arange(df['time'].max() + 1)[::-1]})
df = pd.merge(df, df1, how='outer').sort_values('time', ascending=False)
print(df)
     a    b  time
0  4.0  5.0     8
1  2.0  3.0     8
3  NaN  NaN     7
4  NaN  NaN     6
5  NaN  NaN     5
6  NaN  NaN     4
2  1.0  2.0     3
7  NaN  NaN     2
8  NaN  NaN     1
9  NaN  NaN     0
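
As a self-contained variation on the merge above, the sketch below also passes `indicator=True`, which adds a `_merge` column showing which rows came only from the full time range, i.e. which rows were added. The sample data and the names `full`, `merged` and `added` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data with a duplicated time value (assumed for illustration)
df = pd.DataFrame({"a": [4, 2, 1], "b": [5, 3, 2], "time": [8, 8, 3]})
full = pd.DataFrame({"time": np.arange(df["time"].max() + 1)[::-1]})

merged = (
    pd.merge(df, full, how="outer", on="time", indicator=True)
    .sort_values("time", ascending=False, kind="stable")
    .reset_index(drop=True)
)

# Rows that exist only in `full` are the ones the merge added
added = merged[merged["_merge"] == "right_only"]
print(merged)
print(sorted(added["time"].tolist()))
```

Dropping the `_merge` column afterwards with `merged.drop(columns="_merge")` gives the same shape as the plain merge.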

df.set_index('time')\
  .reindex(range(max(df['time']) + 1))\
  .sort_index(ascending = False)\
  .reset_index()
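
To try the four-step chain end to end, here is a minimal sketch with sample data (the values are assumed for illustration). It relies on `time` being unique, since `reindex` does not accept an index with duplicate labels.

```python
import pandas as pd

# Sample data (assumed for illustration)
df = pd.DataFrame({"a": [4, 2, 1], "b": [5, 8, 2], "time": [8, 5, 3]})

result = (
    df.set_index("time")                   # 1. time becomes the index
    .reindex(range(df["time"].max() + 1))  # 2. rows for 0..max, NaN where missing
    .sort_index(ascending=False)           # 3. largest time first
    .reset_index()                         # 4. time becomes a column again
)
print(result)
```

Note that `reset_index` places `time` first, so the column order becomes time, a, b; appending `.reindex(columns=df.columns)` would restore the original order.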