Python: logically AND-ing DataFrames together

python pandas date dataframe

I have a series of seven DataFrames, all the same length.

dates1 looks like:

   month  day  year
0     04   20  2009
1     04   20    09
2      4   20    09
3      4    3    09
4    NaN  NaN   NaN
5    NaN  NaN   NaN
6    NaN  NaN   NaN
7    NaN  NaN   NaN
8    NaN  NaN   NaN
...

dates2 looks like:

   month  day  year
0    NaN  NaN   NaN
1    NaN  NaN   NaN
2    NaN  NaN   NaN
3    NaN  NaN   NaN
4    Mar   20  2009
5    Mar   20  2009
6    Mar   20  2009
7    Mar   20  2009
8    Mar   20  2009
...

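and so on for the remaining DataFrames. I want to create a single DataFrame that combines them all, but merge doesn't seem to do what I need.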

Here is what I'm currently doing:

alldates = pd.concat([dates1,dates2,dates3,dates4], axis=0)
return alldates.dropna()

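This works, but as soon as I add dates5, dates6, and dates7 it turns into a mess, because those DataFrames have some rows whose index values already appear in alldates.

This one has me stumped. What other information should I provide? Is there a more elegant way to solve this?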

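You can append the frames one at a time like this (DataFrame.append was removed in pandas 2.0, so pd.concat is used in its place):

dataframes = [dates1, dates2, dates3, dates4]
alldates = pd.DataFrame()
for dataframe in dataframes:
    # Stack each frame under the rows collected so far
    alldates = pd.concat([alldates, dataframe])
return alldates.dropna()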

Perhaps you just need to drop the duplicate index labels after concatenating, keeping only the first occurrence of each, i.e.

alldates = pd.concat([dates1,dates2,dates3,dates4], axis=0).dropna()
alldates = alldates.loc[~alldates.index.duplicated(keep='first')]

print(alldates)

  month  day  year
0     4   20  2009
1     4   20     9
2     4   20     9
3     4    3     9
4   Mar   20  2009
5   Mar   20  2009
6   Mar   20  2009
7   Mar   20  2009
8   Mar   20  2009
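For anyone who wants to try this without the original data, here is a minimal, self-contained sketch of the same concat, dropna, then de-duplicate-by-index approach. The three small frames are hypothetical stand-ins (their values are invented for illustration), and dates5 is deliberately built to collide with dates2 at index label 3.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's frames; values are made up.
dates1 = pd.DataFrame({
    "month": ["04", "04", np.nan, np.nan],
    "day": ["20", "3", np.nan, np.nan],
    "year": ["2009", "09", np.nan, np.nan],
})
dates2 = pd.DataFrame({
    "month": [np.nan, np.nan, "Mar", "Mar"],
    "day": [np.nan, np.nan, "20", "21"],
    "year": [np.nan, np.nan, "2009", "2009"],
})
# dates5 also has data at index label 3, which dates2 already filled.
dates5 = pd.DataFrame({
    "month": [np.nan, np.nan, np.nan, "May"],
    "day": [np.nan, np.nan, np.nan, "1"],
    "year": [np.nan, np.nan, np.nan, "2010"],
})

# Stack the frames and drop the all-NaN placeholder rows.
stacked = pd.concat([dates1, dates2, dates5], axis=0).dropna()

# Index label 3 now appears twice; keep only the first row seen per label.
alldates = stacked.loc[~stacked.index.duplicated(keep="first")].sort_index()
print(alldates)
```

Because keep="first" is used, the order of the list passed to pd.concat decides which frame wins when two frames have data at the same label.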

If you are trying to fill the NaN values from another DataFrame, you can use

adf = df.fillna(df2)

For more than two DataFrames:

l = [dates1,dates2]

for i in range(len(l)-1):
    ndf = l[i]
    ndf = ndf.fillna(l[i+1])
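Note that the loop above rebinds ndf to l[i] on every pass, so with three or more frames only the last pair survives. A possible variant, sketched here under the assumption that all frames share the same index and columns, carries the partially filled result forward instead. The frames a, b, and c are hypothetical and their values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical frames sharing the same index and columns.
a = pd.DataFrame({"month": ["04", np.nan, np.nan], "day": ["20", np.nan, np.nan]})
b = pd.DataFrame({"month": [np.nan, "Mar", np.nan], "day": [np.nan, "20", np.nan]})
c = pd.DataFrame({"month": ["Jan", np.nan, "May"], "day": ["1", np.nan, "5"]})

frames = [a, b, c]

# Start from the first frame and keep filling its remaining NaNs
# from each later frame; earlier frames take priority.
filled = frames[0]
for nxt in frames[1:]:
    filled = filled.fillna(nxt)

print(filled)
```

Here row 0 keeps the values from a even though c also has data there, which matches the "do not overwrite what is already present" behaviour asked for. DataFrame.combine_first fills nulls in a similar way and additionally takes the union of the index labels, which may be preferable if the frames do not all share one index.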

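This works just as well as the original solution I posted, but it still leaves me with the problem of rows that share an index. The result includes duplicates, such as two different rows with index 9, whereas what I want is "if there is already a row with index 9, do not append this new row with index 9".

@Bharath I don't understand your suggestion. append adds all rows of one DataFrame to the other. If he drops the NaN rows first it should work, and if an identical row gets copied he can use drop_duplicates at the end.

It works when l = [dates1, dates2], but when l = [dates1, dates2, dates3] the loop wipes out the values from dates1 in the resulting DataFrame.

Did you try dropping the duplicates? Does the updated solution work?

The updated solution you provided seems to work. Thanks.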
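One point from the comments is worth illustrating: drop_duplicates compares row values and ignores the index, so it does not by itself enforce "one row per index label". The sketch below uses a small hypothetical frame (values invented) to contrast it with filtering on index.duplicated.

```python
import pandas as pd

# Hypothetical frame: label 9 appears twice with different values,
# while label 10 repeats the values of the first row.
df = pd.DataFrame(
    {"month": ["Mar", "Apr", "Mar"], "day": ["20", "21", "20"]},
    index=[9, 9, 10],
)

# Compares row values only: both rows labelled 9 survive,
# and the row labelled 10 is dropped as a value duplicate.
by_value = df.drop_duplicates()

# Compares index labels only: the first row labelled 9 is kept.
by_index = df.loc[~df.index.duplicated(keep="first")]

print(by_value)
print(by_index)
```

If the goal is the rule stated in the comments (skip a new row when its index label is already present), the index-based filter is the one that expresses it.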