Python: What is the most efficient way to flatten a DataFrame?


I have a pandas DataFrame with 8 columns and several NaN values:

0   1   2   3   4   5   6   7   8
1   Google, Inc. (Date 11/07/2016)  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
2   Apple Inc. (Date 07/01/2016)    Amazon (Date 11/01/2016)    NaN     NaN     NaN     NaN     NaN     NaN     NaN
3   IBM, Inc. (Date 11/08/2016)     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
4   Microsoft (Date 11/10/2016)     Google, Inc. (Date 11/10/1990)  Google, Inc. (Date 11/07/2016)  Samsung (Date 05/02/2016)   NaN     NaN     NaN     NaN     NaN
How can I flatten it like this:

0   companies
1   Google, Inc. (Date 11/07/2016)
2   Apple Inc. (Date 07/01/2016)
3   Amazon (Date 11/01/2016)
4   IBM, Inc. (Date 11/08/2016)
5   Microsoft (Date 11/10/2016)
6   Google, Inc. (Date 11/10/1990)
7   Google, Inc. (Date 11/07/2016)
8   Samsung (Date 05/02/2016)
I read the documentation and tried:

df.iloc[:,0]

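The problem is that I lose the information in the other columns, along with their order. How can I unfold the other cells into a single column without losing data, while keeping the original order?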

This might work:

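import pandas as pd

df = pd.DataFrame([
        ["Google, Inc. (Date 11/07/2016)", float("NaN")],
        ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)"]])
# Transpose, then unstack, so the values come out row by row
unstacked = df.T.unstack()
# Drop the NaN placeholders and renumber the result from 0
unstacked.dropna(inplace=True)
unstacked.reset_index(drop=True, inplace=True)
unstacked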
Output:

0    Google, Inc. (Date 11/07/2016)
1      Apple Inc. (Date 07/01/2016)
2          Amazon (Date 11/01/2016)
dtype: object
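
To see why the transpose-then-unstack approach preserves the row-by-row order, it can help to look at the intermediate index. The following is a minimal sketch on the same two-row frame; the variable name `flat` is only illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame([
    ["Google, Inc. (Date 11/07/2016)", float("nan")],
    ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)"],
])

# After the transpose, unstack() returns a Series whose MultiIndex
# holds (original row, original column) pairs, ordered row by row.
unstacked = df.T.unstack()
print(unstacked.index.tolist())

# Non-inplace variant of the same clean-up: drop NaN, renumber from 0.
flat = unstacked.dropna().reset_index(drop=True)
print(flat.tolist())
```

Because the outer index level is the original row, the values of each row stay together and rows appear in their original order.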

As a side note, take a look at how to provide a good example in a question.

You can stack the columns and, optionally, reset the index. By default, stack drops the NaN values:

df.stack()
Out: 
0  0    Google, Inc. (Date 11/07/2016) 
1  0      Apple Inc. (Date 07/01/2016) 
   1          Amazon (Date 11/01/2016) 
2  0       IBM, Inc. (Date 11/08/2016) 
3  0       Microsoft (Date 11/10/2016) 
   1    Google, Inc. (Date 11/10/1990) 
   2    Google, Inc. (Date 11/07/2016) 
   3         Samsung (Date 05/02/2016) 
dtype: object

df.stack().reset_index(drop=True)
Out: 
0    Google, Inc. (Date 11/07/2016) 
1      Apple Inc. (Date 07/01/2016) 
2          Amazon (Date 11/01/2016) 
3       IBM, Inc. (Date 11/08/2016) 
4       Microsoft (Date 11/10/2016) 
5    Google, Inc. (Date 11/10/1990) 
6    Google, Inc. (Date 11/07/2016) 
7         Samsung (Date 05/02/2016) 
dtype: object
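
For anyone who wants to run this end to end, here is a self-contained sketch. It rebuilds the question's data with only four columns for brevity (an assumption, since the original frame has more), and it adds an explicit `dropna()` after `stack()`: this is a defensive choice, as the NaN handling of `stack` has changed between pandas releases. The name `companies` is taken from the desired output in the question.

```python
import numpy as np
import pandas as pd

# Reduced version of the question's frame (4 columns instead of 8).
df = pd.DataFrame([
    ["Google, Inc. (Date 11/07/2016)", np.nan, np.nan, np.nan],
    ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)", np.nan, np.nan],
    ["IBM, Inc. (Date 11/08/2016)", np.nan, np.nan, np.nan],
    ["Microsoft (Date 11/10/2016)", "Google, Inc. (Date 11/10/1990)",
     "Google, Inc. (Date 11/07/2016)", "Samsung (Date 05/02/2016)"],
])

companies = (
    df.stack()                   # columns become an inner index level
      .dropna()                  # explicit, in case stack() kept the NaNs
      .reset_index(drop=True)    # discard the (row, column) MultiIndex
      .rename("companies")
)
print(companies)
```

If the one-column layout from the question is needed, `companies.to_frame()` turns the Series into a DataFrame with a single `companies` column.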

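It seems @ayhan's answer is better; thanks for the help. What if I want to keep the NaN slots in the stacked result? Should I use drop=False?
You would have to remove the unstacked.dropna(inplace=True) line.
The drop argument is for dropping the index. To keep the NaNs, use df.stack(dropna=False) instead.
Thanks, I got: AttributeError: 'Series' object has no attribute 'stack'
Did you try it on the original DataFrame you posted here? To get that error, you must have called the method on a Series.
Yes, it had to be the original one.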
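
The two points from this exchange can be sketched as follows. Note the swap: instead of `df.stack(dropna=False)`, this sketch keeps the NaN slots by flattening the underlying array with `to_numpy().ravel()`, because the `dropna` argument of `stack` is handled differently across pandas versions. The names `flat_with_nan` and `first_column` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    ["Google, Inc. (Date 11/07/2016)", np.nan],
    ["Apple Inc. (Date 07/01/2016)", "Amazon (Date 11/01/2016)"],
])

# Row-major flatten of the values; NaN cells stay in place.
flat_with_nan = pd.Series(df.to_numpy().ravel())
print(flat_with_nan)

# Selecting one column yields a Series, and a Series has no stack()
# method, which reproduces the AttributeError from the comments.
first_column = df.iloc[:, 0]
try:
    first_column.stack()
    raised = False
except AttributeError as exc:
    raised = True
    print(exc)
```

So if that error appears, the object being stacked is most likely a single column (or an already stacked result) rather than the original DataFrame.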