Python: How to split a row into two columns in Pandas


I have the following data frame:

import pandas as pd
df = pd.DataFrame({'probegene' : ['1431492_at Lipn', '1448678_at Fam118a','1452580_a_at Mrpl21'],
                   '(5)foo.ID.LN.x2' : [130, 150,173],
                   '(5)foo.ID.LN.x1' : [20.3, 25.3,3.1]})

It looks like this:

In [21]: df
Out[21]:
   (5)foo.ID.LN.x1  (5)foo.ID.LN.x2            probegene
0             20.3              130      1431492_at Lipn
1             25.3              150   1448678_at Fam118a
2              3.1              173  1452580_a_at Mrpl21

What I want to do is split each value in the probegene column into two columns, so that the result is:

probe           gene    (5)foo.ID.LN.x1  (5)foo.ID.LN.x2            
1431492_at      Lipn           20.3              130      
1448678_at      Fam118a        25.3              150   
1452580_a_at    Mrpl21          3.1              173

How can I achieve this?

I'm stuck at this:

df['probegene'].str.split(' ')

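I'm still not sure this is the best way, but if you call apply(pd.Series) on the result of the split, you get a correctly indexed frame. After that, you can join it back: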

>>> new_cols = df.pop("probegene").str.split().apply(pd.Series)
>>> new_cols.columns = ["probe","gene"]
>>> df = df.join(new_cols)
>>> df
   (5)foo.ID.LN.x1  (5)foo.ID.LN.x2         probe     gene
0             20.3              130    1431492_at     Lipn
1             25.3              150    1448678_at  Fam118a
2              3.1              173  1452580_a_at   Mrpl21

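The reason I'm not sure this is the best way is that apply tends to be slow. Something like

pd.DataFrame.from_records(df["probegene"].str.split().tolist(), index=df.index)

would probably be faster, in case this turns out to be a bottleneck.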
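As a rough sketch of how the from_records variant could be completed into the requested result: the names parts, new_cols and result are illustrative, and passing columns=["probe", "gene"] assumes every string splits into exactly two tokens.

```python
import pandas as pd

df = pd.DataFrame({
    "probegene": ["1431492_at Lipn", "1448678_at Fam118a", "1452580_a_at Mrpl21"],
    "(5)foo.ID.LN.x2": [130, 150, 173],
    "(5)foo.ID.LN.x1": [20.3, 25.3, 3.1],
})

# Split each string on whitespace and collect the pieces as a list of lists.
parts = df["probegene"].str.split().tolist()

# Build the new columns directly from the lists, reusing df's index so the
# join below aligns row by row. Assumes exactly two tokens per row.
new_cols = pd.DataFrame.from_records(parts, index=df.index,
                                     columns=["probe", "gene"])

# Drop the original combined column and attach the two new ones.
result = df.drop(columns="probegene").join(new_cols)
print(result)
```

Whether this is actually faster than apply(pd.Series) depends on the data size, so it is worth timing on the real frame before switching.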

One-line solution:

df['probe'], df['gene'] = zip(*df['probegene'].str.split())
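Note that the one-liner keeps the original probegene column. For comparison, here is a sketch of an alternative that is not taken from the answers above: Series.str.split accepts expand=True, which returns a DataFrame with one column per token. Using n=1 (split only once) is an assumption that anything after the first whitespace belongs to the gene name.

```python
import pandas as pd

df = pd.DataFrame({
    "probegene": ["1431492_at Lipn", "1448678_at Fam118a", "1452580_a_at Mrpl21"],
    "(5)foo.ID.LN.x2": [130, 150, 173],
    "(5)foo.ID.LN.x1": [20.3, 25.3, 3.1],
})

# expand=True yields a two-column DataFrame; n=1 splits only on the first
# run of whitespace, so each row produces exactly two pieces.
df[["probe", "gene"]] = df["probegene"].str.split(n=1, expand=True)

# Remove the combined column now that it has been split.
df = df.drop(columns="probegene")
print(df)
```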