Python: reindex DataFrame columns by a Series of labels


python pandas dataframe indexing


I have a Series of labels

pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])

and a DataFrame

pd.DataFrame([[1,2], [3,4]], ['I1', 'I2'], ['A', 'B'])

I would like a DataFrame whose columns

['L1', 'L2', 'L3']

take their data from columns 'A', 'B', 'A' respectively, like this:

pd.DataFrame([[1,2,1], [3,4,3]], ['I1', 'I2'], ['L1', 'L2', 'L3'])

ideally in a clean way.

This will generate the DataFrame you describe:

import pandas as pd
import numpy as np

data = [['A','B','A','A','B','B'],
        ['B','B','B','A','B','B'],
        ['A','B','A','B','B','B']]

columns = ['L1', 'L2', 'L3', 'L4', 'L5', 'L6']

pd.DataFrame(data, columns = columns)

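Since you mentioned reindex:

#s=pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])
#df=pd.DataFrame([[1,2], [3,4]], ['I1', 'I2'], ['A', 'B'])
df.reindex(s.index,axis=1).rename(columns=s.to_dict())
Out[598]: 
    L3  L2  L3
I1   1   2   1
I2   3   4   3

Note that s.to_dict() keeps only the last label for the duplicated key 'A', so the first column comes out as L3 rather than L1.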
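A name-based mapping cannot give two copies of column 'A' different labels, so one possible way to get L1, L2, L3 as requested is to assign the new labels by position after reindexing. This is a minimal sketch using the question's own data; the variable name res is only illustrative.

```python
import pandas as pd

s = pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])
df = pd.DataFrame([[1, 2], [3, 4]], ['I1', 'I2'], ['A', 'B'])

# Pull columns in the order given by s.index ('A', 'B', 'A');
# reindex returns a new DataFrame, so df itself is not modified.
res = df.reindex(s.index, axis=1)

# Assign the labels positionally instead of renaming by name,
# so both copies of 'A' receive their own label.
res.columns = list(s)

print(res)
```

Because the labels are matched by position, this relies on s.index and the values of s being in the same order, which is how the Series in the question is built.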

You can use the loc accessor:

s = pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])
df = pd.DataFrame([[1,2], [3,4]], ['I1', 'I2'], ['A', 'B'])

res = df.loc[:, s.index]

print(res)

    A  B  A
I1  1  2  1
I2  3  4  3

Or use the iloc accessor together with columns.get_loc:

res = df.iloc[:, s.index.map(df.columns.get_loc)]

Both approaches allow selecting duplicate labels/positions, in the same way NumPy arrays do.
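To illustrate the comparison with NumPy, the sketch below selects the same repeated positions from the DataFrame with iloc and from its underlying array with integer-array indexing. The names positions, by_position and arr are illustrative only.

```python
import pandas as pd

s = pd.Series(['L1', 'L2', 'L3'], ['A', 'B', 'A'])
df = pd.DataFrame([[1, 2], [3, 4]], ['I1', 'I2'], ['A', 'B'])

# Translate each label in s.index into its integer column position in df.
positions = [df.columns.get_loc(label) for label in s.index]

# Position 0 appears twice, so column 'A' is selected twice.
by_position = df.iloc[:, positions]

# The same repeated selection on the plain NumPy array.
arr = df.values[:, positions]

print(by_position)
print(arr)
```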

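Could you create example data and show the expected result?

Hope this helps clarify. The real problem has many labels and a larger DataFrame.

I think reindex is the right solution, but I can't seem to write it the right way. If there is a cleaner way without reindex I'm happy to use it, but this looks great.

@rhasket df.loc[:, s.index].rename(columns=s.to_dict())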
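When choosing between reindex and loc for a larger frame with many labels, one practical difference is how each handles a label that is not among the columns. The sketch below assumes pandas 1.0 or later, and the label 'C' is a made-up example that does not appear in the question.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], ['I1', 'I2'], ['A', 'B'])
labels = ['A', 'C']  # 'C' is hypothetical and not a column of df

# reindex accepts the unknown label and fills that column with NaN.
filled = df.reindex(labels, axis=1)

# loc with a list containing an unknown label raises KeyError
# (assumption: pandas 1.0 or later).
try:
    df.loc[:, labels]
    loc_raised = False
except KeyError:
    loc_raised = True

print(filled)
print(loc_raised)
```

If a silent NaN column would hide a typo in the label Series, the stricter loc behavior may be preferable; if missing labels are expected, reindex avoids the exception.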