Python: How to use "get_loc" to find the indices of multiple columns in a MultiIndex

python python-3.x pandas dataframe

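I have a spreadsheet with a two-row header ("Stage 1", "Stage 2", "Stage 3" in the top row and column names such as "Time" in the second row). Reading both header rows gives the columns a MultiIndex: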

df = pd.read_excel('myspreadsheet.xlsx', header=[0,1])

I would like to get the index of every "Time" column without explicitly writing out each "Stage".

However, this does not work on a MultiIndex:

df.columns.get_loc('Time')

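I am looking for output like [1,4,7] or [0,3,6] to represent the column positions of "Time". I suppose something like [('Stage 1', 0), ('Stage 2', 3), ('Stage 3', 6)] would work as well. My end goal is to insert 10 identical columns from another df after each "Time" column.

How can this be done?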

I believe you need to select the "Time" columns with xs and pass their labels to get_indexer.

For a list of tuples:

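# Column labels whose second level is 'Time'
c = df.xs('Time', drop_level=False, axis=1, level=1).columns
# Pair each first-level label with the column's integer position
b = list(zip(c.get_level_values(0), df.columns.get_indexer(c).tolist()))
print(b)
[('Stage 1', 0), ('Stage 2', 3), ('Stage 3', 6)]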
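
For anyone who wants to try this without the spreadsheet, here is a minimal sketch of an alternative that compares the second header level directly. The frame and the column names 'A' and 'B' are made up for illustration; only 'Time' and the stage names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet: three stages, each with
# a 'Time' column followed by two made-up columns 'A' and 'B'.
columns = pd.MultiIndex.from_product(
    [["Stage 1", "Stage 2", "Stage 3"], ["Time", "A", "B"]]
)
df = pd.DataFrame(np.zeros((2, 9)), columns=columns)

# Boolean mask over the second header row, then the integer positions.
is_time = df.columns.get_level_values(1) == "Time"
positions = np.flatnonzero(is_time).tolist()
print(positions)  # [0, 3, 6]

# Pair each position with its first-level label.
pairs = list(zip(df.columns.get_level_values(0)[is_time], positions))
print(pairs)  # [('Stage 1', 0), ('Stage 2', 3), ('Stage 3', 6)]
```

This should give the same positions as the xs / get_indexer approach, and it avoids building an intermediate frame.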

Edit:

Inserting the new columns takes a little arithmetic:

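c = df.xs('Time', drop_level=False, axis=1, level=1).columns
indices_list = list(zip(c.get_level_values(0), df.columns.get_indexer(c).tolist()))
print(indices_list)
[('Stage 1', 0), ('Stage 2', 3), ('Stage 3', 6)]

# Number of distinct second-level labels, i.e. columns per stage
# (assumes every stage has the same set of columns).
lenlevel1 = len(c.levels[1])
# j * 3 accounts for the 3 columns already inserted for each earlier stage.
# With 'Time' first in each stage, the new columns land after the
# stage's last original column.
for j, (s, i) in enumerate(indices_list):
    df.insert(int(i) + (j * 3) + lenlevel1, (s, 'Depth'), 10)
    df.insert(int(i) + (j * 3) + lenlevel1 + 1, (s, 'Volume'), 20)
    df.insert(int(i) + (j * 3) + lenlevel1 + 2, (s, 'Radius'), 30)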
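
If the new columns should instead sit directly after each "Time" column, as the question asks, one option is to walk the positions from right to left so that no offset bookkeeping is needed. This is only a sketch under assumptions: the frame, the columns 'A' and 'B', and the `extra` frame holding the columns to copy are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet.
columns = pd.MultiIndex.from_product(
    [["Stage 1", "Stage 2", "Stage 3"], ["Time", "A", "B"]]
)
df = pd.DataFrame(np.zeros((2, 9)), columns=columns)

# Hypothetical frame holding the columns to copy in after each 'Time'.
extra = pd.DataFrame({"Depth": [10, 10], "Volume": [20, 20], "Radius": [30, 30]})

# Snapshot of the stage labels and 'Time' positions before any insert.
level0 = df.columns.get_level_values(0)
time_pos = np.flatnonzero(df.columns.get_level_values(1) == "Time")

# Walk from right to left so earlier positions stay valid after each insert.
for pos in time_pos[::-1]:
    stage = level0[pos]
    for offset, name in enumerate(extra.columns, start=1):
        df.insert(int(pos) + offset, (stage, name), extra[name].to_numpy())

print(df.columns.tolist()[:4])
```

Because every insert happens to the right of the positions still to be processed, the original indices can be used as they are.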

DataFrame

   id  Time  duration
0   1   234        65
1   2   546       779
2   3   353       567
3   4   456       865

Output

What is the expected output?

@jezrael Updated the question with your very reasonable request =)

This looks perfect, thank you! Just wondering whether you think this way of getting the column indices is reasonable for inserting 10 new columns, as described in my question edit? Something like for i in indices_list: df.insert(i, ['Depth', 'Volume', 'Radius', ...]) where Depth, Volume, Radius, etc. are the names of the new columns.

Thanks for the insertion code! It turned out more complicated than I expected, but I will try to decipher it. Thanks again, jezrael. I wish I could give you more upvotes, you have earned them!

I did not see the output. I wrote my answer before the question was edited. You are right.