Python pivot / transpose rows to column headers

python pandas

I'm trying to learn pandas and would like to know how to do the following.

Starting from this DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Name': ['Person1', 'Person1'],
    'SetCode1': ['L6A', 'L6A'],
    'SetDetail1': ['B', 'C'],
    'SetCode2': ['G2G', 'G2G'],
    'SetDetail2': ['B', 'B'],
})

I think this is more of a column rename than a pivot. Here is my code:

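# Split the columns into the code columns and their paired detail columns
code_cols = [c for c in df.columns if c.startswith('SetCode')]
det_cols = [c for c in df.columns if c.startswith('SetDetail')]
# Take each code from the first row (this assumes the code is the same in every row)
codes = [df[c].iloc[0] for c in code_cols]
# Rename each detail column after its code, then drop the code columns
df = df.rename(columns=dict(zip(det_cols, codes))).drop(columns=code_cols)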
df

This produces:

    Name    L6A G2G
0   Person1 B   B
1   Person1 C   B

Thanks to @Sander van den Oord for typing in the DataFrame.
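The rename approach only works if every SetCode column holds a single code, since it reads the code from the first row. A minimal sketch that makes this assumption explicit might look like the following; the helper name `rename_details_by_code` and its two prefix parameters are hypothetical, introduced here only for illustration.

```python
import pandas as pd


def rename_details_by_code(df, code_prefix="SetCode", detail_prefix="SetDetail"):
    """Rename each detail column after the single code in its paired code column.

    Hypothetical helper: raises if a code column does not hold exactly one
    distinct non-null code, because a plain rename cannot represent that case.
    """
    code_cols = [c for c in df.columns if c.startswith(code_prefix)]
    mapping = {}
    for code_col in code_cols:
        codes = df[code_col].dropna().unique()
        if len(codes) != 1:
            raise ValueError(
                f"{code_col} holds {len(codes)} distinct codes; expected exactly 1"
            )
        # 'SetCode1' pairs with 'SetDetail1', and so on
        suffix = code_col[len(code_prefix):]
        mapping[detail_prefix + suffix] = codes[0]
    return df.rename(columns=mapping).drop(columns=code_cols)


df = pd.DataFrame({
    'Name': ['Person1', 'Person1'],
    'SetCode1': ['L6A', 'L6A'],
    'SetDetail1': ['B', 'C'],
    'SetCode2': ['G2G', 'G2G'],
    'SetDetail2': ['B', 'B'],
})
result = rename_details_by_code(df)
print(result)
```

If the codes differ from row to row, the helper raises instead of silently mislabelling a column, and a reshape with pd.wide_to_long is likely the better fit.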

Try using pd.wide_to_long and unstack:

Output:

SetCode       G2G L6A
index Name           
0     Person1   B   B
1     Person1   B   C
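A wide_to_long plus unstack chain that yields a table of this shape could be sketched as follows. This is an assumption about how the two calls fit together, not necessarily the answerer's exact code; the variable names `long_df` and `out` and the suffix column name `'No'` are illustrative. The row index is reset first so that the two rows sharing the name Person1 stay distinct.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Person1', 'Person1'],
    'SetCode1': ['L6A', 'L6A'],
    'SetDetail1': ['B', 'C'],
    'SetCode2': ['G2G', 'G2G'],
    'SetDetail2': ['B', 'B'],
})

# Stack the numbered SetCode*/SetDetail* pairs into two long columns.
# 'index' (from reset_index) plus 'Name' identifies each original row;
# the numeric suffix of each column ends up in the 'No' index level.
long_df = pd.wide_to_long(
    df.reset_index(),
    stubnames=['SetCode', 'SetDetail'],
    i=['index', 'Name'],
    j='No',
)

# Move SetCode into the index, drop the suffix level, and spread the
# codes across the columns with SetDetail as the cell values.
out = (
    long_df
    .set_index('SetCode', append=True)
    .reset_index(level='No', drop=True)['SetDetail']
    .unstack()
)
print(out)
```

Unlike the rename approach, this does not require every row to carry the same code in a given SetCode column.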

Using pandas.wide_to_long is the right solution, but you have to be careful with NaN values in some of the columns.

So here is an adaptation of Scott Boston's answer:

import pandas as pd

# I just allowed myself to write 'Person2' instead of 'Person1' at the second row
# of the DataFrame, as I imagine this is what was originally intended in the data,
# but this does not change the method
df = pd.DataFrame({
    'Name': ['Person1', 'Person2'],
    'SetCode1': ['L6A', 'L6A'],
    'SetDetail1': ['B', 'C'],
    'SetCode6': ['U2H', None],
    'SetDetail6': ['B', None],
})
print(df)

      Name SetCode1 SetDetail1 SetCode6 SetDetail6
0  Person1      L6A          B      U2H          B
1  Person2      L6A          C     None       None

# You will need to use reset_index to keep the original index moving forward only if
# the 'Name' column does not have unique values
df_melt = pd.wide_to_long(df, ['SetCode', 'SetDetail'], ['Name'], 'No')

df_out = df_melt[df_melt['SetCode'].notnull()]\
    .set_index('SetCode', append=True)\
    .reset_index(level=1, drop=True)['SetDetail']\
    .unstack()
print(df_out)

SetCode L6A  U2H
Name            
Person1   B    B
Person2   C  NaN
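The comment in the code above says reset_index is needed when the 'Name' column is not unique. A sketch of that variant, combined with the same NaN filtering, might look like this. The sample data with a repeated name is an assumption made for illustration, and `long_df` and `out` are illustrative names.

```python
import pandas as pd

# Assumed sample data: same as above, but both rows share the name 'Person1'
df = pd.DataFrame({
    'Name': ['Person1', 'Person1'],
    'SetCode1': ['L6A', 'L6A'],
    'SetDetail1': ['B', 'C'],
    'SetCode6': ['U2H', None],
    'SetDetail6': ['B', None],
})

# reset_index adds an 'index' column, so ('index', 'Name') is unique per row
long_df = pd.wide_to_long(
    df.reset_index(),
    stubnames=['SetCode', 'SetDetail'],
    i=['index', 'Name'],
    j='No',
)

out = (
    long_df
    .dropna(subset=['SetCode'])          # skip rows that have no code at all
    .set_index('SetCode', append=True)
    .reset_index(level='No', drop=True)['SetDetail']
    .unstack()
)
print(out)
```

Here dropna(subset=['SetCode']) plays the same role as the notnull() filter above: rows without a code are dropped before the codes become column labels.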

First, I added a sample DataFrame so that others can help you. I don't know the answer to this question, but I'm sure someone else will. Best practice is to add sample data as code rather than as a picture! That makes it easier for others to answer your question.