Python: Combining multiple columns into a new column

I have a DataFrame in which some columns indicate whether a set of survey questions was seen. For example:

Q1_Seen    Q2_Seen    Q3_Seen    Q4_Seen
    Q1a        nan        nan        nan
    nan        Q2a        nan        nan
    nan        nan        Q3d        nan
    nan        Q2c        nan        nan
I would like to collapse these columns into a single column, say Q_Seen, of the following form:

Q_Seen
   Q1a
   Q2a
   Q3d
   Q2c
Note that the values in each row are mutually exclusive: if one column holds a value, all the other columns in that row are NaN.


I tried to do this with pd.concat, but it did not seem to produce the correct result.

The following worked for me:

import pandas as pd
df = pd.DataFrame({'Q1': [1, None, None], 'Q2': [None, 2, None], 'Q3': [None, None, 3]})
# concat is a top-level pandas function (pd.concat), not a DataFrame method
df['Q'] = pd.concat([df['Q1'], df['Q2'], df['Q3']]).dropna()
There may be a more elegant solution, but this is the first thing that came to mind.

Try this:

df['Q_Seen'] = df.stack().values

>>> df

Q1_Seen    Q2_Seen    Q3_Seen     Q4_Seen     Q_Seen
    Q1a        nan        nan         nan        Q1a
    nan        Q2a        nan         nan        Q2a
    nan        nan        Q3d         nan        Q3d
    nan        Q2c        nan         nan        Q2c
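
The stack() approach above assigns values by position, so it only lines up if every row holds exactly one non-null value. A minimal sketch of checking that first, assuming a sample frame that mirrors the question's data (the name answers_per_row and the explicit dropna() call are additions for illustration, not part of the original answer):

```python
import pandas as pd

# Sample frame mirroring the question's data (an assumption for illustration).
df = pd.DataFrame({
    "Q1_Seen": ["Q1a", None, None, None],
    "Q2_Seen": [None, "Q2a", None, "Q2c"],
    "Q3_Seen": [None, None, "Q3d", None],
    "Q4_Seen": [None, None, None, None],
})

# Check the assumption: exactly one non-null answer per row.
answers_per_row = df.notna().sum(axis=1)
if not answers_per_row.eq(1).all():
    raise ValueError("each row must hold exactly one non-null value")

# stack() flattens the frame row by row; dropna() makes the removal of
# missing cells explicit instead of relying on stack()'s default behaviour.
stacked = df.stack().dropna()

# Positional assignment is safe here because the check above passed.
df["Q_Seen"] = stacked.to_numpy()
```

If the check fails, the number of stacked values no longer matches the number of rows and the assignment would raise or misalign, so failing early is probably the safer choice.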

Taking max() across the columns of each row, i.e. max(axis=1), lets you collapse all the values into a single column:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"Q1_Seen": ['Q1a', None, None, None], "Q2_Seen": [None, "Q2a", None, "Q2c"], "Q3_Seen": [None, None, "Q3d", None],"Q4_Seen": [None, None, None, None]})

In [3]: df
Out[3]: 
  Q1_Seen Q2_Seen Q3_Seen Q4_Seen
0     Q1a    None    None    None
1    None     Q2a    None    None
2    None    None     Q3d    None
3    None     Q2c    None    None

In [4]: df['Q_Seen'] = df.max(axis=1)

In [5]: df
Out[5]: 
  Q1_Seen Q2_Seen Q3_Seen Q4_Seen Q_Seen
0     Q1a    None    None    None    Q1a
1    None     Q2a    None    None    Q2a
2    None    None     Q3d    None    Q3d
3    None     Q2c    None    None    Q2c
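
max(axis=1) relies on comparing the values in each row. An alternative that does not appear in the answers above, sketched here under the same one-value-per-row assumption, is to back-fill across the columns and take the leftmost one, which picks the first non-null value in each row. The sample frame and the helper name seen_cols are hypothetical:

```python
import pandas as pd

# Sample frame mirroring the question's data (an assumption for illustration).
df = pd.DataFrame({
    "Q1_Seen": ["Q1a", None, None, None],
    "Q2_Seen": [None, "Q2a", None, "Q2c"],
    "Q3_Seen": [None, None, "Q3d", None],
    "Q4_Seen": [None, None, None, None],
})

# Select the source columns before the new column is added.
seen_cols = [col for col in df.columns if col.endswith("_Seen")]

# Back-fill along each row so the leftmost column ends up holding the
# first non-null value found in that row, then keep only that column.
df["Q_Seen"] = df[seen_cols].bfill(axis=1).iloc[:, 0]
```

Because the result keeps the original row index, the assignment aligns by label rather than by position, and a row with no value at all simply stays missing.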