Python Pandas - Group rows by a column and replace NaN with non-null values


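I am trying to do some string aggregation on my dataframe, based on a target "groupby" column.

Suppose I have the following dataframe with 4 columns:

I want to group all rows by the column "Col1" and, wherever a group contains NaN, use the value that is not null.

The desired output looks like this:

I also tried a plain groupby: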

import pandas as pd
from tabulate import tabulate

df = pd.DataFrame({'Col1': ['A', 'B', 'A'],
                   'Col2': ['X', 'Z', 'X'],
                   'Col3': ['Y', 'D', ''],
                   'Col4': ['', 'E', 'V'],})

print(tabulate(df, headers='keys', tablefmt='psql'))
df2 = df.groupby(['Col1'])
print(tabulate(df2, headers='keys', tablefmt='psql'))

But it does not group the NaN values.

How can I do this?

Thanks

If possible, simply ask for the first non-missing value in each group.
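As a minimal sketch of that idea, assuming the empty strings in the question's dataframe stand for missing values: convert them to NaN, then let `GroupBy.first()` pick the first non-null entry of every column within each group. The variable name `result` is only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as in the question; '' marks a missing value (assumption).
df = pd.DataFrame({'Col1': ['A', 'B', 'A'],
                   'Col2': ['X', 'Z', 'X'],
                   'Col3': ['Y', 'D', ''],
                   'Col4': ['', 'E', 'V']})

# Turn empty strings into NaN so that first() can skip them,
# then take the first non-null value per column in each Col1 group.
result = (df.replace('', np.nan)
            .groupby('Col1', as_index=False)
            .first())

print(result)
```

With `as_index=False`, `Col1` stays a regular column, so the result has one row per group and the same four columns as the input.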

Just use df.replace() on the dataframe you have already created to replace the empty strings with np.nan:

import numpy as np
df = df.replace('', np.nan)
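To see why the replacement matters, the sketch below compares the two cases on the question's data. It assumes the blanks are empty strings, which pandas treats as ordinary values rather than as missing ones; the names `without_replace` and `with_replace` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': ['A', 'B', 'A'],
                   'Col2': ['X', 'Z', 'X'],
                   'Col3': ['Y', 'D', ''],
                   'Col4': ['', 'E', 'V']})

# '' is a valid string, so first() happily returns it for group A / Col4.
without_replace = df.groupby('Col1').first()

# After the replacement, NaN is skipped and the real value is found.
with_replace = df.replace('', np.nan).groupby('Col1').first()

print(repr(without_replace.loc['A', 'Col4']))  # empty string
print(repr(with_replace.loc['A', 'Col4']))     # 'V'
```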

Use groupby with first(). Another, less elegant approach is:
df.replace('', np.nan) \
.groupby('Col1', as_index=False) \
.fillna(method='bfill') \
.groupby('Col1') \
.nth(0)

Output:
Col1  Col2  Col3  Col4
A     X     Y     V
B     Z     D     E


You can even use head() instead of nth().
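A hedged sketch of the head() variant, written for recent pandas versions where `fillna(method=...)` is deprecated: it back-fills inside each group with `GroupBy.bfill()` and then keeps the first row of every group with `head(1)`. The helper names `cleaned` and `value_cols` are assumptions introduced for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': ['A', 'B', 'A'],
                   'Col2': ['X', 'Z', 'X'],
                   'Col3': ['Y', 'D', ''],
                   'Col4': ['', 'E', 'V']})

cleaned = df.replace('', np.nan)
value_cols = ['Col2', 'Col3', 'Col4']

# Back-fill within each Col1 group, so the first row of a group
# receives the first non-missing value of every column.
cleaned[value_cols] = cleaned.groupby('Col1')[value_cols].bfill()

# Keep only the first row of each group and renumber the index.
result = cleaned.groupby('Col1').head(1).reset_index(drop=True)

print(result)
```

Unlike `first()`, `head(1)` returns rows in their original order rather than sorted by group key, which makes no difference for this small example.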
Output:
Col1  Col2  Col3  Col4
A     X     Y     V
B     Z     D     E


What happened to the "E" in the input for "Col4"?

Sorry, that was a mistake; it should show "E" for "Col4".