Python pandas: fill NaN values row by row by group ID
I am trying to fill NaN values row by row based on a group ID.

I have tried fillna with both the forward-fill and backward-fill options, but fillna does not fill the DataFrame row by row. I also want to make sure the company matches before a NaN value is filled. In this case, a plain forward fill would fill company "Pear" with data from company "Banana".
append = append.sort_values(by=['Company', 'Intro'], na_position='last')
append = append.reset_index(drop=True)
for i in append.index:
    if i == 0:
        pass
    else:
        if append.at[i, 'Company'] == append.at[i-1, 'Company']:
            append.fillna(method='ffill', inplace=True)
        else:
            pass
The append DataFrame:
Company  Intro  Categories            Headquarters  Founded Date  Funding Stage
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Apple    NaN    NaN                   NaN           NaN           NaN
Apple    NaN    NaN                   NaN           NaN           NaN
Banana   Lier   Government            Europe        2010          Series B
Pear     NaN    NaN                   NaN           NaN           NaN
This is the expected result I am hoping to achieve:
Expected Result
Company  Intro  Categories            Headquarters  Founded Date  Funding Stage
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Banana   Lier   Government            Europe        2010          Series B
Pear     NaN    NaN                   NaN           NaN           NaN
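The cross-company leak described in the question can be reproduced with a small sketch. The frame below is a cut-down, hypothetical version of the sample data (only two columns), and it uses DataFrame.ffill(), which is the equivalent of fillna(method='ffill'); the method= argument of fillna is deprecated in recent pandas releases. Note that the loop in the question calls fillna on the whole frame, so it behaves like this plain forward fill no matter which row triggers it.

```python
import numpy as np
import pandas as pd

# Hypothetical, reduced version of the sample data (two columns only).
df = pd.DataFrame({
    "Company": ["Apple", "Apple", "Banana", "Pear"],
    "Intro": ["xyz", np.nan, "Lier", np.nan],
})

# A plain forward fill ignores the Company column entirely:
# every NaN takes the last non-null value above it.
plain = df.ffill()

# Row 1 (Apple) is filled correctly from row 0 (Apple),
# but row 3 (Pear) is filled from row 2 (Banana).
print(plain)
```

Here the Pear row ends up with Banana's Intro value, which is exactly the behaviour the question wants to avoid.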
Is NaaN just a typo for NaN, or something else? oO
@meissner_ Sorry, it was a typo for NaN.
Use groupby together with ffill:
df.groupby(['Company']).ffill()
   Company  Intro  Categories            Headquarters  Founded Date  Funding Stage
0  Apple    xyz    Healthcare, Big Data  New York      2018.0        Series A
1  Apple    xyz    Healthcare, Big Data  New York      2018.0        Series A
2  Apple    xyz    Healthcare, Big Data  New York      2018.0        Series A
3  Banana   Lier   Government            Europe        2010.0        Series B
4  Pear     NaN    NaN                   NaN           NaN           NaN
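A self-contained sketch of the grouped forward fill, for anyone who wants to run it. The DataFrame is rebuilt by hand from the question's sample, and the names value_cols and filled are made up for this example. Depending on the pandas version, the result of groupby(...).ffill() may not include the grouping column, so this sketch fills only the value columns and writes them back, which keeps Company in place either way.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    "Company": ["Apple", "Apple", "Apple", "Banana", "Pear"],
    "Intro": ["xyz", np.nan, np.nan, "Lier", np.nan],
    "Categories": ["Healthcare, Big Data", np.nan, np.nan, "Government", np.nan],
    "Headquarters": ["New York", np.nan, np.nan, "Europe", np.nan],
    "Founded Date": [2018, np.nan, np.nan, 2010, np.nan],
    "Funding Stage": ["Series A", np.nan, np.nan, "Series B", np.nan],
})

# Every column except the grouping key (value_cols is an illustrative name).
value_cols = [c for c in df.columns if c != "Company"]

# Forward fill within each company only; the result keeps the original index,
# so it can be assigned straight back onto a copy of the frame.
filled = df.copy()
filled[value_cols] = df.groupby("Company")[value_cols].ffill()

print(filled)
```

Because the fill never crosses a group boundary, the Pear row stays NaN: there is no earlier Pear row to copy from.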
import pandas as pd
from io import StringIO

# Sample data: columns are separated by two or more spaces, so values that
# contain a single space (e.g. "Healthcare, Big Data") stay in one field.
df = pd.read_csv(StringIO("""\
Company  Intro  Categories            Headquarters  Founded_Date  Funding_Stage
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Apple    NaN    NaN                   NaN           NaN           NaN
Apple    NaN    NaN                   NaN           NaN           NaN
Banana   Lier   Government            Europe        2010          Series B
Pear     NaN    NaN                   NaN           NaN           NaN
"""), sep=r"\s{2,}", engine="python")

# Create the summary level: one row per company.
# Assumes the populated row comes first within each company.
df_summary = df.groupby("Company").head(1)

# Join the summary back onto every row of the same company.
df_result = df[['Company']].merge(df_summary, on="Company")

#   Company  Intro  Categories            Headquarters  Founded_Date  Funding_Stage
#0  Apple    xyz    Healthcare, Big Data  New York      2018.0        Series A
#1  Apple    xyz    Healthcare, Big Data  New York      2018.0        Series A
#2  Apple    xyz    Healthcare, Big Data  New York      2018.0        Series A
#3  Banana   Lier   Government            Europe        2010.0        Series B
#4  Pear     NaN    NaN                   NaN           NaN           NaN
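The head(1) and merge approach depends on the populated row being first in each group. The sketch below uses hypothetical data where that is not the case, and shows that sorting with na_position='last' (as the question already does) restores the assumption. All variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the populated Apple row comes second, not first.
df = pd.DataFrame({
    "Company": ["Apple", "Apple", "Apple"],
    "Intro": [np.nan, "xyz", np.nan],
})

# head(1) takes the first row per company, which is empty here,
# so the merge copies NaN onto every Apple row.
summary = df.groupby("Company").head(1)
merged = df[["Company"]].merge(summary, on="Company")

# Sorting so that NaN rows go last puts the populated row first again.
ordered = (
    df.sort_values(["Company", "Intro"], na_position="last")
      .reset_index(drop=True)
)
summary_sorted = ordered.groupby("Company").head(1)
merged_sorted = ordered[["Company"]].merge(summary_sorted, on="Company")

print(merged)
print(merged_sorted)
```

Without the sort, every Apple row stays NaN; with it, every Apple row receives "xyz". Note that sorting on a single column only guarantees that this column is populated in the first row, so a group whose values are spread over several rows would still need a different approach.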