Python: fill NaN if a previous record is found in a combined DataFrame

Suppose I have the following DataFrame:

  Country   Client_id     Order_date
0   USA          aaa1       1/1/2020
1   CA           bbb2       2/2/2020
2   JP           ccc3       2/2/2020
3   USA          aaa1       3/10/2020
4   NaN          aaa1       1/9/2020
5   NaN          bbb2       20/5/2021
6   NaN          ccc3       20/5/2021
7   NaN          ccc3       20/5/2021
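The Country column contains many NaN values, but every Client_id already has a country assigned in one of the earlier rows. So for each NaN row I need to match its Client_id against the previous rows and, if a match is found, replace the NaN with the correct country.

Expected output: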

  Country   Client_id     Order_date
0   USA          aaa1       1/1/2020
1   CA           bbb2       2/2/2020
2   JP           ccc3       2/2/2020
3   USA          aaa1       3/10/2020
4   USA          aaa1       1/9/2020
5   CA           bbb2       20/5/2021
6   JP           ccc3       20/5/2021
7   JP           ccc3       20/5/2021
What I have done so far is sort by Client_id so that the rows are arranged together, and then fill in the country:

df.sort_values(df['Client_id']).groupby('Country').ffill()
But this does not work for me.
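The attempt above probably fails for two reasons: `sort_values` expects column labels rather than a Series, and `groupby('Country')` groups on the very column that is missing, so the NaN rows are not filled. A minimal sketch of a grouped forward fill, assuming the rows are already in the order in which "previous" should be interpreted (the DataFrame construction is reconstructed from the table in the question):

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Country": ["USA", "CA", "JP", "USA", np.nan, np.nan, np.nan, np.nan],
    "Client_id": ["aaa1", "bbb2", "ccc3", "aaa1", "aaa1", "bbb2", "ccc3", "ccc3"],
    "Order_date": ["1/1/2020", "2/2/2020", "2/2/2020", "3/10/2020",
                   "1/9/2020", "20/5/2021", "20/5/2021", "20/5/2021"],
})

# Group by the client, not by the column being filled, and forward-fill
# Country inside each group. The result keeps the original row index,
# so no sorting is needed before assigning it back.
df["Country"] = df.groupby("Client_id")["Country"].ffill()

print(df)
```

A forward fill only looks at earlier rows of the same client, so a client whose first row is NaN would stay NaN for that row.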

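Let's try filling each NaN with the first non-null Country of the same Client_id. The result is assigned back to the column, because calling fillna with inplace=True on df.Country may not update df in recent pandas versions:

df['Country'] = df['Country'].fillna(df.groupby('Client_id')['Country'].transform('first'))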
df
  Country Client_id Order_date
0     USA      aaa1   1/1/2020
1      CA      bbb2   2/2/2020
2      JP      ccc3   2/2/2020
3     USA      aaa1  3/10/2020
4     USA      aaa1   1/9/2020
5      CA      bbb2  20/5/2021
6      JP      ccc3  20/5/2021
7      JP      ccc3  20/5/2021
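For readers who want to run the answer end to end, here is a self-contained sketch. The DataFrame construction is reconstructed from the question's table, and the intermediate name `first_known` is only illustrative:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Country": ["USA", "CA", "JP", "USA", np.nan, np.nan, np.nan, np.nan],
    "Client_id": ["aaa1", "bbb2", "ccc3", "aaa1", "aaa1", "bbb2", "ccc3", "ccc3"],
    "Order_date": ["1/1/2020", "2/2/2020", "2/2/2020", "3/10/2020",
                   "1/9/2020", "20/5/2021", "20/5/2021", "20/5/2021"],
})

# For every row, look up the first non-null Country of its Client_id.
# transform("first") returns a Series aligned with df's index.
first_known = df.groupby("Client_id")["Country"].transform("first")

# Fill only the missing entries; existing values are left untouched.
df["Country"] = df["Country"].fillna(first_known)

print(df)
```

Note that `transform("first")` uses the first non-null country anywhere in the client's group, so it also fills a NaN that appears before the row carrying the country. If only strictly earlier rows should count, a grouped forward fill is the closer match.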