Python: fill a column with a date value until another date value is reached, then continue filling with the newly reached value
I have the following DataFrame:
Date Team 1 Team 2 Score1 Score2
0 1-Oct-17 1 NaN 2 NaN
1 21:20 Chicago Cubs Cincinnati Reds 1 3.0
2 21:15 Kansas City Royals Arizona Diamondbacks 2 14.0
3 21:15 St.Louis Cardinals Milwaukee Brewers 1 6.0
4 30-Sep-17 1 NaN 2 NaN
5 22:15 St.Louis Cardinals Milwaukee Brewers 7 6.0
6 22:05 Chicago Cubs Cincinnati Reds 9 0.0
7 22:05 San Francisco Giants San Diego Padres 2 3.0
8 19:05 Boston Red Sox Houston Astros 6 3.0
9 29-Sep-17 1 NaN 2 NaN
10 20:20 Chicago Cubs Cincinnati Reds 5 4.0
11 19:05 New York Yankees Toronto Blue Jays 4 0.0
12 2:15 Kansas City Royals Detroit Tigers 1 4.0
13 2:10 Chicago White Sox Los Angeles Angels 5 4.0
I need to fill the date values down and replace the time values with them, to get this result:
Date Team 1 Team 2 Score1 Score2
0 1-Oct-17 1 NaN 2 NaN
1 1-Oct-17 Chicago Cubs Cincinnati Reds 1 3.0
2 1-Oct-17 Kansas City Royals Arizona Diamondbacks 2 14.0
3 1-Oct-17 St.Louis Cardinals Milwaukee Brewers 1 6.0
4 30-Sep-17 1 NaN 2 NaN
5 30-Sep-17 St.Louis Cardinals Milwaukee Brewers 7 6.0
6 30-Sep-17 Chicago Cubs Cincinnati Reds 9 0.0
7 30-Sep-17 San Francisco Giants San Diego Padres 2 3.0
8 30-Sep-17 Boston Red Sox Houston Astros 6 3.0
9 29-Sep-17 1 NaN 2 NaN
10 29-Sep-17 Chicago Cubs Cincinnati Reds 5 4.0
11 29-Sep-17 New York Yankees Toronto Blue Jays 4 0.0
12 29-Sep-17 Kansas City Royals Detroit Tigers 1 4.0
13 29-Sep-17 Chicago White Sox Los Angeles Angels 5 4.0
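You can check the length of the values in the Date column: values longer than 7 characters are dates, so keep them, replace the shorter values (the times) with NaN using where, and finally forward-fill the missing values with ffill.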
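The code for this first approach did not survive on this page, so the following is only a minimal sketch of what the description above suggests. The small sample frame and the `is_date` name are assumptions made for illustration; the threshold of 7 comes from the answer text.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's data (shortened).
df = pd.DataFrame({
    "Date": ["1-Oct-17", "21:20", "21:15", "30-Sep-17", "22:15", "19:05"],
    "Team 1": ["1", "Chicago Cubs", "Kansas City Royals",
               "1", "St.Louis Cardinals", "Boston Red Sox"],
})

# Dates such as "1-Oct-17" have at least 8 characters,
# while times such as "21:20" have at most 5.
is_date = df["Date"].str.len() > 7

# where() keeps values where the mask is True and puts NaN elsewhere;
# ffill() then copies the last seen date down over the NaN rows.
df["Date"] = df["Date"].where(is_date).ffill()

print(df)
```

With this approach the Date column stays a column of strings; convert it with `pd.to_datetime` afterwards if real datetimes are needed.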
Another idea is to convert the values to datetimes and compare the time component with 0:00:
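from datetime import time
import pandas as pd
# Date-only strings parse to midnight; time-only strings keep a non-midnight time.
# (On pandas 2.x this mixed column may need format='mixed'.)
df['Date'] = pd.to_datetime(df['Date'])
# Keep only the midnight values (the real dates), then forward-fill them.
df['Date'] = df['Date'].where(df['Date'].dt.time == time(0, 0)).ffill()
print(df)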
Date Team 1 Team 2 Score1 Score2
0 2017-10-01 1 NaN 2 NaN
1 2017-10-01 Chicago Cubs Cincinnati Reds 1 3.0
2 2017-10-01 Kansas City Royals Arizona Diamondbacks 2 14.0
3 2017-10-01 St.Louis Cardinals Milwaukee Brewers 1 6.0
4 2017-09-30 1 NaN 2 NaN
5 2017-09-30 St.Louis Cardinals Milwaukee Brewers 7 6.0
6 2017-09-30 Chicago Cubs Cincinnati Reds 9 0.0
7 2017-09-30 San Francisco Giants San Diego Padres 2 3.0
8 2017-09-30 Boston Red Sox Houston Astros 6 3.0
9 2017-09-29 1 NaN 2 NaN
10 2017-09-29 Chicago Cubs Cincinnati Reds 5 4.0
11 2017-09-29 New York Yankees Toronto Blue Jays 4 0.0
12 2017-09-29 Kansas City Royals Detroit Tigers 1 4.0
13 2017-09-29 Chicago White Sox Los Angeles Angels 5 4.0
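The midnight comparison above would treat a game that starts at exactly 0:00 as a date row. A possible variant, which is not part of the original answer, is to parse with an explicit format so that only real dates survive. This sketch assumes the dates always look like `1-Oct-17` (`%d-%b-%y`, English month abbreviations); the sample frame is hypothetical.

```python
import pandas as pd

# Hypothetical sample, including a game at 0:00 to show the edge case.
df = pd.DataFrame({"Date": ["1-Oct-17", "21:20", "30-Sep-17", "0:00", "19:05"]})

# Strings that do not match the date format become NaT instead of raising.
parsed = pd.to_datetime(df["Date"], format="%d-%b-%y", errors="coerce")

# Forward-fill the last valid date over the NaT rows.
df["Date"] = parsed.ffill()

print(df)
```

Because the time strings never match the format, they all become NaT regardless of their value, and the result is a proper datetime column.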