Python pandas: fill in missing dates in a pandas DataFrame
How can I fill the "Date" column so that, whenever a date is detected, that date is added to the rows below it until a new date appears, at which point the new date is used instead?
Reproducible example:
Input:
Date Headline
0 Mar-20-21 04:03AM Apple CEO Cook, executives on tentative list o...
1 03:43AM Apple CEO Cook, execs on tentative list of wit...
2 Mar-19-21 10:19PM Dow Jones Futures: Why This Market Rally Is So...
3 06:13PM Zuckerberg: Apples Privacy Move Could Spur Mor...
4 05:45PM Apple (AAPL) Dips More Than Broader Markets: W...
5 04:17PM Facebook Stock Jumps As Zuckerberg Changes Tun...
6 04:03PM Best Dow Jones Stocks To Buy And Watch In Marc...
7 01:02PM The Nasdaq's on the Rise Friday, and These 2 S...
Expected output:
Date Headline
0 Mar-20-21 04:03AM Apple CEO Cook, executives on tentative list o...
1 Mar-20-21 03:43AM Apple CEO Cook, execs on tentative list of wit...
2 Mar-19-21 10:19PM Dow Jones Futures: Why This Market Rally Is So...
3 Mar-19-21 06:13PM Zuckerberg: Apples Privacy Move Could Spur Mor...
4 Mar-19-21 05:45PM Apple (AAPL) Dips More Than Broader Markets: W...
5 Mar-19-21 04:17PM Facebook Stock Jumps As Zuckerberg Changes Tun...
6 Mar-19-21 04:03PM Best Dow Jones Stocks To Buy And Watch In Marc...
7 Mar-19-21 01:02PM The Nasdaq's on the Rise Friday, and These 2 S...
Attempt:
df['Time'] = [x[-7:] for x in df['Date']]
df['Date'] = [x[:-7] for x in df['Date']]
# Some code that fills the date
# Then convert to datetime
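Before using ffill(), you need to split the column in two so that each row keeps its correct time and only the date part gets filled. For ffill() to work, the blank date entries must first be replaced with np.nan. Then put the two columns back together and wrap that operation in pd.to_datetime to get the correct dtype. Finally, you can drop the time column.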
# Imports
import numpy as np
import pandas as pd
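# Split the column into date and time parts (this assumes rows without a
# date start with a space, so their date part comes out as an empty string)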
df[['Date','Time']] = df['Date'].str.split(' ',expand=True)
# Replace space with nan and use ffill()
df['Date'] = df['Date'].replace(r'^\s*$', np.nan, regex=True).ffill()
# Put the columns back and convert to datetime
df['Date'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])
# Drop the time column
del df['Time']
You will get back:
df
Date Headline
0 2021-03-20 04:03:00 Apple CEO Cook, executives on tentative list o...
1 2021-03-20 03:43:00 Apple CEO Cook, execs on tentative list of wit...
2 2021-03-19 22:19:00 Dow Jones Futures: Why This Market Rally Is So...
3 2021-03-19 18:13:00 Zuckerberg: Apples Privacy Move Could Spur Mor...
4 2021-03-19 17:45:00 Apple (AAPL) Dips More Than Broader Markets: W...
5 2021-03-19 16:17:00 Facebook Stock Jumps As Zuckerberg Changes Tun...
6 2021-03-19 16:03:00 Best Dow Jones Stocks To Buy And Watch In Marc...
7 2021-03-19 13:02:00 The Nasdaq's on the Rise Friday, and These 2 S...
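For reference, the steps above can be run end to end on a small sample. This is a minimal sketch, not the exact data from the question: the four-row frame is hypothetical, rows without a date are assumed to contain a leading space before the time, and the explicit `format` string is an addition so that parsing does not rely on format inference.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mimicking the question's input; rows without a date
# are assumed to hold a leading space followed by the time.
df = pd.DataFrame({
    "Date": ["Mar-20-21 04:03AM", " 03:43AM", "Mar-19-21 10:19PM", " 06:13PM"],
    "Headline": ["a", "b", "c", "d"],
})

# Split on the first space only: column 0 is the date part, column 1 the time
parts = df["Date"].str.split(" ", n=1, expand=True)
df["Time"] = parts[1]

# Blank date parts become NaN so that ffill() can carry the last date forward
df["Date"] = parts[0].replace(r"^\s*$", np.nan, regex=True).ffill()

# Recombine and parse; the format string is spelled out explicitly
df["Date"] = pd.to_datetime(df["Date"] + " " + df["Time"],
                            format="%b-%d-%y %I:%M%p")

# Drop the helper column
df = df.drop(columns="Time")
print(df)
```

Passing `n=1` keeps everything after the first space in the time part, which avoids extra columns if a value happens to contain more than one space.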
Edit
If you want your "Date" column to match your expected output exactly, i.e. in the "Mar-20-21" format, don't wrap it in pd.to_datetime() and keep it as an object column instead:
df['Date'] = df['Date'] + ' ' + df['Time']
df
Date Headline
0 Mar-20-21 04:03AM Apple CEO Cook, executives on tentative list o...
1 Mar-20-21 03:43AM Apple CEO Cook, execs on tentative list of wit...
2 Mar-19-21 10:19PM Dow Jones Futures: Why This Market Rally Is So...
3 Mar-19-21 06:13PM Zuckerberg: Apples Privacy Move Could Spur Mor...
4 Mar-19-21 05:45PM Apple (AAPL) Dips More Than Broader Markets: W...
5 Mar-19-21 04:17PM Facebook Stock Jumps As Zuckerberg Changes Tun...
6 Mar-19-21 04:03PM Best Dow Jones Stocks To Buy And Watch In Marc...
7 Mar-19-21 01:02PM The Nasdaq's on the Rise Friday, and These 2 S...
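If the column has already been converted to datetime, the same display format can also be produced afterwards with `Series.dt.strftime`. The snippet below is a small sketch on a hypothetical two-value Series; the format codes assume an English locale for the month abbreviation and the AM/PM marker.

```python
import pandas as pd

# Hypothetical datetime column, as produced by pd.to_datetime above
s = pd.Series(pd.to_datetime(["2021-03-20 04:03:00", "2021-03-19 22:19:00"]))

# %b = abbreviated month, %d = day, %y = two-digit year,
# %I = 12-hour clock, %M = minutes, %p = AM/PM
formatted = s.dt.strftime("%b-%d-%y %I:%M%p")
print(formatted.tolist())
```

The result is a column of strings, so it is meant for display; keep the datetime version if you still need to sort or do date arithmetic.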
The solution works without the following line: df[['Date','Time']] = df['Date'].str.split(' ',expand=True). That line gives me NaT.
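The comment does not include the data, so the cause of the NaT can only be guessed at. One possibility is that the undated rows hold just the time with no leading space (or that the time had already been split off), in which case splitting on a space puts the time in the date part and leaves the time part empty. The sketch below sidesteps that by using `Series.str.extract` with an optional date group; the sample frame and the regular expression are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical input: one undated row has a leading space, the other does not
df = pd.DataFrame({
    "Date": ["Mar-20-21 04:03AM", " 03:43AM", "Mar-19-21 10:19PM", "06:13PM"],
    "Headline": ["a", "b", "c", "d"],
})

# Capture an optional date and a mandatory time; a missing date becomes NaN
pattern = r"^\s*(?P<date>\w{3}-\d{2}-\d{2})?\s*(?P<time>\d{2}:\d{2}[AP]M)\s*$"
parts = df["Date"].str.extract(pattern)

# Forward-fill the date, then rebuild and parse the full timestamp
full = parts["date"].ffill() + " " + parts["time"]
df["Date"] = pd.to_datetime(full, format="%b-%d-%y %I:%M%p")
print(df)
```

Because the date group is optional, no separate step to replace blanks with NaN is needed before `ffill()`.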