Extract a substring and convert it to datetime in Python
I have a DataFrame, df:
import pandas as pd

data = [{0: 18, 1: '(Responses) 17th Nov 20'},
        {0: 304, 1: '(Responses) 17th Nov 20'},
        {0: 1177, 1: '(Responses) 17th Nov 20'},
        {0: 899, 1: '(Responses) 17th Nov 20'}]
df = pd.DataFrame(data)
0 1
18 (Responses) 17th Nov 20
304 (Responses) 17th Nov 20
1177 (Responses) 17th Nov 20
899 (Responses) 17th Nov 20
Is there an efficient way to extract the date 17th Nov 20 as 17-11-2020 and add it to a new column [2] as a date? For other dates, the day can also be written as 1st, 2nd or 3rd.
Expected output:
0 1 2
18 (Responses) 17th Nov 20 17-11-2020
304 (Responses) 17th Nov 20 17-11-2020
1177 (Responses) 17th Nov 20 17-11-2020
899 (Responses) 17th Nov 20 17-11-2020
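Simply split the string using "(Responses)" as the delimiter, then take the second element of the result. The parentheses are escaped because a multi-character pattern is treated as a regular expression:
df['new_column'] = df[1].str.split(r'\(Responses\)\s*').str[1]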
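As a minimal sketch of the split approach, assuming the prefix is always the literal text "(Responses)" and that the strings live in column 1 (the column name date_text and the second sample row are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({0: [18, 304],
                   1: ['(Responses) 17th Nov 20', '(Responses) 1st Dec 20']})

# Parentheses are regex metacharacters, so they are escaped; \s* also
# consumes the space that follows the marker.
parts = df[1].str.split(r'\(Responses\)\s*')

# Element 0 is whatever precedes the marker (empty here),
# element 1 is the date text that follows it.
df['date_text'] = parts.str[1]
print(df['date_text'].tolist())
```

The result is still plain text; it needs a separate conversion step before it behaves like a date.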
Try using str.replace and pd.to_datetime:
df[2] = pd.to_datetime(df[1].str.replace(r'\(Responses\) ', '', regex=True))
print(df)
Output:
0 1 2
0 18 (Responses) 17th Nov 20 2020-11-17
1 304 (Responses) 17th Nov 20 2020-11-17
2 1177 (Responses) 17th Nov 20 2020-11-17
3 899 (Responses) 17th Nov 20 2020-11-17
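The output above shows the datetime values as 2020-11-17, while the question asks for 17-11-2020. A possible sketch, assuming English abbreviated month names and two-digit years, strips the ordinal suffix, parses with an explicit format, and then renders a DD-MM-YYYY string. The column name formatted and the extra sample rows are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({0: [18, 304, 1177],
                   1: ['(Responses) 17th Nov 20',
                       '(Responses) 1st Dec 20',
                       '(Responses) 22nd Jan 21']})

# Drop the "(Responses) " prefix, then the ordinal suffix after the day number.
text = (df[1]
        .str.replace(r'^\(Responses\)\s*', '', regex=True)
        .str.replace(r'(?<=\d)(?:st|nd|rd|th)\b', '', regex=True))

# An explicit format avoids relying on the parser guessing the field order.
df[2] = pd.to_datetime(text, format='%d %b %y')

# datetime64 values display as YYYY-MM-DD; a DD-MM-YYYY rendering is a string.
df['formatted'] = df[2].dt.strftime('%d-%m-%Y')
print(df)
```

Keeping column 2 as a real datetime and formatting only for display preserves sorting and date arithmetic.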
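df[1].str.split(n=1, expand=True)?
If we could see how the column was created, it might be possible to offer a better solution that avoids this awkward, uncleaned data.
@LainTaljuk n=1 makes it split only once.
@cs95 I have updated my question.
Hi, please see my updated question; I am getting NaN values instead.
@user6308605 I edited my answer, please check again.
Why am I getting NaN? The actual string is a copy of BeAChampion(Responses), so it should be str[4], right? But I am still getting NaN.
@user6308605 I have made another edit to my answer, check it out.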
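The NaN values mentioned in the comments are what positional indexing returns when a string splits into fewer pieces than expected. One way to avoid depending on the prefix at all is to capture the date at the end of the string. This is only a sketch: it assumes the date is always the final part of the string in day, abbreviated month, two-digit year form, and the sample strings are made up.

```python
import pandas as pd

s = pd.Series(['(Responses) 17th Nov 20',
               'Copy of BeAChampion(Responses) 2nd Nov 20',
               'no date here'])

# Capture day, month abbreviation and two-digit year at the end of the string,
# skipping the ordinal suffix; rows that do not match give NaN in every group.
pattern = r'(\d{1,2})(?:st|nd|rd|th)\s+([A-Za-z]{3})\s+(\d{2})\s*$'
parts = s.str.extract(pattern)

# Rejoin as "17 Nov 20"; a missing piece makes the whole joined value missing.
joined = parts[0].str.cat([parts[1], parts[2]], sep=' ')

# errors='coerce' turns anything unparseable into NaT instead of raising.
dates = pd.to_datetime(joined, format='%d %b %y', errors='coerce')
print(dates.dt.strftime('%d-%m-%Y').tolist())
```

Rows without a recognisable date end up as missing values rather than raising, which makes them easy to find with dates.isna().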