Python: how to update column values, ignoring values that already exist
I have a pandas DataFrame and a list, and I want to update a column of the DataFrame from the list. If a value already exists in the column, that row should be left as it is.
For example, with this code:
my_list = ["2018-11-01 00:00:02", "2018-11-01 00:00:07", "2018-11-01 00:00:12", "2018-11-01 00:00:17", "2018-11-01 00:00:22", "2018-11-01 00:00:27", "2018-11-01 00:00:32", "2018-11-01 00:00:37", "2018-11-01 00:00:42", "2018-11-01 00:00:47"]
df["date_time"] = pd.Series(my_list).astype(str)
When I run the code above, it produces the following output:
If the `date_time` column contains datetimes, create a `DatetimeIndex` from the list and use `reindex`, or create a helper DataFrame and use `merge` with a left join:
df['date_time'] = pd.to_datetime(df['date_time'])
df = pd.DataFrame({'date_time': pd.to_datetime(my_list)}).merge(df, how='left')
print(df)
            date_time  value
0 2018-11-01 00:00:02  100.0
1 2018-11-01 00:00:07    NaN
2 2018-11-01 00:00:12  150.0
3 2018-11-01 00:00:17    NaN
4 2018-11-01 00:00:22   56.0
5 2018-11-01 00:00:27    NaN
6 2018-11-01 00:00:32   95.0
7 2018-11-01 00:00:37    NaN
8 2018-11-01 00:00:42  700.0
9 2018-11-01 00:00:47    NaN
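For reference, the left-join approach can be tried end to end with the sketch below. The starting `df` is an assumption reconstructed from the output above (the original frame is not shown in the question), and `helper` and `result` are illustrative names only.

```python
import pandas as pd

# Assumed starting frame: only every other timestamp already has a value.
df = pd.DataFrame({
    "date_time": ["2018-11-01 00:00:02", "2018-11-01 00:00:12",
                  "2018-11-01 00:00:22", "2018-11-01 00:00:32",
                  "2018-11-01 00:00:42"],
    "value": [100, 150, 56, 95, 700],
})

my_list = ["2018-11-01 00:00:02", "2018-11-01 00:00:07",
           "2018-11-01 00:00:12", "2018-11-01 00:00:17",
           "2018-11-01 00:00:22", "2018-11-01 00:00:27",
           "2018-11-01 00:00:32", "2018-11-01 00:00:37",
           "2018-11-01 00:00:42", "2018-11-01 00:00:47"]

# Both merge keys must be datetimes before merging.
df["date_time"] = pd.to_datetime(df["date_time"])
helper = pd.DataFrame({"date_time": pd.to_datetime(my_list)})

# The left join keeps every timestamp from the list; timestamps that
# already exist in df keep their value, the others get NaN.
result = helper.merge(df, on="date_time", how="left")
print(result)
```

Passing `on="date_time"` explicitly is a small design choice here: it avoids relying on pandas picking the common column names as merge keys.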
If the DataFrame already has a `DatetimeIndex`:
df.index = pd.to_datetime(df.index)
df = df.reindex(pd.to_datetime(my_list).rename('date_time'))
print(df)
                     value
date_time
2018-11-01 00:00:02  100.0
2018-11-01 00:00:07    NaN
2018-11-01 00:00:12  150.0
2018-11-01 00:00:17    NaN
2018-11-01 00:00:22   56.0
2018-11-01 00:00:27    NaN
2018-11-01 00:00:32   95.0
2018-11-01 00:00:37    NaN
2018-11-01 00:00:42  700.0
2018-11-01 00:00:47    NaN
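When I use this code, it raises `float() argument must be a string or a number, not 'NaTType'` and `You are trying to merge on datetime64[ns] and object columns. If you wish to proceed you should use pd.concat`.
@arunkumar - did you forget to use `df['date_time'] = pd.to_datetime(df['date_time'])`? The error means that the column/index or the list has not been converted to datetimes.
Or, when `date_time` is a column, set it as the index, use `reindex`, and then reset the index: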
df['date_time'] = pd.to_datetime(df['date_time'])
df = (df.set_index('date_time')
        .reindex(pd.to_datetime(my_list).rename('date_time'))
        .reset_index())
print(df)
            date_time  value
0 2018-11-01 00:00:02  100.0
1 2018-11-01 00:00:07    NaN
2 2018-11-01 00:00:12  150.0
3 2018-11-01 00:00:17    NaN
4 2018-11-01 00:00:22   56.0
5 2018-11-01 00:00:27    NaN
6 2018-11-01 00:00:32   95.0
7 2018-11-01 00:00:37    NaN
8 2018-11-01 00:00:42  700.0
9 2018-11-01 00:00:47    NaN
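The `set_index` / `reindex` / `reset_index` chain can also be wrapped in a small function. This is only a sketch: `align_to_timestamps` is a hypothetical helper name, the sample data is made up, and it assumes the timestamps in `df` are unique, since `reindex` does not accept duplicate index labels.

```python
import pandas as pd

def align_to_timestamps(df, timestamps, column="date_time"):
    """Hypothetical helper: one row per timestamp, NaN where df has no match."""
    out = df.copy()
    out[column] = pd.to_datetime(out[column])
    # Naming the target index lets reset_index() restore the column name.
    target = pd.to_datetime(timestamps).rename(column)
    # Assumes unique timestamps in df; reindex rejects duplicate labels.
    return out.set_index(column).reindex(target).reset_index()

# Made-up sample data, shortened from the example above.
df = pd.DataFrame({
    "date_time": ["2018-11-01 00:00:02", "2018-11-01 00:00:12"],
    "value": [100, 150],
})
my_list = ["2018-11-01 00:00:02", "2018-11-01 00:00:07",
           "2018-11-01 00:00:12"]

result = align_to_timestamps(df, my_list)
print(result)
```

Working on a copy leaves the caller's `df` unchanged, which makes the helper safe to call repeatedly with different lists.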
Or, if the DataFrame has a `DatetimeIndex`, create the helper DataFrame and use `join`:
df.index = pd.to_datetime(df.index)
df = pd.DataFrame({'date_time': pd.to_datetime(my_list)}).join(df, on='date_time')
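The dtype mismatch discussed in the comments can be checked before merging. The sketch below uses made-up sample data and illustrative variable names (`helper`, `before`, `after`); it only shows that the string column is not a datetime column until `pd.to_datetime` is applied.

```python
import pandas as pd

# Made-up sample data: date_time is still stored as strings here.
df = pd.DataFrame({
    "date_time": ["2018-11-01 00:00:02", "2018-11-01 00:00:12"],
    "value": [100, 150],
})
helper = pd.DataFrame({
    "date_time": pd.to_datetime(["2018-11-01 00:00:02",
                                 "2018-11-01 00:00:07",
                                 "2018-11-01 00:00:12"]),
})

is_datetime = pd.api.types.is_datetime64_any_dtype

# Before conversion the merge keys differ in dtype (strings vs datetimes).
before = is_datetime(df["date_time"])

# Convert the column, then check again.
df["date_time"] = pd.to_datetime(df["date_time"])
after = is_datetime(df["date_time"])
print(before, after)

# With matching datetime keys the left join goes through.
result = helper.merge(df, on="date_time", how="left")
print(result)
```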