How do I extract the values whose date matches the date in a particular column? (Python)
Consider the dataframe built from the dictionary below. Two of its columns are 'datetime' and 'date_at_which_value_is_needed'. I want to create a new column that holds, as a list/series, the values of the 'datetime' column whose date is the same as the value in the 'date_at_which_value_is_needed' column. Is there a way to do this without looping?
{'datetime': {667: Timestamp('2019-11-08 10:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-08 16:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-08 22:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-09 04:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-11 10:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-11 16:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-11 22:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-12 04:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-12 10:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-12 16:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-12 22:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-13 04:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-13 10:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-13 16:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-13 22:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-14 04:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-14 10:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-14 16:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-14 22:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-15 04:00:00+0000', tz='UTC')},
'date_at_which_value_is_needed': {667: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-12 00:00:00+0000', tz='UTC')},
'c': {667: 64.6475,
673: 65.005,
679: 65.0075,
685: 65.0075,
691: 65.0225,
697: 65.5875,
703: 65.6,
709: 65.5625,
715: 65.355,
721: 65.475,
727: 65.425,
733: 65.0375,
739: 65.9017,
745: 66.1875,
751: 66.15,
757: 66.075,
763: 65.695,
769: 65.625,
775: 65.66,
780: 65.9525}}
For example, for the last row (index 780), the new column would contain the following list:
[Timestamp('2019-11-12 04:00:00+0000', tz='UTC'), Timestamp('2019-11-12 10:00:00+0000', tz='UTC'), Timestamp('2019-11-12 16:00:00+0000', tz='UTC'), Timestamp('2019-11-12 22:00:00+0000', tz='UTC')]
Try this:
import pandas as pd
from pandas import Timestamp
data = {'datetime': {667: Timestamp('2019-11-08 10:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-08 16:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-08 22:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-09 04:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-11 10:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-11 16:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-11 22:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-12 04:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-12 10:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-12 16:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-12 22:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-13 04:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-13 10:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-13 16:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-13 22:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-14 04:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-14 10:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-14 16:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-14 22:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-15 04:00:00+0000', tz='UTC')},
'date_at_which_value_is_needed': {667: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
673: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
679: Timestamp('2019-11-05 00:00:00+0000', tz='UTC'),
685: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
691: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
697: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
703: Timestamp('2019-11-06 00:00:00+0000', tz='UTC'),
709: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
715: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
721: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
727: Timestamp('2019-11-07 00:00:00+0000', tz='UTC'),
733: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
739: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
745: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
751: Timestamp('2019-11-08 00:00:00+0000', tz='UTC'),
757: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
763: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
769: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
775: Timestamp('2019-11-11 00:00:00+0000', tz='UTC'),
780: Timestamp('2019-11-12 00:00:00+0000', tz='UTC')},
'c': {667: 64.6475,
673: 65.005,
679: 65.0075,
685: 65.0075,
691: 65.0225,
697: 65.5875,
703: 65.6,
709: 65.5625,
715: 65.355,
721: 65.475,
727: 65.425,
733: 65.0375,
739: 65.9017,
745: 66.1875,
751: 66.15,
757: 66.075,
763: 65.695,
769: 65.625,
775: 65.66,
780: 65.9525}}
# Converting the dictionaries into a dataframe
datesDf = pd.DataFrame.from_dict(data)
# Selecting the date part of the datetime column
datesDf['date'] = datesDf['datetime'].dt.date
datesDf['date_needed'] = datesDf['date_at_which_value_is_needed'].dt.date
# Creating a new dataframe that collects, for each date, the list of datetimes falling on it
datesGrouped = datesDf.groupby('date')['datetime'].apply(list).to_frame()
# Joining original dataframe with new one after the grouping
result = datesDf.merge(datesGrouped, how='left', left_on='date_needed', right_on='date')
# Formatting the result: drop the helper columns and rename the merge suffixes
result = result.drop(['date', 'date_needed'], axis=1).rename(columns={"datetime_x": "datetime", "datetime_y": "datetime_col"})
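As a possible variation on the same idea, the per-date lists can be looked up with `Series.map` instead of a merge. Because `map` returns a Series with the same index as the Series it is called on, the original row labels (667, 673, ...) are kept by construction. This is only a minimal sketch on a small made-up frame: the four rows, the name `by_day` and the output column name `datetime_col` are assumptions chosen for illustration, not part of the question's data.

```python
import pandas as pd

# Small hypothetical frame with the same two columns as in the question
df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2019-11-11 10:00", "2019-11-11 16:00",
             "2019-11-12 04:00", "2019-11-12 10:00"],
            utc=True,
        ),
        "date_at_which_value_is_needed": pd.to_datetime(
            ["2019-11-05", "2019-11-11", "2019-11-11", "2019-11-12"],
            utc=True,
        ),
    },
    index=[691, 697, 709, 715],
)

# For each calendar date, collect the datetimes that fall on that date
by_day = df["datetime"].groupby(df["datetime"].dt.date).apply(list)

# Look up each row's needed date in by_day; dates with no match become NaN
df["datetime_col"] = df["date_at_which_value_is_needed"].dt.date.map(by_day)

print(df)
```

Rows whose needed date has no matching datetimes end up with NaN rather than an empty list, so that case may need separate handling depending on what the rest of the code expects.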
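Does this answer your question?
Can you give an example for a row or two? It's not quite clear what you expect to see in the new column.
@IgnacioAlorre For example, in the new column the last row should be the list/series: [Timestamp('2019-11-12 04:00:00+0000', tz='UTC'), Timestamp('2019-11-12 10:00:00+0000', tz='UTC'), Timestamp('2019-11-12 16:00:00+0000', tz='UTC'), Timestamp('2019-11-12 22:00:00+0000', tz='UTC')]
@sushanth Can you tell me which command I should use? That question has multiple answers; I tried one of them, but the result was unexpected.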