Python pandas: time series with different time frames


python pandas


I have two time-series DataFrames: one with daily values (df1 below) and one with annual values (df2 below). For example:

df1                                  df2
Date           Value                 Year   Value
2002-01-01      3                    2002    0.5
2002-01-02      3.5                  2003    3.1
2002-01-03      3.3                  2004    2.7
...             ...                  ...     ...
2010-01-01      4.96                 2010    0.7
2010-01-02      4.98

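I would like to do the following: whenever a daily date falls in the same year as an annual entry, multiply the daily value by that year's annual value.

For example, every daily value in 2002 would be multiplied by the scalar 0.5, every daily value in 2003 by the scalar 3.1, and so on.

Does anyone have experience with this kind of problem?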

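I think you can first create a Year column from the Date column of df1, then merge df1 with df2 on Year, and finally multiply the two Value columns with mul.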
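A minimal sketch of those three steps, assuming only the handful of sample rows shown in the question. The suffix `_annual` and the result column name `Multiple` are illustrative choices, not requirements:

```python
import pandas as pd

# Sample data copied from the question (only a few rows).
df1 = pd.DataFrame({
    "Date": pd.to_datetime(["2002-01-01", "2002-01-02", "2002-01-03",
                            "2010-01-01", "2010-01-02"]),
    "Value": [3.0, 3.5, 3.3, 4.96, 4.98],
})
df2 = pd.DataFrame({"Year": [2002, 2003, 2004, 2010],
                    "Value": [0.5, 3.1, 2.7, 0.7]})

# 1. Derive the calendar year of each daily row.
df1["Year"] = df1["Date"].dt.year

# 2. Attach the annual value; the suffix keeps the two Value columns apart.
merged = pd.merge(df1, df2, on="Year", suffixes=("", "_annual"))

# 3. Multiply daily by annual values, then drop the helper column.
merged["Multiple"] = merged["Value"].mul(merged["Value_annual"])
merged = merged.drop("Value_annual", axis=1)
```

Because `pd.merge` defaults to an inner join, daily rows whose year does not appear in df2 are dropped from `merged`; passing `how="left"` would keep them with a missing `Multiple`.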
Timings:
In [1386]: %timeit a(df1,df2)
100 loops, best of 3: 10.9 ms per loop

In [1387]: %timeit b(df3,df4)
1 loops, best of 3: 4.11 s per loop

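Another approach is to build a dictionary that maps each year to its coefficient and apply it row by row.
Using the DataFrames from the example: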
mapping = df2.set_index('Year').to_dict()['Value']
mapping
{2002: 0.5,
 2003: 3.1000000000000001,
 2004: 2.7000000000000002,
 2010: 0.69999999999999996}

df1['Year'] = df1['Date'].dt.year
df1['Adjusted Value'] = df1.apply(lambda x: x['Value']*mapping[x['Year']], axis=1)
df1

          Date  Value   Year    Adjusted Value
0   2002-01-01  3.00    2002    1.500
1   2002-01-02  3.50    2002    1.750
2   2002-01-03  3.30    2002    1.650
3   2010-01-01  4.96    2010    3.472
4   2010-01-02  4.98    2010    3.486
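The same lookup can also be done without a row-wise `apply`. The sketch below is a variant, not the code from the answer above: it uses `Series.map` with a year-indexed Series, and it assumes the sample rows from the question:

```python
import pandas as pd

# Sample data copied from the question (only a few rows).
df1 = pd.DataFrame({
    "Date": pd.to_datetime(["2002-01-01", "2002-01-02", "2002-01-03",
                            "2010-01-01", "2010-01-02"]),
    "Value": [3.0, 3.5, 3.3, 4.96, 4.98],
})
df2 = pd.DataFrame({"Year": [2002, 2003, 2004, 2010],
                    "Value": [0.5, 3.1, 2.7, 0.7]})

# Year -> coefficient lookup, held as a Series indexed by year.
factors = df2.set_index("Year")["Value"]

# Translate every row's year into its coefficient in one vectorised step,
# then multiply column by column.
df1["Year"] = df1["Date"].dt.year
df1["Adjusted Value"] = df1["Value"] * df1["Year"].map(factors)
```

With `map`, a year that is missing from df2 produces NaN in `Adjusted Value`, whereas the dictionary lookup inside `apply` would raise a `KeyError`.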

The code used for the timings follows. I assume Date is a column rather than the index, and likewise Year in df2.
#length(df1) = 50k
df1 = pd.concat([df1]*10000).reset_index(drop=True)

df3 = df1.copy()
df4 = df2.copy()

def a(df1,df2):
    df1['Year'] = df1.Date.dt.year
    df = pd.merge(df1,df2, on='Year',  suffixes=('', '_x') )
    #print df
    df['Multiple'] = df['Value'].mul(df['Value_x']) 
    return df.drop('Value_x', axis=1)

def b(df1,df2):
    mapping = df2.set_index('Year').to_dict()['Value']    
    df1['Year'] = df1['Date'].dt.year
    df1['Multiple'] = df1.apply(lambda x: x['Value']*mapping[x['Year']], axis=1)
    return df1

print(a(df1, df2))
print(b(df3, df4))
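The functions above assume Date is an ordinary column. If Date is the index instead, the year can be read straight from the `DatetimeIndex`. This is a rough sketch under that assumption; the names `daily`, `annual` and `factors` are hypothetical and the data is the sample from the question:

```python
import pandas as pd

# Hypothetical variant of df1 with Date as the index instead of a column.
daily = pd.DataFrame(
    {"Value": [3.0, 3.5, 3.3, 4.96, 4.98]},
    index=pd.to_datetime(["2002-01-01", "2002-01-02", "2002-01-03",
                          "2010-01-01", "2010-01-02"]),
)
daily.index.name = "Date"
annual = pd.DataFrame({"Year": [2002, 2003, 2004, 2010],
                       "Value": [0.5, 3.1, 2.7, 0.7]})

# Year -> coefficient lookup, held as a Series indexed by year.
factors = annual.set_index("Year")["Value"]

# A DatetimeIndex exposes .year directly, without the .dt accessor.
years = daily.index.year

# Pick the coefficient for each row's year and multiply positionally.
daily["Multiple"] = daily["Value"].to_numpy() * factors.reindex(years).to_numpy()
```

Multiplying the underlying arrays avoids index alignment between the date-indexed values and the year-indexed coefficients.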
