Python pandas: best way to iterate over a year-week-day index for fast performance

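I have a typical financial DataFrame with 'Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Mean' and 'Volume' columns at 1-minute frequency (1.2 million rows per DataFrame, 500+ DataFrames).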

I need to iterate over this DataFrame year by year, week by week within each year, and day by day within each week.

Some of the code I have today:

import os
import pandas as pd


for file in os.listdir(data_path):
    if file.endswith('.csv'):

        df = pd.read_csv(data_path + file, parse_dates=[['Date', 'Time']])
        df.columns = ['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']  # renamed the Date_Time column
        df['Timestamp'] = pd.to_datetime(df['Timestamp'])
        df['Mean'] = round(df[['Open', 'High', 'Low', 'Close']].mean(axis=1), 2)

        df['Year'] = [0] * len(df)
        df['Week'] = [0] * len(df)
        df['Day'] = [0] * len(df)
        for i in range(len(df)):
            df['Year'][i] = df['Timestamp'][i].isocalendar()[0]
            df['Week'][i] = df['Timestamp'][i].isocalendar()[1]
            df['Day'][i] = df['Timestamp'][i].isocalendar()[2]

        index = pd.MultiIndex.from_arrays([df['Year'], df['Week'], df['Day']], names=['Year', 'Week', 'Day'])

        # build a new df from df with index as MultiIndex and save it in .hdf format
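As an aside on the per-row loop above: a minimal sketch of a vectorized alternative, assuming pandas 1.1 or later, where `Series.dt.isocalendar()` is available. The small sample frame is hypothetical and only stands in for one of the 1-minute CSV files; it is not the data from the question.

```python
import pandas as pd

# Hypothetical sample standing in for one 1-minute file (assumption, not real data).
df = pd.DataFrame({
    "Timestamp": pd.date_range("2015-12-31 23:58", periods=4, freq="min"),
    "Open": [1.0, 2.0, 3.0, 4.0],
})

# A single vectorized call replaces the row-by-row isocalendar() loop.
iso = df["Timestamp"].dt.isocalendar()  # DataFrame with columns: year, week, day
df["Year"] = iso["year"].astype("int64")
df["Week"] = iso["week"].astype("int64")
df["Day"] = iso["day"].astype("int64")

# Same three-level index as before, built without element-wise assignment.
indexed = df.set_index(["Year", "Week", "Day"])
```

Besides being faster on 1.2 million rows, this avoids the chained assignment `df['Year'][i] = ...`, which pandas warns about and which may not write back to the original frame.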
Then I use this 3-level index to access the data with 3 for loops, as follows:

import numpy as np

# years cycle
years_array = asset.data.index.levels[0].values
for year in years_array:

    # weeks cycle
    # MultiIndex.labels was renamed to MultiIndex.codes in pandas 0.24
    weeks_array = np.array(np.unique(asset.data.loc[year].index.codes[0] + 1))
    for week in weeks_array:

        week0 = asset.data.loc[year, week].Open.values
        mean0 = np.mean(week0)

        if week != weeks_array[-1]:
            week1_year = year
            week1_week = week + 1
        elif week == weeks_array[-1] and year != years_array[-1]:
            week1_year = year + 1
            week1_week = 1
        elif week == weeks_array[-1] and year == years_array[-1]:
            break

        # minutes cycle
        week1 = asset.data.loc[week1_year, week1_week].Open.values
        for minute in range(len(week1)):
            # do the magic stuff...
            pass
Is there a smarter way? This is StackOverflow, so I'm pretty sure there is.

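What I really need is an easy way to get the previous week's data (week0 in the code) from my current position in the current week (week1 in the code).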
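To make that requirement concrete, here is a rough sketch of one possible approach, not a definitive answer: label every row with the Monday that starts its ISO week, group once, and walk over consecutive (previous week, current week) pairs. Because the week-start date is unique across years, no special year-rollover branch is needed. The sample data, the `WeekStart` name and the `results` list are hypothetical, and the comparison inside the loop is only a placeholder for the real per-minute logic.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one bar every 6 hours for three weeks, starting Monday 2015-12-21.
ts = pd.date_range("2015-12-21", periods=84, freq=pd.Timedelta(hours=6))
df = pd.DataFrame({"Timestamp": ts, "Open": np.arange(84, dtype=float)})

# Monday 00:00 of each row's week; stays unique across the year boundary.
week_start = (
    df["Timestamp"].dt.normalize()
    - pd.to_timedelta(df["Timestamp"].dt.dayofweek, unit="D")
).rename("WeekStart")

# Extract each week's Open values once; groupby sorts keys chronologically.
weeks = [(start, grp["Open"].to_numpy()) for start, grp in df.groupby(week_start)]

results = []
for (prev_start, week0), (cur_start, week1) in zip(weeks, weeks[1:]):
    # Skip pairs that are not adjacent calendar weeks (e.g. a missing week of data).
    if cur_start - prev_start != pd.Timedelta(weeks=1):
        continue
    mean0 = week0.mean()
    # Placeholder for the per-minute work: count bars above last week's mean.
    results.append((cur_start, mean0, int((week1 > mean0).sum())))
```

Each week is sliced out of the frame only once, so the inner work runs on plain NumPy arrays instead of repeated `.loc` lookups on the MultiIndex.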

Thanks for your help.