Python pandas - best way to iterate over a Year-Week-Day index for fast performance
I have a typical financial DataFrame with 'Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Mean' and 'Volume' columns at 1-minute frequency (1.2 million rows per DataFrame, 500+ DataFrames). I need to iterate over these DataFrames year by year, week by week within each year, and day by day within each week. Here is the code I have today:
import os
import pandas as pd

for file in os.listdir(data_path):
    if file.endswith('.csv'):
        df = pd.read_csv(os.path.join(data_path, file), parse_dates=[['Date', 'Time']])
        df.columns = ['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']  # renamed the Date_Time column
        df['Timestamp'] = pd.to_datetime(df['Timestamp'])
        df['Mean'] = round(df[['Open', 'High', 'Low', 'Close']].mean(axis=1), 2)
        df['Year'] = [0] * len(df)
        df['Week'] = [0] * len(df)
        df['Day'] = [0] * len(df)
        # row-by-row loop to fill in the ISO year / week / day columns
        for i in range(len(df)):
            df['Year'][i] = df['Timestamp'][i].isocalendar()[0]
            df['Week'][i] = df['Timestamp'][i].isocalendar()[1]
            df['Day'][i] = df['Timestamp'][i].isocalendar()[2]
        index = pd.MultiIndex.from_arrays([df['Year'], df['Week'], df['Day']],
                                          names=['Year', 'Week', 'Day'])
        # build a new df from df with index as MultiIndex and save it in .hdf format
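As an aside on the index-building step: the per-row `isocalendar()` loop above can be replaced by a single vectorized call, `Series.dt.isocalendar()` (available in pandas 1.1+), which returns the ISO year, week and day for every row at once. A minimal sketch on toy data (the timestamps and values here are made up for illustration):

```python
import pandas as pd

# Toy frame of daily timestamps crossing a year boundary;
# 2019-12-30 is a Monday and already belongs to ISO week 1 of 2020.
df = pd.DataFrame({
    "Timestamp": pd.date_range("2019-12-30", periods=5, freq="D"),
    "Open": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# One vectorized call returns a DataFrame with 'year', 'week', 'day'
# columns, replacing the row-by-row isocalendar() loop entirely.
iso = df["Timestamp"].dt.isocalendar()

df.index = pd.MultiIndex.from_arrays(
    [iso["year"].astype(int), iso["week"].astype(int), iso["day"].astype(int)],
    names=["Year", "Week", "Day"],
)

print(df.loc[(2020, 1)])  # all rows of ISO week 1 of 2020
```

On millions of rows this avoids both the Python-level loop and the chained-assignment pattern (`df['Year'][i] = ...`), which is slow and may not even write through in newer pandas versions.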
Then I use this 3-level index to access the data with three nested for loops, like this:
import numpy as np

# years cycle
years_array = asset.data.index.levels[0].values
for year in years_array:
    # weeks cycle (index.labels was renamed to index.codes in pandas 0.24+)
    weeks_array = np.unique(asset.data.loc[year].index.codes[0] + 1)
    for week in weeks_array:
        week0 = asset.data.loc[year, week].Open.values
        mean0 = np.mean(week0)
        if week != weeks_array[-1]:
            week1_year = year
            week1_week = week + 1
        elif (week == weeks_array[-1]) & (year != years_array[-1]):
            week1_year = year + 1
            week1_week = 1
        elif (week == weeks_array[-1]) & (year == years_array[-1]):
            break
        # minutes cycle
        week1 = asset.data.loc[week1_year, week1_week].Open.values
        for minute in range(len(week1)):
            pass  # do the magic stuff...
Is there a smarter way to do this? This is StackOverflow, so I'm pretty sure there is.
What I really need is an easy way to get the previous week's data (week 0 in the code) based on my current position in the current week (week 1 in the code).
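If the per-minute loop only needs an aggregate of the previous week (like `mean0` above), that value can also be precomputed and broadcast onto every minute row without any explicit loop: group by (Year, Week), `shift(1)` so each week carries its predecessor's value, then reindex back onto the minute rows. A sketch on invented toy data:

```python
import pandas as pd

# Toy minute-level frame indexed by (Year, Week, Day).
idx = pd.MultiIndex.from_tuples(
    [(2019, 52, 1), (2019, 52, 2), (2020, 1, 1), (2020, 1, 2)],
    names=["Year", "Week", "Day"],
)
df = pd.DataFrame({"Open": [10.0, 12.0, 20.0, 22.0]}, index=idx)

# Weekly mean of Open, shifted by one week so each week's row
# carries the PREVIOUS week's mean.
weekly_mean = df.groupby(level=["Year", "Week"])["Open"].mean()
prev_week_mean = weekly_mean.shift(1)

# Broadcast back onto the minute rows: every row of a given week
# now sees last week's mean, with no nested loops.
df["PrevWeekMean"] = prev_week_mean.reindex(
    df.index.droplevel("Day")
).to_numpy()
```

The first week has no predecessor, so its rows get NaN; every later row can read `PrevWeekMean` directly inside whatever per-minute computation follows.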
Thanks for your help.