Python: create sub-columns from existing DataFrames
I am trying to create a DataFrame with sub-columns, where each group of sub-columns is one of my existing DataFrames.
Data:
import pandas as pd
from numpy import nan
from pandas import Timestamp

df1 = {
    'Open': {Timestamp('2020-12-15 01:05:00'): 152.28, Timestamp('2020-12-15 01:10:00'): 151.59, Timestamp('2020-12-15 01:15:00'): 152.19},
    'High': {Timestamp('2020-12-15 01:05:00'): 152.28, Timestamp('2020-12-15 01:10:00'): 152.39, Timestamp('2020-12-15 01:15:00'): 152.38},
    'Low': {Timestamp('2020-12-15 01:05:00'): 150.0, Timestamp('2020-12-15 01:10:00'): 151.34, Timestamp('2020-12-15 01:15:00'): 150.67},
    'Close': {Timestamp('2020-12-15 01:05:00'): 151.58, Timestamp('2020-12-15 01:10:00'): 152.21, Timestamp('2020-12-15 01:15:00'): 151.12},
    'price': {Timestamp('2020-12-15 01:05:00'): 149.305, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
    'executedQty': {Timestamp('2020-12-15 01:05:00'): 6.991142857142856, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
    'side': {Timestamp('2020-12-15 01:05:00'): 1.0, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
}
df2 = {
    'Open': {Timestamp('2020-12-15 01:05:00'): 5.385, Timestamp('2020-12-15 01:10:00'): 5.403, Timestamp('2020-12-15 01:15:00'): 5.419},
    'High': {Timestamp('2020-12-15 01:05:00'): 5.418, Timestamp('2020-12-15 01:10:00'): 5.429, Timestamp('2020-12-15 01:15:00'): 5.42},
    'Low': {Timestamp('2020-12-15 01:05:00'): 5.384, Timestamp('2020-12-15 01:10:00'): 5.395, Timestamp('2020-12-15 01:15:00'): 5.351},
    'Close': {Timestamp('2020-12-15 01:05:00'): 5.406, Timestamp('2020-12-15 01:10:00'): 5.414, Timestamp('2020-12-15 01:15:00'): 5.37},
    'price': {Timestamp('2020-12-15 01:05:00'): nan, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
    'executedQty': {Timestamp('2020-12-15 01:05:00'): nan, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
    'side': {Timestamp('2020-12-15 01:05:00'): nan, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
}
df3 = {
    'Open': {Timestamp('2020-12-15 01:05:00'): 12.455, Timestamp('2020-12-15 01:10:00'): 12.429, Timestamp('2020-12-15 01:15:00'): 12.442},
    'High': {Timestamp('2020-12-15 01:05:00'): 12.458, Timestamp('2020-12-15 01:10:00'): 12.456, Timestamp('2020-12-15 01:15:00'): 12.443},
    'Low': {Timestamp('2020-12-15 01:05:00'): 12.426, Timestamp('2020-12-15 01:10:00'): 12.425, Timestamp('2020-12-15 01:15:00'): 12.383},
    'Close': {Timestamp('2020-12-15 01:05:00'): 12.435, Timestamp('2020-12-15 01:10:00'): 12.442, Timestamp('2020-12-15 01:15:00'): 12.401},
    'price': {Timestamp('2020-12-15 01:05:00'): nan, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
    'executedQty': {Timestamp('2020-12-15 01:05:00'): nan, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
    'side': {Timestamp('2020-12-15 01:05:00'): nan, Timestamp('2020-12-15 01:10:00'): nan, Timestamp('2020-12-15 01:15:00'): nan},
}

# Turn each nested dict into a DataFrame indexed by timestamp
df1 = pd.DataFrame(df1)
df2 = pd.DataFrame(df2)
df3 = pd.DataFrame(df3)
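Output along these lines is what I expect, but I want the DataFrames to share the timestamp index, so that a single row contains all the data from all DataFrames under one index.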
df3 = pd.DataFrame()
dfList = [df1, df2, df3]
for df in dfList:
    cols = pd.MultiIndex.from_frame(df, ['Open', 'High', 'Low', 'Close', 'price', 'executedQty', 'side'])
    df = pd.DataFrame(df, columns=cols)
    df3 = df3.join(df)
print(df3)
df1
                       Open    High     Low   Close    price  executedQty  \
2020-12-15 01:05:00  152.28  152.28  150.00  151.58  149.305     6.991143
2020-12-15 01:10:00  151.59  152.39  151.34  152.21      NaN          NaN
2020-12-15 01:15:00  152.19  152.38  150.67  151.12      NaN          NaN

                     side
2020-12-15 01:05:00   1.0
2020-12-15 01:10:00   NaN
2020-12-15 01:15:00   NaN

df2
                      Open   High    Low  Close  price  executedQty  side
2020-12-15 01:05:00  5.385  5.418  5.384  5.406    NaN          NaN   NaN
2020-12-15 01:10:00  5.403  5.429  5.395  5.414    NaN          NaN   NaN
2020-12-15 01:15:00  5.419  5.420  5.351  5.370    NaN          NaN   NaN

df3
                       Open    High     Low   Close  price  executedQty  side
2020-12-15 01:05:00  12.455  12.458  12.426  12.435    NaN          NaN   NaN
2020-12-15 01:10:00  12.429  12.456  12.425  12.442    NaN          NaN   NaN
2020-12-15 01:15:00  12.442  12.443  12.383  12.401    NaN          NaN   NaN
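I would also like to do this inside a loop in which the other DataFrames are created. There are many more than 3 DataFrames, and they are built from data returned by requests; otherwise I would have to name each DataFrame before the concat, which is too much for my use case.
Something like this: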
dfList = [df1, df2, df3]
dataFrame = pd.DataFrame
for d in dfList:
    df = requestFuncThatCreatesDf(d)
    dataFrame = dataFrame.concat([df], key=(d))
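You can use pd.concat with the optional keys parameter to concatenate the DataFrames along axis=1, so that the resulting frame shares the same timestamp index and has MultiIndex columns: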
pd.concat([df1, df2, df3], axis=1, keys=('df1', 'df2', 'df3'))
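To see what keys does to the columns, here is a minimal self-contained sketch. The frames a and b and their values are made up for illustration and are not the question's data:

```python
import pandas as pd

# Two small frames sharing the same timestamp index (illustrative values only).
idx = pd.to_datetime(["2020-12-15 01:05:00", "2020-12-15 01:10:00"])
a = pd.DataFrame({"Open": [1.0, 2.0], "Close": [1.5, 2.5]}, index=idx)
b = pd.DataFrame({"Open": [10.0, 20.0], "Close": [15.0, 25.0]}, index=idx)

# Each key becomes the outer column level; the original column names
# become the inner level. Rows are aligned on the shared index.
combined = pd.concat([a, b], axis=1, keys=("a", "b"))

print(combined.columns.tolist())
print(combined["b"])                         # sub-frame stored under key 'b'
print(combined.loc[idx[0], ("a", "Close")])  # a single cell via a (key, column) tuple
```

Selecting with an outer key such as combined["b"] returns that frame's columns again, which is usually what "sub-columns" means in practice.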
EDIT (if you want to dynamically generate keys corresponding to the order of the DataFrames in dfList):
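One way to build such keys is from each frame's position in the list. This is a sketch under assumptions: the small frames below are stand-ins for the question's df1, df2, df3, and the 'df1', 'df2', ... naming scheme is only one possible choice:

```python
import pandas as pd

# Hypothetical stand-ins for the question's df1, df2, df3.
idx = pd.to_datetime(["2020-12-15 01:05:00", "2020-12-15 01:10:00"])
dfList = [pd.DataFrame({"Open": [float(i), float(i) + 1]}, index=idx) for i in range(3)]

# Generate 'df1', 'df2', ... in the same order as the frames in dfList.
keys = [f"df{i}" for i in range(1, len(dfList) + 1)]
result = pd.concat(dfList, axis=1, keys=keys)

print(result)
```

With the question's own dfList, the same two lines that build keys and call pd.concat should apply unchanged.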
Result:
                        df1                                                    \
                       Open    High     Low   Close    price executedQty side
2020-12-15 01:05:00  152.28  152.28  150.00  151.58  149.305    6.991143  1.0
2020-12-15 01:10:00  151.59  152.39  151.34  152.21      NaN         NaN  NaN
2020-12-15 01:15:00  152.19  152.38  150.67  151.12      NaN         NaN  NaN

                       df2                                              \
                      Open   High    Low  Close price executedQty side
2020-12-15 01:05:00  5.385  5.418  5.384  5.406   NaN         NaN  NaN
2020-12-15 01:10:00  5.403  5.429  5.395  5.414   NaN         NaN  NaN
2020-12-15 01:15:00  5.419  5.420  5.351  5.370   NaN         NaN  NaN

                        df3
                       Open    High     Low   Close price executedQty side
2020-12-15 01:05:00  12.455  12.458  12.426  12.435   NaN         NaN  NaN
2020-12-15 01:10:00  12.429  12.456  12.425  12.442   NaN         NaN  NaN
2020-12-15 01:15:00  12.442  12.443  12.383  12.401   NaN         NaN  NaN
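Comments:
Great! How would you do this in a loop? I updated my question with some more information about this.
@normal_human Check the edit.
Yes, they are string names, the same names that are passed to requestFuncThatCreatesDf().
Thanks for the help, it seems to work fine!
@normal_human Happy coding!
@normal_human Check: for c in df.columns.levels[0]: print(df[c])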
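Putting the comments together, the loop case might look like the sketch below. It rests on assumptions: request_func_that_creates_df is a hypothetical stand-in for the asker's requestFuncThatCreatesDf(), and the string names and values are placeholders:

```python
import pandas as pd

def request_func_that_creates_df(name):
    """Hypothetical stand-in for requestFuncThatCreatesDf(); returns dummy data."""
    idx = pd.to_datetime(["2020-12-15 01:05:00", "2020-12-15 01:10:00"])
    base = float(len(name))
    return pd.DataFrame({"Open": [base, base + 1], "Close": [base + 2, base + 3]}, index=idx)

names = ["AAA", "BB", "C"]  # placeholder string names

# Collect the frames first, then concatenate once after the loop.
frames = []
for name in names:
    frames.append(request_func_that_creates_df(name))
combined = pd.concat(frames, axis=1, keys=names)

# Walk the outer column level to get each sub-frame back.
for c in combined.columns.levels[0]:
    print(c)
    print(combined[c])
```

Calling pd.concat once on the collected list is generally preferable to concatenating inside the loop, since each concat call copies the data accumulated so far.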