Python Pandas: calculating a weighted average price with groupby/lambda or a function?
I have a dataframe in which 4 unique orders are split across rows 3-12. As you can see in steps 1, 2 and 3 below, I use groupby so that 1 order = 1 row. However, I am missing a key step: calculating the weighted average price of each order. At the moment, step 2 computes the plain mean price.

What I want to do:
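Create a function/lambda that calculates the weighted average price of each order (probably by grouping on the 'Time' column):
- Order 1 = rows 3, 4
- Order 2 = rows 5, 6, 7
- Order 3 = rows 8, 9
- Order 4 = rows 10, 11, 12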
Step 1: The original dataframe:
| 1| Time | Market | Type | Price | Amount | Total | Fee | Acc |
| 2|-----------|-----------|-------|----------|---------|----------|----------|---------|
| 3| 22:12:15 | Market 1 | Buy | 660.33 | 0.0130 | 8.58429 | 0.00085 | MXG_33 |
| 4| 22:12:15 | Market 1 | Buy | 659.58 | 0.0070 | 4.61706 | 0.00055 | MXG_33 |
| 5| 19:36:08 | Market 1 | Sell | 670.00 | 0.0082 | 5.49400 | 0.00070 | MXG_33 |
| 6| 19:36:08 | Market 1 | Sell | 670.33 | 0.0058 | 3.88791 | 0.00048 | MXG_33 |
| 7| 19:36:08 | Market 1 | Sell | 671.23 | 0.0060 | 4.02738 | 0.00054 | MXG_33 |
| 8| 13:01:41 | Market 1 | Buy | 667.15 | 0.0015 | 1.00073 | 0.00011 | MXG_33 |
| 9| 13:01:41 | Market 1 | Buy | 667.10 | 0.0185 | 12.3414 | 0.00132 | MXG_33 |
|10| 07:14:36 | Market 1 | Sell | 657.55 | 0.0107 | 7.03579 | 0.00079 | MXG_33 |
|11| 07:14:36 | Market 1 | Sell | 657.08 | 0.0005 | 0.32854 | 0.00004 | MXG_33 |
|12| 07:14:36 | Market 1 | Sell | 656.59 | 0.0088 | 5.77799 | 0.00071 | MXG_33 |
Step 2: Merging the orders back into one row each:
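d_agg = {'Market': 'first',
         'Type': 'first',
         'Price': 'mean',
         'Amount': 'sum',
         'Total': 'sum',
         'Fee': 'sum',
         'Acc': 'first'}
# Select the columns with a list (double brackets); indexing a groupby with a bare tuple of names is not accepted by current pandas.
df.groupby('Time', sort=False)[['Market', 'Type', 'Price', 'Amount', 'Total', 'Fee', 'Acc']].agg(d_agg).reset_index()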
Step 3 - End result (but the 'Price' column shows the mean price, not the weighted average price)
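To make the gap between the two numbers concrete, here is a minimal sketch for the first order only (the two fills at 22:12:15, copied from the table). The variable names are illustrative and not part of the original post.

```python
# Fills of the first order (Time == 22:12:15), copied from the table.
prices = [660.33, 659.58]
amounts = [0.0130, 0.0070]

# Plain mean: every fill counts equally (this is what 'Price': 'mean' computes).
plain_mean = sum(prices) / len(prices)

# Weighted mean: each price counts in proportion to the amount filled at it.
weighted_mean = sum(p * a for p, a in zip(prices, amounts)) / sum(amounts)

print(f"mean={plain_mean:.4f}  weighted={weighted_mean:.4f}")
```

For this order the plain mean is about 659.955, while the amount-weighted price is about 660.0675, because the larger fill was at the higher price.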
The .apply method of a groupby object lets you process the data at group level and return a DataFrame:
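def fn(group):
    # Each row's contribution to the order's weighted average price:
    # price * amount / total amount of the order.
    weights = group['Amount'] / group['Amount'].sum()
    # assign() returns a new frame instead of modifying the group in place.
    return group.assign(weighted_avg=group['Price'] * weights)

d_agg = {'Market': 'first',
         'Type': 'first',
         'weighted_avg': 'sum',  # summing the contributions gives the weighted average
         'Amount': 'sum',
         'Total': 'sum',
         'Fee': 'sum',
         'Acc': 'first'}

# group_keys=False keeps the original row index, so the second groupby can group
# on df['Time'] directly (recent pandas versions otherwise add 'Time' to the index).
with_weights = df.groupby('Time', sort=False, group_keys=False).apply(fn)
with_weights.groupby(df['Time'], sort=False).agg(d_agg)

# if you don't understand what the code is doing, try:
print(with_weights)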
Perfect, thank you very much! The last line helped me understand how you did it, and how to think about solving problems like this.
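As a possible alternative to the apply-based answer, the same result can be sketched without a per-group function: multiply price by amount once, sum both per order, then divide. This is only a sketch under stated assumptions: the dataframe is retyped by hand from the table in the question, and the helper column name `PxA` is hypothetical.

```python
import pandas as pd

# The ten fills from the question's table (rows 3-12), typed in by hand.
df = pd.DataFrame({
    "Time": ["22:12:15"] * 2 + ["19:36:08"] * 3 + ["13:01:41"] * 2 + ["07:14:36"] * 3,
    "Market": ["Market 1"] * 10,
    "Type": ["Buy"] * 2 + ["Sell"] * 3 + ["Buy"] * 2 + ["Sell"] * 3,
    "Price": [660.33, 659.58, 670.00, 670.33, 671.23,
              667.15, 667.10, 657.55, 657.08, 656.59],
    "Amount": [0.0130, 0.0070, 0.0082, 0.0058, 0.0060,
               0.0015, 0.0185, 0.0107, 0.0005, 0.0088],
    "Total": [8.58429, 4.61706, 5.49400, 3.88791, 4.02738,
              1.00073, 12.3414, 7.03579, 0.32854, 5.77799],
    "Fee": [0.00085, 0.00055, 0.00070, 0.00048, 0.00054,
            0.00011, 0.00132, 0.00079, 0.00004, 0.00071],
    "Acc": ["MXG_33"] * 10,
})

# Hypothetical helper column: price times amount for every fill.
df["PxA"] = df["Price"] * df["Amount"]

d_agg = {"Market": "first", "Type": "first", "Amount": "sum",
         "Total": "sum", "Fee": "sum", "Acc": "first", "PxA": "sum"}

# One row per order, keeping the orders in their original sequence.
out = df.groupby("Time", sort=False).agg(d_agg).reset_index()

# Weighted average price = sum(price * amount) / sum(amount) per order.
out["Price"] = out["PxA"] / out["Amount"]
out = out.drop(columns="PxA")

print(out)
```

In this sample the `Total` column appears to be price times amount already (rounded), so dividing the summed `Total` by the summed `Amount` should give nearly the same price. That relationship is an observation about these ten rows, not something the question states.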