Pandas: DataFrame column that depends on another column


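1) Creating the new column Class raises an error: TypeError: invalid type comparison. How can I set this new column based on several hours? It works when I pass only a single hour (1, 2 or 3...).

2) TypeError: groupby() got multiple values for argument 'axis'. I want to GROUPBY two columns. It only works with one column.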

My code:

import pandas as pd
import urllib.request
import numpy as np
url="https://www.misoenergy.org/Library/Repository/Market%20Reports/20170811_da_bc.xls"

cnstxls = urllib.request.urlopen(url)
xl = pd.ExcelFile(cnstxls)
df = xl.parse("Sheet1",skiprows=3)
constr = df.iloc[:,1:7]
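# Raises TypeError: invalid type comparison (the column is compared with a tuple)
constr['Class'] = np.where(constr['Hour of Occurrence'] == (1,2,3,4,5,6), 'Offpeak', 'Onpeak')
# Raises TypeError: groupby() got multiple values for argument 'axis'
sumsp=constr.groupby('Constraint_ID','Class',axis=0)['Shadow Price'].sum().sort_values(ascending=True)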

Let's try:

constr['Class'] = np.where(constr['Hour of Occurrence'].isin([1,2,3,4,5,6]),'Offpeak','Onpeak')

sumsp = constr.groupby(['Constraint_ID','Class'],axis=0)['Shadow Price'].sum().sort_values(ascending=True)

print(sumsp)

Edit: use unstack to pivot Class into columns:

sumsp.unstack('Class')
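
For readers who cannot download the report, here is a minimal, self-contained sketch of the same three steps (isin, groupby with a list of keys, unstack). The small DataFrame is a made-up stand-in for the MISO sheet, and the variable names offpeak_hours and wide are introduced only for illustration. The axis=0 argument is left out here because it is the default; in the pandas version used in the question, the original call failed because the second positional argument collided with axis=0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the MISO report; values are made up for illustration.
constr = pd.DataFrame({
    "Constraint_ID": [101, 101, 101, 202, 202],
    "Hour of Occurrence": [2, 5, 14, 6, 20],
    "Shadow Price": [-10.0, -5.0, -3.0, -7.0, -1.0],
})

# isin tests each hour for membership in the list and returns a boolean Series,
# which np.where then maps to the two labels.
offpeak_hours = [1, 2, 3, 4, 5, 6]
constr["Class"] = np.where(
    constr["Hour of Occurrence"].isin(offpeak_hours), "Offpeak", "Onpeak"
)

# Several group keys are passed together as one list.
sumsp = (
    constr.groupby(["Constraint_ID", "Class"])["Shadow Price"]
    .sum()
    .sort_values(ascending=True)
)

# unstack moves the 'Class' index level into columns.
wide = sumsp.unstack("Class")
print(wide)
```

Comparing the column with == against a tuple does not test membership, which is why the single-hour comparison worked but the six-hour tuple did not.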

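Thanks, Scott. One follow-up: how would I pivot this data by Class, i.e. show Onpeak and Offpeak as two separate columns?

@ShyamaSonti Added an edit. You need unstack('Class').

Scott, in the unstack above I would like to get the 'Constraint Name' for each Constraint_ID. The first constraint name is enough. You can see Constraint Name in the constr df. Any help is appreciated.

You just need to replace 'Constraint_ID' with 'Constraint Name' in the first groupby, where sumsp =
sumsp=constr.groupby(['Constraint Name','Class'],axis=0)['Shadow Price'].sum().sort_values(ascending=True)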
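
If the result should stay keyed by Constraint_ID and only carry the first Constraint Name along, one possible alternative to the suggestion above is to compute the first name per ID separately and join it onto the unstacked sums. This is only a sketch on made-up data; first_name and result are hypothetical names, and it assumes a Class column already exists.

```python
import pandas as pd

# Hypothetical data; names and values are made up for illustration.
constr = pd.DataFrame({
    "Constraint_ID": [101, 101, 202, 202],
    "Constraint Name": ["LINE A", "LINE A ALT", "LINE B", "LINE B"],
    "Class": ["Offpeak", "Onpeak", "Offpeak", "Onpeak"],
    "Shadow Price": [-15.0, -3.0, -7.0, -1.0],
})

# Sum per (Constraint_ID, Class), then pivot Class into columns.
wide = (
    constr.groupby(["Constraint_ID", "Class"])["Shadow Price"]
    .sum()
    .unstack("Class")
)

# First Constraint Name seen for each Constraint_ID.
first_name = constr.groupby("Constraint_ID")["Constraint Name"].first()

# Both objects are indexed by Constraint_ID, so join aligns them row by row.
result = wide.join(first_name)
print(result)
```

Grouping directly on 'Constraint Name', as suggested in the comment, is shorter but produces one row per distinct name rather than one row per ID.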