Pandas: DataFrame column that depends on another column
1) Creating the new column Class raises an error: TypeError: invalid type comparison. How can I set this new column based on multiple hours? It works when I compare against a single hour (1, 2, or 3...).
2) TypeError: groupby() got multiple values for argument 'axis'. I want to GROUPBY using two columns. It only works with one column.
Here is my code:
import pandas as pd
import urllib.request
import numpy as np
url="https://www.misoenergy.org/Library/Repository/Market%20Reports/20170811_da_bc.xls"
cnstxls = urllib.request.urlopen(url)
xl = pd.ExcelFile(cnstxls)
df = xl.parse("Sheet1",skiprows=3)
constr = df.iloc[:,1:7]
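# Problem 1: comparing the column to a tuple raises TypeError: invalid type comparison
constr['Class'] = np.where(constr['Hour of Occurrence'] == (1,2,3,4,5,6), 'Offpeak', 'Onpeak')
# Problem 2: 'Class' is taken as the positional axis argument, so axis=0 raises
# TypeError: groupby() got multiple values for argument 'axis'
sumsp=constr.groupby('Constraint_ID','Class',axis=0)['Shadow Price'].sum().sort_values(ascending=True)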
Let's try isin for the hour check, and pass both grouping columns as a list:
constr['Class'] = np.where(constr['Hour of Occurrence'].isin([1,2,3,4,5,6]),'Offpeak','Onpeak')
sumsp = constr.groupby(['Constraint_ID','Class'],axis=0)['Shadow Price'].sum().sort_values(ascending=True)
print(sumsp)
sumsp.unstack('Class')
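To see why these two changes help, here is a minimal sketch on a small made-up frame. Only the column names come from the question; the values are hypothetical, and axis=0 is left out because it is the default.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the MISO sheet; only the column names come from the question.
constr = pd.DataFrame({
    "Constraint_ID": [101, 101, 101, 202, 202],
    "Hour of Occurrence": [2, 5, 14, 3, 20],
    "Shadow Price": [-10.0, -5.0, -20.0, -1.0, -4.0],
})

# isin tests each hour for membership in the list, giving one boolean per row.
offpeak = constr["Hour of Occurrence"].isin([1, 2, 3, 4, 5, 6])
constr["Class"] = np.where(offpeak, "Offpeak", "Onpeak")

# Several grouping keys go in ONE list; a second positional argument
# would be interpreted as axis, which is what caused the second error.
sumsp = (
    constr.groupby(["Constraint_ID", "Class"])["Shadow Price"]
    .sum()
    .sort_values(ascending=True)
)
print(sumsp)
```

The result is a Series with a two-level index (Constraint_ID, Class), sorted from the most negative summed shadow price upward.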
Edit: sumsp.unstack('Class') pivots Class, so Offpeak and Onpeak become separate columns.
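As a rough illustration of what unstack('Class') does to the grouped result, the sketch below builds a small Series by hand. The numbers are hypothetical; the index level names match the answer. The fill_value variant is an optional extra, not part of the original answer.

```python
import pandas as pd

# Hypothetical grouped sums, indexed by (Constraint_ID, Class) as in the answer.
index = pd.MultiIndex.from_tuples(
    [(101, "Offpeak"), (101, "Onpeak"), (202, "Onpeak")],
    names=["Constraint_ID", "Class"],
)
sumsp = pd.Series([-15.0, -20.0, -4.0], index=index, name="Shadow Price")

# Move the Class level from the row index into the columns.
wide = sumsp.unstack("Class")

# Constraint 202 has no Offpeak rows, so that cell is NaN in `wide`;
# fill_value=0 replaces such missing combinations with 0.
wide_filled = sumsp.unstack("Class", fill_value=0)
print(wide)
```

Each Constraint_ID now occupies one row, with one column per Class value.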
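Thank you, Scott. One follow-up: how would I pivot this data by Class, so that Onpeak and Offpeak show up as two separate columns?
@ShyamaSonti Added an edit. You need unstack('Class').
Scott, in the unstack above I would like to get the "Constraint Name" for each Constraint_ID. The first Constraint Name is enough. You can see Constraint Name in the constr df. Any help is appreciated.
You can simply replace 'Constraint_ID' with 'Constraint Name' in the first groupby: sumsp = constr.groupby(['Constraint Name','Class'],axis=0)['Shadow Price'].sum().sort_values(ascending=True)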
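If one Constraint_ID can carry several names, grouping by 'Constraint Name' would split it into several rows. An alternative sketch, not from the thread, keeps the grouping on Constraint_ID and joins the first name seen for each ID afterwards. The data below is hypothetical and assumes the Class column has already been created.

```python
import pandas as pd

# Hypothetical rows; constraint 101 appears under two different names.
constr = pd.DataFrame({
    "Constraint_ID": [101, 101, 202],
    "Constraint Name": ["LINE A (primary)", "LINE A (alt)", "LINE B"],
    "Class": ["Offpeak", "Onpeak", "Onpeak"],
    "Shadow Price": [-15.0, -20.0, -4.0],
})

# Sum by ID and Class, then pivot Class into columns.
wide = (
    constr.groupby(["Constraint_ID", "Class"])["Shadow Price"]
    .sum()
    .unstack("Class")
)

# First name encountered for each Constraint_ID, indexed by Constraint_ID.
first_name = constr.groupby("Constraint_ID")["Constraint Name"].first()

# Both objects share the Constraint_ID index, so join aligns them row by row.
result = wide.join(first_name)
print(result)
```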