Error when plotting values of an aggregated column from a Python groupby
Whenever I plot a DataFrame created from an aggregation, I get an error.
import pandas as pd

datelisting = {
'FirstClusterCommittedDockDate_grouper':['2019-11','2021-01','2021-04','2021-01','2020-12','2021-02','2020-12','2020-12','2021-03','2020-12','2021-09','2021-09','2020-11','2021-09','2021-11','2021-08'],
'FirstClusterCommittedHandoffDate_grouper':['2020-03','2021-01','2021-06','2021-03','2021-02','2021-04','2021-02','2021-02','2021-10','2021-02','2021-10','2021-11','2020-12','2021-11','2022-01','2022-01'],
'FirstClusterCommittedLiveDate_grouper':['2020-03','2021-03','2021-06','2021-03','2021-03','2021-07','2021-03','2021-03','2021-08','2021-05','2021-12','2021-11','2020-12','2022-05','2022-01','2022-01'],
'TargetPPAPreparationStartDate_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01'],
'ProjectedDateLive_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01']
}
datesDf = pd.DataFrame(datelisting).melt().dropna().rename(columns={'variable':'DateFields','value':'DateValue'}).reset_index().drop('index',axis=1)
dfChart = datesDf.groupby(['DateFields', 'DateValue']).agg({'DateValue': ['count']}).reset_index().dropna().rename(columns = { 'count':'ItemCnt'})
dfChart.columns = ["_".join(x) for x in dfChart.columns.ravel()]
print(dfChart )
It produces this DataFrame:
DateFields_ |DateValue_ |DateValue_ItemCnt
--------------------------------------------- | ----------------------------- | ----------------
FirstClusterCommittedDockDate_grouper |2019-11 |1
FirstClusterCommittedDockDate_grouper |2020-11 |1
FirstClusterCommittedDockDate_grouper |2020-12 |4
FirstClusterCommittedDockDate_grouper |2021-01 |2
FirstClusterCommittedDockDate_grouper |2021-02 |1
FirstClusterCommittedDockDate_grouper |2021-03 |1
FirstClusterCommittedDockDate_grouper |2021-04 |1
FirstClusterCommittedDockDate_grouper |2021-08 |1
FirstClusterCommittedDockDate_grouper |2021-09 |3
FirstClusterCommittedDockDate_grouper |2021-11 |1
FirstClusterCommittedHandoffDate_grouper |2020-03 |1
FirstClusterCommittedHandoffDate_grouper |2020-12 |1
FirstClusterCommittedHandoffDate_grouper |2021-01 |1
FirstClusterCommittedHandoffDate_grouper |2021-02 |4
FirstClusterCommittedHandoffDate_grouper |2021-03 |1
FirstClusterCommittedHandoffDate_grouper |2021-04 |1
FirstClusterCommittedHandoffDate_grouper |2021-06 |1
FirstClusterCommittedHandoffDate_grouper |2021-10 |2
FirstClusterCommittedHandoffDate_grouper |2021-11 |2
FirstClusterCommittedHandoffDate_grouper |2022-01 |2
FirstClusterCommittedLiveDate_grouper |2020-03 |1
FirstClusterCommittedLiveDate_grouper |2020-12 |1
FirstClusterCommittedLiveDate_grouper |2021-03 |5
FirstClusterCommittedLiveDate_grouper |2021-05 |1
FirstClusterCommittedLiveDate_grouper |2021-06 |1
FirstClusterCommittedLiveDate_grouper |2021-07 |1
FirstClusterCommittedLiveDate_grouper |2021-08 |1
FirstClusterCommittedLiveDate_grouper |2021-11 |1
FirstClusterCommittedLiveDate_grouper |2021-12 |1
FirstClusterCommittedLiveDate_grouper |2022-01 |2
FirstClusterCommittedLiveDate_grouper |2022-05 |1
ProjectedDateLive_grouper |2019-09 |1
ProjectedDateLive_grouper |2020-03 |1
ProjectedDateLive_grouper |2020-06 |4
ProjectedDateLive_grouper |2020-07 |1
ProjectedDateLive_grouper |2020-08 |3
ProjectedDateLive_grouper |2020-10 |2
ProjectedDateLive_grouper |2021-01 |1
ProjectedDateLive_grouper |2021-02 |3
TargetPPAPreparationStartDate_grouper |2019-09 |1
TargetPPAPreparationStartDate_grouper |2020-03 |1
TargetPPAPreparationStartDate_grouper |2020-06 |4
TargetPPAPreparationStartDate_grouper |2020-07 |1
TargetPPAPreparationStartDate_grouper |2020-08 |3
TargetPPAPreparationStartDate_grouper |2020-10 |2
TargetPPAPreparationStartDate_grouper |2021-01 |1
TargetPPAPreparationStartDate_grouper |2021-02 |3
When I try to plot it, I get a strange error:
import altair as alt

base = alt.Chart(dfChart).properties(width=600)
line = base.mark_line().encode(
x='DateValue_',
y=' DateValue_ItemCnt',
color='DateFields_'
)
rule = base.mark_rule().encode(
y='average(DateValue_ItemCnt)',
color='DateFields_',
size=alt.value(2)
)
line + rule
I get this error:
ValueError: DateValue_ItemCnt encoding field is specified without a type; the type cannot be inferred because it does not match any column in the data.
alt.LayerChart(...)
ValueError: Dataframe contains invalid column name: ('DateFields', ''). Column names must be strings
If I remove this line of code:
dfChart.columns = ["_".join(x) for x in dfChart.columns.ravel()]
then I get this error:
ValueError: DateValue_ItemCnt encoding field is specified without a type; the type cannot be inferred because it does not match any column in the data.
alt.LayerChart(...)
ValueError: Dataframe contains invalid column name: ('DateFields', ''). Column names must be strings
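Does the aggregation make the ItemCnt column unavailable for plotting? Is there any way to fix this?

There is an extra space in y=' DateValue_ItemCnt'. Remove it, and the code looks like this: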
base = alt.Chart(dfChart).properties(width=600)
line = base.mark_line().encode(
x='DateValue_',
y='DateValue_ItemCnt', # the original had a leading space here: ' DateValue_ItemCnt'
color='DateFields_'
)
rule = base.mark_rule().encode(
y='average(DateValue_ItemCnt)',
color='DateFields_',
size=alt.value(2)
)
line + rule
The chart then renders as expected.
Note: the rest of the code is added below for completeness.
import pandas as pd
datelisting = {
'FirstClusterCommittedDockDate_grouper':['2019-11','2021-01','2021-04','2021-01','2020-12','2021-02','2020-12','2020-12','2021-03','2020-12','2021-09','2021-09','2020-11','2021-09','2021-11','2021-08'],
'FirstClusterCommittedHandoffDate_grouper':['2020-03','2021-01','2021-06','2021-03','2021-02','2021-04','2021-02','2021-02','2021-10','2021-02','2021-10','2021-11','2020-12','2021-11','2022-01','2022-01'],
'FirstClusterCommittedLiveDate_grouper':['2020-03','2021-03','2021-06','2021-03','2021-03','2021-07','2021-03','2021-03','2021-08','2021-05','2021-12','2021-11','2020-12','2022-05','2022-01','2022-01'],
'TargetPPAPreparationStartDate_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01'],
'ProjectedDateLive_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01']
}
datesDf = pd.DataFrame(datelisting).melt().dropna().rename(columns={'variable':'DateFields','value':'DateValue'}).reset_index().drop('index',axis=1)
dfChart = datesDf.groupby(['DateFields', 'DateValue']).agg({'DateValue': ['count']}).reset_index().dropna().rename(columns = { 'count':'ItemCnt'})
dfChart.columns = ["_".join(x) for x in dfChart.columns.ravel()]
I ran this on Altair version 4.1.0. You can check your version by calling:
import altair as alt
alt.__version__
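As a side note, the "Column names must be strings" error appears when the aggregated frame still has tuple (MultiIndex) column names such as ('DateFields', ''). One way to sidestep the flattening step entirely is to count rows with size() so the columns are plain strings from the start. This is only a sketch under assumptions: the small datesDf below is an illustrative stand-in for the melted frame in the question, and its values are made up.

```python
import pandas as pd

# Illustrative stand-in for the melted frame from the question (values are made up).
datesDf = pd.DataFrame({
    'DateFields': ['Dock', 'Dock', 'Dock', 'Live', 'Live'],
    'DateValue': ['2020-12', '2020-12', '2021-01', '2021-03', '2021-03'],
})

# size() counts rows per group; reset_index(name=...) names the count column,
# so the result has plain string column names instead of MultiIndex tuples.
dfChart = (
    datesDf.groupby(['DateFields', 'DateValue'])
    .size()
    .reset_index(name='ItemCnt')
)

print(dfChart.columns.tolist())
```

With a frame shaped like this, the Altair encodings would refer to 'DateValue', 'ItemCnt' and 'DateFields' directly, without the trailing underscores produced by the join step.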
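Thanks for the suggestion. When I run the code with the extra space removed, I get the following error: ValueError: Dataframe contains invalid column name: ('DateFields', ''). Column names must be strings alt.LayerChart(...)
I added the full code and noted that it was run on Altair version 4.1.0.
Works beautifully. Thanks for your help!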