Error plotting values of an aggregated column from a Python groupby
Tags: python, pandas, dataframe, date

Whenever I plot a DataFrame created from an aggregation, I get an error.

datelisting = {
'FirstClusterCommittedDockDate_grouper':['2019-11','2021-01','2021-04','2021-01','2020-12','2021-02','2020-12','2020-12','2021-03','2020-12','2021-09','2021-09','2020-11','2021-09','2021-11','2021-08'],
'FirstClusterCommittedHandoffDate_grouper':['2020-03','2021-01','2021-06','2021-03','2021-02','2021-04','2021-02','2021-02','2021-10','2021-02','2021-10','2021-11','2020-12','2021-11','2022-01','2022-01'],
'FirstClusterCommittedLiveDate_grouper':['2020-03','2021-03','2021-06','2021-03','2021-03','2021-07','2021-03','2021-03','2021-08','2021-05','2021-12','2021-11','2020-12','2022-05','2022-01','2022-01'],
'TargetPPAPreparationStartDate_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01'],
'ProjectedDateLive_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01']
}


datesDf = pd.DataFrame(datelisting).melt().dropna().rename(columns={'variable':'DateFields','value':'DateValue'}).reset_index().drop('index',axis=1)

dfChart = datesDf.groupby(['DateFields',  'DateValue']).agg({'DateValue': ['count']}).reset_index().dropna().rename(columns = { 'count':'ItemCnt'})


dfChart.columns = ["_".join(x) for x in dfChart.columns.ravel()]

print(dfChart )
It produces this DataFrame:

DateFields_                     |DateValue_             |DateValue_ItemCnt
 ---------------------------------------------  | ----------------------------- | ----------------
FirstClusterCommittedDockDate_grouper       |2019-11                    |1
FirstClusterCommittedDockDate_grouper       |2020-11                    |1
FirstClusterCommittedDockDate_grouper       |2020-12                    |4
FirstClusterCommittedDockDate_grouper       |2021-01                    |2
FirstClusterCommittedDockDate_grouper       |2021-02                    |1
FirstClusterCommittedDockDate_grouper       |2021-03                    |1
FirstClusterCommittedDockDate_grouper       |2021-04                    |1
FirstClusterCommittedDockDate_grouper       |2021-08                    |1
FirstClusterCommittedDockDate_grouper       |2021-09                    |3
FirstClusterCommittedDockDate_grouper       |2021-11                    |1
FirstClusterCommittedHandoffDate_grouper        |2020-03                    |1
FirstClusterCommittedHandoffDate_grouper        |2020-12                    |1
FirstClusterCommittedHandoffDate_grouper        |2021-01                    |1
FirstClusterCommittedHandoffDate_grouper        |2021-02                    |4
FirstClusterCommittedHandoffDate_grouper        |2021-03                    |1
FirstClusterCommittedHandoffDate_grouper        |2021-04                    |1
FirstClusterCommittedHandoffDate_grouper        |2021-06                    |1
FirstClusterCommittedHandoffDate_grouper        |2021-10                    |2
FirstClusterCommittedHandoffDate_grouper        |2021-11                    |2
FirstClusterCommittedHandoffDate_grouper        |2022-01                    |2
FirstClusterCommittedLiveDate_grouper       |2020-03                    |1
FirstClusterCommittedLiveDate_grouper       |2020-12                    |1
FirstClusterCommittedLiveDate_grouper       |2021-03                    |5
FirstClusterCommittedLiveDate_grouper       |2021-05                    |1
FirstClusterCommittedLiveDate_grouper       |2021-06                    |1
FirstClusterCommittedLiveDate_grouper       |2021-07                    |1
FirstClusterCommittedLiveDate_grouper       |2021-08                    |1
FirstClusterCommittedLiveDate_grouper       |2021-11                    |1
FirstClusterCommittedLiveDate_grouper       |2021-12                    |1
FirstClusterCommittedLiveDate_grouper       |2022-01                    |2
FirstClusterCommittedLiveDate_grouper       |2022-05                    |1
ProjectedDateLive_grouper               |2019-09                    |1
ProjectedDateLive_grouper               |2020-03                    |1
ProjectedDateLive_grouper               |2020-06                    |4
ProjectedDateLive_grouper               |2020-07                    |1
ProjectedDateLive_grouper               |2020-08                    |3
ProjectedDateLive_grouper               |2020-10                    |2
ProjectedDateLive_grouper               |2021-01                    |1
ProjectedDateLive_grouper               |2021-02                    |3
TargetPPAPreparationStartDate_grouper       |2019-09                    |1
TargetPPAPreparationStartDate_grouper       |2020-03                    |1
TargetPPAPreparationStartDate_grouper       |2020-06                    |4
TargetPPAPreparationStartDate_grouper       |2020-07                    |1
TargetPPAPreparationStartDate_grouper       |2020-08                    |3
TargetPPAPreparationStartDate_grouper       |2020-10                    |2
TargetPPAPreparationStartDate_grouper       |2021-01                    |1
TargetPPAPreparationStartDate_grouper       |2021-02                    |3
When I try to plot it, I get a strange error:

base = alt.Chart(dfChart).properties(width=600)

line = base.mark_line().encode(
    x='DateValue_',
    y=' DateValue_ItemCnt',
    color='DateFields_'
)

rule = base.mark_rule().encode(
    y='average(DateValue_ItemCnt)',
    color='DateFields_',
    size=alt.value(2)
)

line + rule
I get this error:

ValueError:  DateValue_ItemCnt encoding field is specified without a type; the type cannot be inferred because it does not match any column in the data.

alt.LayerChart(...)
ValueError: Dataframe contains invalid column name: ('DateFields', ''). Column names must be strings
If I remove this line of code:

dfChart.columns = ["_".join(x) for x in dfChart.columns.ravel()]
then I get this error:

ValueError:  DateValue_ItemCnt encoding field is specified without a type; the type cannot be inferred because it does not match any column in the data.

alt.LayerChart(...)
ValueError: Dataframe contains invalid column name: ('DateFields', ''). Column names must be strings

Is the aggregation making the ItemCnt column unavailable for plotting? Is there any way to fix this?

There is an extra space in y=' DateValue_ItemCnt' (before the field name). Remove it, and the code will look like this:

base = alt.Chart(dfChart).properties(width=600)

line = base.mark_line().encode(
    x='DateValue_',
    y='DateValue_ItemCnt', # you have extra space in this y=' DateValue_ItemCnt', remove it
    color='DateFields_'
)

rule = base.mark_rule().encode(
    y='average(DateValue_ItemCnt)',
    color='DateFields_',
    size=alt.value(2)
)
line + rule
And the output will be: [chart image not preserved]

Note: added the rest of the code:

import pandas as pd
datelisting = {
'FirstClusterCommittedDockDate_grouper':['2019-11','2021-01','2021-04','2021-01','2020-12','2021-02','2020-12','2020-12','2021-03','2020-12','2021-09','2021-09','2020-11','2021-09','2021-11','2021-08'],
'FirstClusterCommittedHandoffDate_grouper':['2020-03','2021-01','2021-06','2021-03','2021-02','2021-04','2021-02','2021-02','2021-10','2021-02','2021-10','2021-11','2020-12','2021-11','2022-01','2022-01'],
'FirstClusterCommittedLiveDate_grouper':['2020-03','2021-03','2021-06','2021-03','2021-03','2021-07','2021-03','2021-03','2021-08','2021-05','2021-12','2021-11','2020-12','2022-05','2022-01','2022-01'],
'TargetPPAPreparationStartDate_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01'],
'ProjectedDateLive_grouper':['2019-09','2020-03','2020-07','2020-06','2020-06','2020-06','2020-08','2020-08','2020-06','2020-08','2021-02','2021-02','2020-10','2020-10','2021-02','2021-01']
}


datesDf = pd.DataFrame(datelisting).melt().dropna().rename(columns={'variable':'DateFields','value':'DateValue'}).reset_index().drop('index',axis=1)

dfChart = datesDf.groupby(['DateFields',  'DateValue']).agg({'DateValue': ['count']}).reset_index().dropna().rename(columns = { 'count':'ItemCnt'})


dfChart.columns = ["_".join(x) for x in dfChart.columns.ravel()]
I ran this on altair version 4.1.0. You can find your version with:

import altair as alt
alt.__version__
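
A stray space in a field name is easy to miss by eye. As a minimal sketch, the check below lists encoding field names that do not exactly match a column. The helper name missing_fields is hypothetical and not part of pandas or altair, and it only handles plain field names, not shorthand such as 'average(...)'.

```python
import pandas as pd

def missing_fields(df, fields):
    """Return the requested field names that are not exact column names of df."""
    columns = set(df.columns)
    return [f for f in fields if f not in columns]

# A one-row stand-in for dfChart with the flattened column names.
df = pd.DataFrame({
    'DateFields_': ['FirstClusterCommittedDockDate_grouper'],
    'DateValue_': ['2020-12'],
    'DateValue_ItemCnt': [4],
})

# The second name carries the leading space from the question.
bad = missing_fields(df, ['DateValue_', ' DateValue_ItemCnt', 'DateFields_'])
print(bad)
```

Any name this reports is one that Altair cannot match to a column, which is the situation the first ValueError describes.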

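Thanks for the suggestion. When I ran the code with the extra space removed, I got the following error: ValueError: Dataframe contains invalid column name: ('DateFields', ''). Column names must be strings alt.LayerChart(...)

Added the full code, and note that I ran it on altair version 4.1.0.

Worked beautifully. Thanks for your help!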
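
The "Column names must be strings" error quoted above appears to come from the tuple column names that agg produces when it is given a list of functions. One way to avoid the flattening step entirely is to count with size() and name the result column directly. This is a sketch on a small made-up dataset (the DockDate and LiveDate columns are illustrative, not the original data), and it assumes there are no missing values, in which case size() and count() agree.

```python
import pandas as pd

# Small illustrative data, not the dataset from the question.
dates = {
    'DockDate': ['2020-12', '2020-12', '2021-01'],
    'LiveDate': ['2021-03', '2021-03', '2021-03'],
}
long_df = pd.DataFrame(dates).melt(var_name='DateFields', value_name='DateValue')

# Passing a list of functions to agg yields two-level (tuple) column names.
multi = (
    long_df.groupby(['DateFields', 'DateValue'])
    .agg({'DateValue': ['count']})
    .reset_index()
)
print(multi.columns.tolist())

# size() returns a Series; reset_index(name=...) gives plain string columns.
flat = (
    long_df.groupby(['DateFields', 'DateValue'])
    .size()
    .reset_index(name='ItemCnt')
)
print(flat)
```

With flat, the encodings would refer to 'DateValue', 'ItemCnt', and 'DateFields' instead of the joined names ending in an underscore.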