Python pandas: group by date and count, then turn the counts into column names

I have this DataFrame:

import pandas as pd
from datetime import datetime
df = pd.DataFrame([
    {"_id": "1", "date": datetime.strptime("2020-09-29 07:00:00", '%Y-%m-%d %H:%M:%S'), "status": "started"},
    {"_id": "2", "date": datetime.strptime("2020-09-29 14:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
    {"_id": "3", "date": datetime.strptime("2020-09-25 17:00:00", '%Y-%m-%d %H:%M:%S'), "status": "started"},
    {"_id": "4", "date": datetime.strptime("2020-09-17 09:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
    {"_id": "5", "date": datetime.strptime("2020-09-19 07:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
    {"_id": "6", "date": datetime.strptime("2020-09-19 08:00:00", '%Y-%m-%d %H:%M:%S'), "status": "end"},
]).set_index('date')
It looks like this:

                    _id   status
date                            
2020-09-29 07:00:00   1  started
2020-09-29 14:00:00   2      end
2020-09-25 17:00:00   3  started
2020-09-17 09:00:00   4      end
2020-09-19 07:00:00   5      end
2020-09-19 08:00:00   6      end

I'm trying to group by day and count each status, but I want the status value to appear in the column names.

Here is the desired output:

                      status_started  status_end
date
2020-09-29 07:00:00    1                1
2020-09-25 17:00:00    1                0
2020-09-17 09:00:00    0                1
2020-09-19 07:00:00    0                2

I tried this:

df = df.groupby([pd.Grouper(freq='d'), 'status']).agg({'status': "count"})
df = df.reset_index(level="status")

out: 
                    status
date       status         
2020-09-17 end           1
2020-09-19 end           2
2020-09-25 started       1
2020-09-29 end           1
2020-09-29 started       1

But I couldn't reshape the df into the desired form.

You just need unstack:

df.groupby([pd.Grouper(freq='d'), 'status']).size().unstack('status', fill_value=0)
Output:

status      end  started
date                    
2020-09-17    1        0
2020-09-19    2        0
2020-09-25    0        1
2020-09-29    1        1
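
The output above still has bare `end` / `started` column labels, while the question asks for `status_end` / `status_started`. A minimal sketch of one way to get there, assuming the same sample data as in the question: chain `add_prefix` after the `unstack`. The variable name `counts` and the final `rename_axis(columns=None)` (which only drops the leftover `status` label above the columns) are my additions, not part of the original answer.

```python
import pandas as pd

# Same sample data as in the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame(
    {
        "_id": ["1", "2", "3", "4", "5", "6"],
        "date": pd.to_datetime([
            "2020-09-29 07:00:00", "2020-09-29 14:00:00", "2020-09-25 17:00:00",
            "2020-09-17 09:00:00", "2020-09-19 07:00:00", "2020-09-19 08:00:00",
        ]),
        "status": ["started", "end", "started", "end", "end", "end"],
    }
).set_index("date")

counts = (
    df.groupby([pd.Grouper(freq="D"), "status"])  # one group per (day, status)
      .size()                                     # number of rows in each group
      .unstack("status", fill_value=0)            # statuses become columns, gaps become 0
      .add_prefix("status_")                      # end -> status_end, started -> status_started
      .rename_axis(columns=None)                  # drop the "status" label on the column axis
)
print(counts)
```

The columns come out in alphabetical order (`status_end` first); select them explicitly, e.g. `counts[["status_started", "status_end"]]`, if the order matters.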

You can try crosstab:

d = pd.crosstab(df.index.date, df['status'])\
      .rename_axis('date').add_prefix('status_')
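
For reference, here is a self-contained sketch of the crosstab approach against the question's sample data. The `reindex(columns=...)` step and the `wanted` list are my own additions, used to put the columns in the order shown in the desired output; they assume you know the status values in advance.

```python
from datetime import datetime

import pandas as pd

# Same sample data as in the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame(
    {
        "_id": ["1", "2", "3", "4", "5", "6"],
        "date": pd.to_datetime([
            "2020-09-29 07:00:00", "2020-09-29 14:00:00", "2020-09-25 17:00:00",
            "2020-09-17 09:00:00", "2020-09-19 07:00:00", "2020-09-19 08:00:00",
        ]),
        "status": ["started", "end", "started", "end", "end", "end"],
    }
).set_index("date")

# Column order taken from the desired output in the question (an assumption
# about which statuses exist).
wanted = ["status_started", "status_end"]

d = (
    pd.crosstab(df.index.date, df["status"])    # rows: calendar day, columns: status
      .rename_axis("date")                      # name the row index "date"
      .add_prefix("status_")                    # end -> status_end, started -> status_started
      .reindex(columns=wanted, fill_value=0)    # fix the order; absent statuses become 0
)
print(d)
```

Note that the row index here holds plain `datetime.date` objects rather than timestamps, so look rows up with something like `d.loc[datetime(2020, 9, 19).date()]`.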

