Python: How to apply resample to a DataFrame with non-numeric values

python pandas dataframe

I understand that resample cannot be applied to non-numeric values. I would like to apply resample('30S') to the input df, as shown below:

Input_DF:

    eventTime          uuid  ts                   m_op  op.prg  op.tr     w.cycle cycle_type
0  2017-04-27 01:22:22 id1  2017-04-27 02:30:30   w     0.0     01:34:48  3       type_a                                                       
1  2017-04-27 01:23:16 id1  2017-04-27 02:31:00   w     1.0     01:33:54  3       type_a                                                      
2  2017-04-27 01:25:10 id1  2017-04-27 02:41:00   w     2.0     01:33:00  3       type_a                                                      
3  2017-04-27 01:25:32 id1  2017-04-27 02:42:45   w     3.0     01:32:00  3       type_a                                                     
4  2017-04-27 01:25:45 id1  2017-04-27 02:52:45   r     4.0     01:32:00  2       type_a

Output_DF:

    eventTime          uuid  ts                   m_op  op.prg  op.tr     w.cycle cycle_type
0  2017-04-27 01:22:30 id1  2017-04-27 02:30:30   w     0.0     01:34:48  3       type_a                                                       
1  2017-04-27 01:23:00 id1  2017-04-27 02:30:30   w     0.0     01:34:48  3       type_a                                                       
2  2017-04-27 01:23:30 id1  2017-04-27 02:31:00   w     1.0     01:33:54  3       type_a                                                      
3  2017-04-27 01:24:00 id1  2017-04-27 02:31:00   w     1.0     01:33:54  3       type_a                                                      
4  2017-04-27 01:24:30 id1  2017-04-27 02:31:00   w     1.0     01:33:54  3       type_a                                                      
5  2017-04-27 01:25:00 id1  2017-04-27 02:31:00   w     1.0     01:33:54  3       type_a                                                      
6  2017-04-27 01:25:30 id1  2017-04-27 02:41:00   w     2.0     01:33:00  3       type_a                                                      
7  2017-04-27 01:26:00 id1             avg    +popular  3.5     Avg      +popular       type_a

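where avg__value computes the average of the time values that fall in the corresponding interval, +popular fills in the most frequent value (or the first value, if two values have the same rank), and avg is the usual mean.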
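For illustration, the +popular rule described above could be written as a small helper. This is only a sketch: the name most_popular is hypothetical and not part of pandas, and it assumes that "first value" means the value that appears first in the interval.

```python
from collections import Counter

import pandas as pd


def most_popular(values: pd.Series):
    """Return the most frequent non-null value; ties go to the value seen first."""
    # Counter keeps insertion order, and most_common() lists equal counts
    # in the order they were first encountered.
    counts = Counter(values.dropna())
    return counts.most_common(1)[0][0] if counts else None


print(most_popular(pd.Series(["w", "r"])))       # tie between "w" and "r"
print(most_popular(pd.Series(["w", "r", "r"])))  # "r" occurs most often
```

An empty interval returns None, which can later be filled from the previous interval.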

I have been trying the groupBy method, but without success.

Any suggestions would be greatly appreciated. Many thanks. carlo

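The first solution I implemented splits the whole problem into sub-problems, one for each non-numeric column, maps that column to numbers, and then merges the resulting solutions. Below is the part of the code that handles the m_op case: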

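import pandas as pd

# smart_home_df is the input DataFrame shown in the question.
smart_sub_schema_Operation_progress = ["eventTime", "uuid", "m_op"]
# .copy() avoids SettingWithCopyWarning when columns are assigned below.
operation_progress_sub_smart_home_df = smart_home_df[smart_sub_schema_Operation_progress].copy()
# Encode the m_op codes as numbers; the keys must match the case used in the data.
operation_progress_sub_smart_home_df['m_op'] = operation_progress_sub_smart_home_df['m_op'].map({'P': 0, 'W': 1, 'R': 2, 'S': 3, 'F': 4})
operation_progress_sub_smart_home_df['eventTime'] = pd.to_datetime(operation_progress_sub_smart_home_df['eventTime'])
operation_progress_sub_smart_home_df = operation_progress_sub_smart_home_df.set_index('eventTime')
# Older pandas versions defaulted to the mean; recent versions need an explicit aggregation.
resampled_operation_progress_sub_smart_home_df = operation_progress_sub_smart_home_df.resample('30S').mean(numeric_only=True).reset_index()
resampled_operation_progress_sub_smart_home_df["m_op"] = resampled_operation_progress_sub_smart_home_df["m_op"].astype(float)
# Fill empty 30-second bins with the last known value.
resampled_operation_progress_sub_smart_home_df = resampled_operation_progress_sub_smart_home_df.ffill()
# Decode the numbers back to the original codes (non-integer means map to NaN).
resampled_operation_progress_sub_smart_home_df['m_op'] = resampled_operation_progress_sub_smart_home_df['m_op'].map({0.0: 'P', 1.0: 'W', 2.0: 'R', 3.0: 'S', 4.0: 'F'})
print(resampled_operation_progress_sub_smart_home_df.to_string())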
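
As a possible alternative to encoding each column by hand, the per-column rules from the question can be passed to a single resample call through agg. The sketch below is not the code I actually used: the helper most_popular is a hypothetical name, the sample data is a subset of the question's input table, and label="right" is an assumption chosen because the desired output labels each interval by its end time.

```python
from collections import Counter

import pandas as pd


def most_popular(values):
    """Most frequent non-null value; ties go to the value seen first."""
    counts = Counter(values.dropna())
    return counts.most_common(1)[0][0] if counts else None


# Rows taken from the question's input table (subset of the columns).
df = pd.DataFrame(
    {
        "eventTime": pd.to_datetime(
            [
                "2017-04-27 01:22:22",
                "2017-04-27 01:23:16",
                "2017-04-27 01:25:10",
                "2017-04-27 01:25:32",
                "2017-04-27 01:25:45",
            ]
        ),
        "uuid": ["id1"] * 5,
        "m_op": ["w", "w", "w", "w", "r"],
        "op.prg": [0.0, 1.0, 2.0, 3.0, 4.0],
    }
)

resampled = (
    df.set_index("eventTime")
    .resample("30s", label="right")  # label each 30-second bin by its end time
    .agg({"uuid": most_popular, "m_op": most_popular, "op.prg": "mean"})
    .ffill()  # empty bins inherit the previous bin's values
    .reset_index()
)
print(resampled.to_string())
```

Non-numeric columns are aggregated with most_popular and numeric columns with the mean, so no mapping to and from numbers is needed. Time-like columns such as ts or op.tr would still need their own rule, for example converting them to datetime or timedelta before averaging.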