Downsampling in Python


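I am trying to downsample my data, which is recorded at one-minute intervals and indexed by datetime. But when I call pandas resample, it returns only one column, while my data contains six columns.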

import pandas as pd
from matplotlib import pyplot

# Date and Time columns are combined into a single datetime index
dataset = pd.read_csv('household_power_consumption.txt', sep=';', header=0,
                      low_memory=False, infer_datetime_format=True,
                      parse_dates={'datetime': [0, 1]}, index_col=['datetime'])
dataset.head()
dataset = dataset.resample('H', how='mean', label='left')
a = dataset.head()
print(a)
dataset.to_csv('Downsampled_House_data.csv')

dataset.resample returns only one column.

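If this is the household_power_consumption.txt data file from the question, the problem is that some missing values are recorded as ?.

So the parameter na_values='?' is necessary: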

dataset = pd.read_csv('household_power_consumption.txt', 
                      sep=';', 
                      header=0, 
                      low_memory=False, 
                      infer_datetime_format=True, 
                      parse_dates={'datetime': [0,1]},  #Date and time has been combined
                      index_col=['datetime'],
                      na_values='?') 
print(dataset.head())
                     Global_active_power  Global_reactive_power  Voltage  \
datetime                                                                   
2006-12-16 17:24:00                4.216                  0.418   234.84   
2006-12-16 17:25:00                5.360                  0.436   233.63   
2006-12-16 17:26:00                5.374                  0.498   233.29   
2006-12-16 17:27:00                5.388                  0.502   233.74   
2006-12-16 17:28:00                3.666                  0.528   235.68   

                     Global_intensity  Sub_metering_1  Sub_metering_2  \
datetime                                                                
2006-12-16 17:24:00              18.4             0.0             1.0   
2006-12-16 17:25:00              23.0             0.0             1.0   
2006-12-16 17:26:00              23.0             0.0             2.0   
2006-12-16 17:27:00              23.0             0.0             1.0   
2006-12-16 17:28:00              15.8             0.0             1.0   

                     Sub_metering_3  
datetime                             
2006-12-16 17:24:00            17.0  
2006-12-16 17:25:00            16.0  
2006-12-16 17:26:00            17.0  
2006-12-16 17:27:00            17.0  
2006-12-16 17:28:00            17.0  
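
To see why na_values matters, here is a minimal sketch on a made-up three-row sample (the rows and the reduced column set are assumptions for illustration, not taken from the real file). A single ? in a column is enough for read_csv to treat the whole column as text instead of float64, and text columns cannot be averaged when resampling.

```python
import io

import pandas as pd

# Hypothetical sample in the same ';'-separated layout as the data file.
raw = (
    "Date;Time;Global_active_power;Voltage\n"
    "16/12/2006;17:24:00;4.216;234.84\n"
    "16/12/2006;17:25:00;?;?\n"
    "16/12/2006;17:26:00;5.374;233.29\n"
)

# Without na_values, the '?' forces both measurement columns to a text dtype.
without_na = pd.read_csv(io.StringIO(raw), sep=";")

# With na_values='?', the '?' becomes NaN and the columns parse as float64.
with_na = pd.read_csv(io.StringIO(raw), sep=";", na_values="?")

print(without_na.dtypes)
print(with_na.dtypes)
```

Comparing the two dtype listings should show the measurement columns switching from a text dtype to float64 once ? is declared as a missing-value marker.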


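Welcome to SO. Please use only relevant tags. Thanks.

What do print(dataset.info()) and print(dataset.head()) return?

Hi, can you explain what your input looks like and what output you expect?

@jezrael dataset.head returns only one column, which contains a single character. It shows that every column other than the returned one, which is float64, is of object type.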
print (dataset.info())
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2075259 entries, 2006-12-16 17:24:00 to 2010-11-26 21:02:00
Data columns (total 7 columns):
Global_active_power      float64
Global_reactive_power    float64
Voltage                  float64
Global_intensity         float64
Sub_metering_1           float64
Sub_metering_2           float64
Sub_metering_3           float64
dtypes: float64(7)
memory usage: 126.7 MB
None
dataset=dataset.resample('H', label='left').mean()
print(dataset.head())
                     Global_active_power  Global_reactive_power     Voltage  \
datetime                                                                      
2006-12-16 17:00:00             4.222889               0.229000  234.643889   
2006-12-16 18:00:00             3.632200               0.080033  234.580167   
2006-12-16 19:00:00             3.400233               0.085233  233.232500   
2006-12-16 20:00:00             3.268567               0.075100  234.071500   
2006-12-16 21:00:00             3.056467               0.076667  237.158667   

                     Global_intensity  Sub_metering_1  Sub_metering_2  \
datetime                                                                
2006-12-16 17:00:00         18.100000             0.0        0.527778   
2006-12-16 18:00:00         15.600000             0.0        6.716667   
2006-12-16 19:00:00         14.503333             0.0        1.433333   
2006-12-16 20:00:00         13.916667             0.0        0.000000   
2006-12-16 21:00:00         13.046667             0.0        0.416667   

                     Sub_metering_3  
datetime                             
2006-12-16 17:00:00       16.861111  
2006-12-16 18:00:00       16.866667  
2006-12-16 19:00:00       16.683333  
2006-12-16 20:00:00       16.783333  
2006-12-16 21:00:00       17.216667
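
If the frame is already in memory with ? stored as text, one possible repair (an alternative to re-reading the file, not part of the answer above) is to coerce every column with pd.to_numeric before resampling. The sketch below uses made-up minute-level data; the values are assumptions, and '60min' is used as the hourly frequency alias because it is accepted across pandas versions.

```python
import pandas as pd

# Hypothetical two hours of minute-level readings; "?" marks a missing value.
index = pd.date_range("2006-12-16 17:00", periods=120, freq="min")
frame = pd.DataFrame(
    {
        "Voltage": ["234.0", "?", "236.0"] + ["235.0"] * 117,
        "Global_intensity": [float(i % 4) for i in range(120)],
    },
    index=index,
)

# Coerce each column to numbers; anything unparseable (the "?") becomes NaN.
numeric = frame.apply(pd.to_numeric, errors="coerce")

# Downsample to hourly means; NaN values are skipped by mean().
hourly = numeric.resample("60min", label="left").mean()

print(hourly)
```

Both columns survive the resample here because both are numeric by the time mean() runs.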