Python xarray: resample a variable only on specific dates


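I have an xarray Dataset whose values are irregularly spaced in time. Sometimes there are two values on the same day, and sometimes there are gaps of several days: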

[Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-05 00:00:00'),
 Timestamp('2015-04-06 00:00:00'),
 Timestamp('2015-04-06 00:00:00')]
If I apply resample()

I end up with

[Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-02 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-04 00:00:00'),
 Timestamp('2015-04-05 00:00:00'),
 Timestamp('2015-04-06 00:00:00'),
 Timestamp('2015-04-07 00:00:00')]
But I am looking for a resampling that gives this instead:

[Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-05 00:00:00'),
 Timestamp('2015-04-06 00:00:00')]
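Which options do I have to use to get the .mean() of the values that fall on the same day, without adding new times to the dataset? I tried to reproduce the problem with a small sample: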

import numpy as np
import xarray as xr

value_1 = np.arange(0,7,1)
times = np.array(['2015-04-01', '2015-04-01', '2018-01-03', '2018-01-03', '2018-01-05', '2018-01-05', '2018-01-06'], dtype='datetime64')

time_ = xr.Dataset(
        data_vars={'value':    (('time'), value_1)},
        coords={'time': times})

time_resample = time_.resample(time='1D').mean().sel(time=slice('2015-04-01', '2015-04-06'))

print(time_.time, time_resample.time)


<xarray.DataArray 'time' (time: 7)>
array(['2015-04-01T00:00:00.000000000', '2015-04-01T00:00:00.000000000',
       '2018-01-03T00:00:00.000000000', '2018-01-03T00:00:00.000000000',
       '2018-01-05T00:00:00.000000000', '2018-01-05T00:00:00.000000000',
       '2018-01-06T00:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2015-04-01 2015-04-01 ... 2018-01-06 <xarray.DataArray 'time' (time: 6)>
array(['2015-04-01T00:00:00.000000000', '2015-04-02T00:00:00.000000000',
       '2015-04-03T00:00:00.000000000', '2015-04-04T00:00:00.000000000',
       '2015-04-05T00:00:00.000000000', '2015-04-06T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2015-04-01 2015-04-02 ... 2015-04-06

You have to group by time and apply the mean function:

time_groupby = time_.value.groupby('time').mean()
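As a self-contained sketch of the groupby approach, the example below uses the timestamps listed at the top of the question (all in April 2015); the names `ds`, `daily_mean`, and `via_resample` are illustrative and not from the original post. It also shows one possible alternative, assuming the data itself contains no NaN values: resample to daily bins and then drop the empty days.

```python
import numpy as np
import xarray as xr

# Sample data: duplicate timestamps on some days, gaps on others.
values = np.arange(0, 7, 1)
times = np.array(
    ['2015-04-01', '2015-04-01', '2015-04-03', '2015-04-03',
     '2015-04-05', '2015-04-06', '2015-04-06'],
    dtype='datetime64[ns]',
)
ds = xr.Dataset(data_vars={'value': ('time', values)}, coords={'time': times})

# Group by the unique timestamps and average within each group.
# No new times are introduced: only dates present in the data remain.
daily_mean = ds.groupby('time').mean()

# Alternative (assumes the input has no NaN values of its own):
# resample to daily bins, then drop the days that had no data.
via_resample = ds.resample(time='1D').mean().dropna('time')

print(daily_mean.time.values)
print(daily_mean.value.values)
```

Note that grouping on `time` directly only merges identical timestamps. If the timestamps carried different times of day, they would first have to be reduced to dates, for example by grouping on `ds.time.dt.floor('D')`.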

In this respect, xarray is very similar to pandas.
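To illustrate the parallel, here is a minimal pandas sketch of the same distinction, using the timestamps from the question; the variable names are hypothetical. Grouping on the index keeps only the dates that occur in the data, while resampling builds a regular daily grid and fills the gaps with NaN.

```python
import pandas as pd

# Same irregular timestamps as in the question.
index = pd.to_datetime(
    ['2015-04-01', '2015-04-01', '2015-04-03', '2015-04-03',
     '2015-04-05', '2015-04-06', '2015-04-06']
)
series = pd.Series(range(7), index=index, dtype='float64')

# groupby on the index: one row per date that actually occurs.
grouped = series.groupby(level=0).mean()

# resample: one row per calendar day, including days without data.
resampled = series.resample('1D').mean()

print(grouped)
print(resampled)
```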

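groupby('Date') or something similar is what you need here, rather than resample. If you have solved your problem, you can post an answer to your own question and accept it (or accept any other answer). That is a better approach than editing the solution into the question.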