C#: Resampling, aggregation, and interpolation of TimeSeries trend data
While analyzing energy demand and consumption data, I ran into a problem resampling and interpolating time-series trend data. Example dataset:
timestamp               value (kWh)
----------------------  -----------
12/19/2011 5:43:21 PM   79178
12/19/2011 5:58:21 PM   79179.88
12/19/2011 6:13:21 PM   79182.13
12/19/2011 6:28:21 PM   79183.88
12/19/2011 6:43:21 PM   79185.63
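From these observations, I would like to roll the values up over a period of time using some aggregation, with the frequency set to a single time unit.
For example, hourly intervals, with any gaps of missing data filled in: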
timestamp               value (approx)
----------------------  --------------
12/19/2011 5:00:00 PM   79173
12/19/2011 6:00:00 PM   79179
12/19/2011 7:00:00 PM   79186
For a linear algorithm, it seems I would take the difference in time and multiply the value by that factor:
TimeSpan ts = current - previous;
Double factor = ts.TotalMinutes / period;
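The value and the timestamp can then be calculated from that factor.
With so much information available, I am not sure why it is so hard to find the most elegant approach to this problem.
First, is there an open-source analysis library that can be recommended?
Are there any recommendations for a programmatic approach? Ideally in C#, or possibly in SQL.
Alternatively, can anyone point me to similar questions (with answers)?
Maybe something like this: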
SELECT DATE_FORMAT(timestamp, '%Y-%m-%d %H') as day_hour, AVG(value) as aprox FROM table GROUP BY day_hour
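As a rough C# counterpart to the SQL above, the same grouping could be sketched with LINQ. The names HourlyAverageSketch and AverageByHour, and the choice of KeyValuePair<DateTime, double> to hold a sample, are assumptions made for illustration and do not come from the original post. Like the SQL, this only averages the samples that exist within each hour; it does not fill gaps or interpolate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HourlyAverageSketch
{
    // Groups (timestamp, value) samples by the hour they fall in and
    // averages the values of each group. Hours without samples are not
    // produced, so gaps in the input remain gaps in the output.
    public static List<KeyValuePair<DateTime, double>> AverageByHour(
        IEnumerable<KeyValuePair<DateTime, double>> samples)
    {
        return samples
            // Truncate each timestamp to the start of its hour.
            .GroupBy(s => new DateTime(s.Key.Year, s.Key.Month, s.Key.Day, s.Key.Hour, 0, 0))
            .OrderBy(g => g.Key)
            .Select(g => new KeyValuePair<DateTime, double>(g.Key, g.Average(s => s.Value)))
            .ToList();
    }
}
```

For a cumulative meter reading such as the kWh counter in the question, an hourly average is only a coarse summary; interpolating the reading at each full hour is usually closer to what is wanted.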
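What database engine are you using? For what you are doing, it looks like the TimeSpan may be declared incorrectly. For starters, try ts = (TimeSpan)(current - previous); and make sure that current and previous are both of type DateTime (in which case the subtraction already yields a TimeSpan). If you want to do calculations or roll-ups, I would look at the TotalHours property. Here is an example you can look at if you like; it checks whether the last write/modified time is within the past 24 hours: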
if (((TimeSpan)(DateTime.Now - fiUpdateFileFile.LastWriteTime)).TotalHours < 24){}
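I know this is different from your case, but you get the gist of how to use TotalHours.

By using the ticks that are used internally to represent a DateTime, you get the most accurate values possible. Since these ticks do not restart at zero at midnight, you will not run into problems at day boundaries.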
// Sample times and full hour
DateTime lastSampleTimeBeforeFullHour = new DateTime(2011, 12, 19, 17, 58, 21);
DateTime firstSampleTimeAfterFullHour = new DateTime(2011, 12, 19, 18, 13, 21);
DateTime fullHour = new DateTime(2011, 12, 19, 18, 00, 00);
// Times as ticks (most accurate time unit)
long t0 = lastSampleTimeBeforeFullHour.Ticks;
long t1 = firstSampleTimeAfterFullHour.Ticks;
long tf = fullHour.Ticks;
// Energy samples
double e0 = 79179.88; // kWh before full hour
double e1 = 79182.13; // kWh after full hour
double ef; // interpolated energy at full hour
ef = e0 + (tf - t0) * (e1 - e0) / (t1 - t0); // ==> 79180.1275 kWh
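Explanation of the formula: in geometry, similar triangles are triangles that have the same shape but different sizes. The formula above is based on the fact that the ratio of any two sides in one triangle is the same as the ratio of the corresponding sides in a similar triangle. If you have a triangle A B C and a similar triangle a b c, then
A : B = a : b
The equality of two ratios is called a proportion.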
We can apply this proportion rule to our problem:
(e1 - e0) / (t1 - t0)  =  (ef - e0) / (tf - t0)
--- large triangle --     --- small triangle --
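Solving the proportion for ef gives ef = e0 + (tf - t0) * (e1 - e0) / (t1 - t0). A minimal sketch of how that could be applied at every full hour covered by a list of samples is shown here. The names HourlyInterpolationSketch, InterpolateAt and AtFullHours are hypothetical, and the code assumes the samples are sorted by ascending timestamp with no duplicate timestamps.

```csharp
using System;
using System.Collections.Generic;

public static class HourlyInterpolationSketch
{
    // Linear interpolation at time tf between the samples (t0, e0) and (t1, e1).
    // Ticks are used so that day boundaries need no special handling.
    public static double InterpolateAt(DateTime t0, double e0, DateTime t1, double e1, DateTime tf)
    {
        return e0 + (tf.Ticks - t0.Ticks) * (e1 - e0) / (t1.Ticks - t0.Ticks);
    }

    // Yields one interpolated value for each full hour that lies between the
    // first and the last sample. Hours outside that range are not extrapolated.
    public static IEnumerable<KeyValuePair<DateTime, double>> AtFullHours(
        IList<KeyValuePair<DateTime, double>> samples)
    {
        if (samples.Count < 2) yield break;

        // First full hour at or after the first sample.
        DateTime first = samples[0].Key;
        DateTime hour = new DateTime(first.Year, first.Month, first.Day, first.Hour, 0, 0);
        if (hour < first) hour = hour.AddHours(1);

        DateTime last = samples[samples.Count - 1].Key;
        int i = 0;
        while (hour <= last)
        {
            // Advance to the pair of samples that brackets this hour.
            while (samples[i + 1].Key < hour) i++;

            double value = InterpolateAt(
                samples[i].Key, samples[i].Value,
                samples[i + 1].Key, samples[i + 1].Value,
                hour);
            yield return new KeyValuePair<DateTime, double>(hour, value);

            hour = hour.AddHours(1);
        }
    }
}
```

With the five samples from the question, only 6:00:00 PM lies between the first and last sample, so a single value is produced. The 5:00 PM and 7:00 PM rows in the desired output would require extrapolation, which this sketch deliberately leaves out.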
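I wrote a LINQ function to interpolate and normalize time series data so that it can be aggregated/merged. The resampling function is shown below. I have also written an article about this technique on CodeProject.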
// Namespaces required by the code below.
using System;
using System.Collections.Generic;

// The function is an extension method, so it must be defined in a static class.
public static class ResampleExt
{
    // Resample an input time series and create a new time series between two
    // particular dates sampled at a specified time interval.
    public static IEnumerable<OutputDataT> Resample<InputValueT, OutputDataT>(
        // Input time series to be resampled.
        this IEnumerable<InputValueT> source,
        // Start date of the new time series.
        DateTime startDate,
        // Date at which the new time series will have ended.
        DateTime endDate,
        // The time interval between samples.
        TimeSpan resampleInterval,
        // Function that selects a date/time value from an input data point.
        Func<InputValueT, DateTime> dateSelector,
        // Interpolation function that produces a new interpolated data point
        // at a particular time between two input data points.
        Func<DateTime, InputValueT, InputValueT, double, OutputDataT> interpolator
    )
    {
        // ... argument checking omitted ...

        //
        // Manually enumerate the input time series...
        // This is manual because the first data point must be treated specially.
        // The using block disposes the enumerator when iteration finishes.
        //
        using (var e = source.GetEnumerator())
        {
            if (e.MoveNext())
            {
                // Initialize working date to the start date, this variable will be used to
                // walk forward in time towards the end date.
                var workingDate = startDate;

                // Extract the first data point from the input time series.
                var firstDataPoint = e.Current;

                // Extract the first data point's date using the date selector.
                var firstDate = dateSelector(firstDataPoint);

                // Loop forward in time until we reach either the date of the first
                // data point or the end date, whichever comes first.
                while (workingDate < endDate && workingDate <= firstDate)
                {
                    // Until we reach the date of the first data point,
                    // use the interpolation function to generate an output
                    // data point from the first data point.
                    yield return interpolator(workingDate, firstDataPoint, firstDataPoint, 0);

                    // Walk forward in time by the specified time period.
                    workingDate += resampleInterval;
                }

                //
                // Setup current data point... we will now loop over input data points and
                // interpolate between the current and next data points.
                //
                var curDataPoint = firstDataPoint;
                var curDate = firstDate;

                //
                // After we have reached the first data point, loop over remaining input data points until
                // either the input data points have been exhausted or we have reached the end date.
                //
                while (workingDate < endDate && e.MoveNext())
                {
                    // Extract the next data point from the input time series.
                    var nextDataPoint = e.Current;

                    // Extract the next data point's date using the date selector.
                    var nextDate = dateSelector(nextDataPoint);

                    // Calculate the time span between the dates of the current and next data points.
                    // (This must be measured from curDate, not firstDate, or timePct is wrong
                    // for every segment after the first.)
                    var timeSpan = nextDate - curDate;

                    // Loop forward in time until we have moved beyond the date of the next data point.
                    // The end date itself is excluded, as in the other loops.
                    while (workingDate < endDate && workingDate < nextDate)
                    {
                        // The time span from the current date to the working date.
                        var curTimeSpan = workingDate - curDate;

                        // The time between the dates as a fraction (a 0-1 value).
                        var timePct = curTimeSpan.TotalSeconds / timeSpan.TotalSeconds;

                        // Interpolate an output data point at the particular time between
                        // the current and next data points.
                        yield return interpolator(workingDate, curDataPoint, nextDataPoint, timePct);

                        // Walk forward in time by the specified time period.
                        workingDate += resampleInterval;
                    }

                    // Swap the next data point into the current data point so we can move on and continue
                    // the interpolation with each subsequent data point assuming the role of
                    // 'next data point' in the next iteration of this loop.
                    curDataPoint = nextDataPoint;
                    curDate = nextDate;
                }

                // Finally loop forward in time until we reach the end date.
                while (workingDate < endDate)
                {
                    // Interpolate an output data point generated from the last data point.
                    yield return interpolator(workingDate, curDataPoint, curDataPoint, 1);

                    // Walk forward in time by the specified time period.
                    workingDate += resampleInterval;
                }
            }
        }
    }
}