Recursive time series segmentation algorithm in Java

java algorithm recursion time-series linear-regression

I am doing time series analysis on stock market data and am trying to implement a piecewise linear segmentation algorithm, shown here:

    split(T[ta, tb]) – split a time series T of length n
    from time ta to time tb, where 0 ≤ a < b ≤ n
     1: Ttemp = ∅;
     2: εmin = ∞;
     3: εtotal = 0;
     4: for i = a to b do
     5:     εi = (pi − pi)^2;
     6:     if εmin > εi then
     7:         εmin = εi;
     8:         tk = ti;
     9:     end if
    10:     εtotal = εtotal + εi;
    11: end for
    12: ε = εtotal / (tb − ta);
    13: if t-test.reject(ε) then
    14:     Ttemp = Ttemp ∪ split(T[ta, tk]);
    15:     Ttemp = Ttemp ∪ split(T[tk, tb]);
    16: end if
    17: return Ttemp;

My time series class looks like this:

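import java.util.ArrayList;
import java.util.Date;

class MySeries {
    ArrayList<Date> time;   // timestamps of the observations
    Double[] value;         // observed values, one per timestamp
}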

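In the algorithm above, Ttemp is another instance of the time series. Lines 4-12 compute the error.
The problem is that I cannot implement the recursion and union parts (lines 14 and 15). It is not clear to me how to recurse and how to implement the union of MySeries objects.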

Edit

class Segmentation{
    static MySeries series1 = new MySeries();    //contains the complete time series
    static HashSet<MySeries> series_set = new HashSet<MySeries>();

    public static MySeries split(MySeries series, int start, int limit) throws ParseException{
        if(limit-start < 3){     //require at least 3 readings
            return null;
        }

        tTemp = MySeries.createSegment(series1, start, limit);

        double emin = 999999999, e, etotal=0, p, pcap;
        DescriptiveStatistics errors = new DescriptiveStatistics();

        for(int i=start;i<limit;i++){
            p = series1.y[i];
            pcap = series1.regress.predict(series1.x[i]);
            e = (p-pcap)*(p-pcap);
            errors.addValue(e);
            if(emin > e){
                emin = e;
                splitPoint = i;
            }
            etotal = etotal + e;
        }
        e = etotal/(limit-start);

        double std_dev_error = errors.getStandardDeviation();
        double tTstatistic = e/(std_dev_error/Math.sqrt(errors.getN()));

        if(ttest.tTest(tTstatistic, errors, 0.10)){
            union(split(series1, start, splitPoint));
            union(split(series1, splitPoint+1, limit));
        }
        return tTemp;
    }

    static void union(MySeries ms){
        series_set.add(ms);
    }
}

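I wrote the code above for the given algorithm, but I don't know why it runs into an infinite loop. I would appreciate any alternative design or modification of the code.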

"I don't know why it runs into an infinite loop"

It is easy to find out why. Just insert some print statements to see what is happening (or use a debugger). For example:

    if(ttest.tTest(tTstatistic, errors, 0.10)){
        System.out.printf("About to split %d .. %d .. %d%n", start, splitPoint, limit);
        union(split(series1, start, splitPoint));
        union(split(series1, splitPoint+1, limit));
    }
    else
        System.out.printf("Not splitting %d .. %d%n", start, limit);
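
One thing worth checking while tracing: the posted split method does not declare splitPoint locally, so it may be a field shared by all recursive calls. That is an assumption, since the declaration is not shown. If it is a field, the first recursive call overwrites it before the second call reads it. The toy sketch below is not the original code (sharedMid, withField and withLocal are made-up names) and only illustrates the effect:

```java
class SharedStateDemo {
    // Hypothetical field standing in for a split point that is not a local variable.
    static int sharedMid;

    // Stores the midpoint in the shared field, recurses, then reads the field back.
    static void withField(int lo, int hi, StringBuilder trace) {
        if (hi - lo < 2) return;
        sharedMid = (lo + hi) / 2;
        withField(lo, sharedMid, trace);      // deeper calls overwrite sharedMid
        trace.append(sharedMid).append(' ');  // no longer this call's own midpoint
    }

    // Same recursion, but the midpoint lives in a local variable of each call.
    static void withLocal(int lo, int hi, StringBuilder trace) {
        if (hi - lo < 2) return;
        int mid = (lo + hi) / 2;
        withLocal(lo, mid, trace);
        trace.append(mid).append(' ');        // still this call's own midpoint
    }

    public static void main(String[] args) {
        StringBuilder field = new StringBuilder();
        StringBuilder local = new StringBuilder();
        withField(0, 8, field);
        withLocal(0, 8, local);
        System.out.println("field: " + field);
        System.out.println("local: " + local);
    }
}
```

For the range 0..8 the field version records 1 1 1 while the local version records 1 2 4, so a second recursive call that relies on the field would receive a range the caller never intended.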

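Your εi is always zero! So the if statement after εi = (pi-pi)^2 will always be true.

(pi-pi)^2: isn't that just zero?

No, it is actually (pi - pi_cap)^2. It is just mathematical notation, never mind.

Where is the code for the split function? Once you have it, I think you just need to do a set union (∪) of the results (the equivalent of hashSet.addAll).

Sorry for the confusion: the algorithm itself is named split, so on lines 14 and 15 it calls itself recursively.

@Perception means that when you implement the method, you can use a hashSet as the type of Ttemp and use a line like

    Ttemp.addAll(split(timeseries));

to join the data returned by the recursive calls.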
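
The addAll suggestion can be sketched as follows. This is a minimal sketch under several assumptions, not the original poster's code and not necessarily the algorithm exactly as published. The series is a plain double[] indexed by position, and segments are returned as {start, end} index pairs with end exclusive. A simple mean-squared-error threshold (the made-up parameter maxMeanError) stands in for the t-test. The line is refitted on each segment, a segment that is not split is returned as a leaf, and the split point is kept in a local variable and restricted to interior indices so both halves are strictly smaller than the parent.

```java
import java.util.ArrayList;
import java.util.List;

class SplitSketch {
    static final int MIN_POINTS = 3; // smallest segment we are willing to fit a line to

    // Squared residuals (p_i - pHat_i)^2 of a least-squares line fitted to y[start..end).
    static double[] squaredResiduals(double[] y, int start, int end) {
        int n = end - start;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = start; i < end; i++) {
            sx += i; sy += y[i]; sxx += (double) i * i; sxy += i * y[i];
        }
        double denom = n * sxx - sx * sx;
        double slope = denom == 0 ? 0 : (n * sxy - sx * sy) / denom;
        double intercept = (sy - slope * sx) / n;
        double[] e = new double[n];
        for (int i = start; i < end; i++) {
            double r = y[i] - (intercept + slope * i);
            e[i - start] = r * r;
        }
        return e;
    }

    // Returns the segments as {start, end} pairs (end exclusive), left to right.
    static List<int[]> split(double[] y, int start, int end, double maxMeanError) {
        List<int[]> result = new ArrayList<>();   // plays the role of Ttemp
        int n = end - start;
        double[] e = squaredResiduals(y, start, end);
        double total = 0;
        for (double v : e) total += v;

        // Base case: the line fits well enough, or the segment is too short to cut in two.
        if (total / n <= maxMeanError || n < 2 * MIN_POINTS) {
            result.add(new int[] {start, end});
            return result;
        }

        // Local split point: the interior index with the smallest squared residual.
        int k = start + MIN_POINTS;
        for (int i = start + MIN_POINTS; i <= end - MIN_POINTS; i++) {
            if (e[i - start] < e[k - start]) k = i;
        }

        // Lines 14 and 15 of the pseudocode: union of the two recursive results.
        result.addAll(split(y, start, k, maxMeanError));
        result.addAll(split(y, k, end, maxMeanError));
        return result;
    }
}
```

A call such as SplitSketch.split(values, 0, values.length, 0.5) returns the segments in order. Because each recursive call receives a strictly shorter range, the recursion must terminate. A List is used here instead of a HashSet only to keep the segments ordered; a HashSet with addAll works the same way if the element type defines equals and hashCode.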