Weka ThresholdSelector and CostSensitiveClassifier with stream learning

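Are Weka's ThresholdSelector and/or CostSensitiveClassifier compatible with stream learning (updateable classifiers)? My goal is to use them together with weka.classifiers.meta.MOA so that learning focuses on a specific class and false negatives (FN) are minimized on imbalanced data.

Thanks a lot.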
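
To make the goal concrete, here is a minimal sketch of how a higher false-negative cost shifts a two-class decision. It is plain Java, not Weka code: the class `ExpectedCostDemo` and its methods are hypothetical names, and it assumes correct predictions cost nothing.

```java
/** Hypothetical helper, not part of Weka: minimum-expected-cost decision for two classes. */
final class ExpectedCostDemo {

    /**
     * Probability above which predicting "positive" is the cheaper decision.
     * costFN is the cost of missing a positive, costFP the cost of a false alarm.
     * Assumes correct predictions have zero cost.
     */
    static double costThreshold(double costFN, double costFP) {
        return costFP / (costFP + costFN);
    }

    /** Returns true when predicting positive has the lower expected cost. */
    static boolean predictPositive(double pPositive, double costFN, double costFP) {
        double costIfPredictNegative = pPositive * costFN;         // risk of a false negative
        double costIfPredictPositive = (1.0 - pPositive) * costFP; // risk of a false positive
        return costIfPredictPositive < costIfPredictNegative;
    }

    public static void main(String[] args) {
        // With FN five times as costly as FP, the threshold drops well below 0.5.
        System.out.println(costThreshold(5.0, 1.0));
        System.out.println(predictPositive(0.2, 5.0, 1.0));
        System.out.println(predictPositive(0.2, 1.0, 1.0));
    }
}
```

With equal costs an instance scored at 0.2 is labelled negative, while with a 5:1 cost ratio the same instance is labelled positive, which is the effect both meta-classifiers are meant to achieve.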

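According to the answer below, neither ThresholdSelector nor CostSensitiveClassifier supports updateable classifiers. So it is currently not possible to use these meta-classifiers for stream learning.

I have therefore drafted some code to create updateable versions of these classifiers. Any comments/suggestions are welcome.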

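Draft changes to weka.classifiers.meta.CostSensitiveClassifier to create an updateable version (this one "seems" to be the simpler of the two):

Draft changes to weka.classifiers.meta.ThresholdSelector to create an updateable version (comments/suggestions welcome):

Thanks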

/*
   weka.classifiers.meta.CostSensitiveClassifier: draft code update and questions to make it compatible with updateable classifiers
*/

import weka.classifiers.UpdateableClassifier;
// ....
// implements ... UpdateableClassifier
// ...
protected boolean classifierAlreadyUpdated = false;

public void updateClassifier(Instance instance) throws Exception {
    if (!instance.classIsMissing()) {

        if (m_Classifier == null)
            throw new Exception("No base classifier has been set!");

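        // Not sure how to properly check whether m_CostMatrix has already been fully initialized, here or elsewhere (i.e. by an external call to buildClassifier)
        if (m_CostMatrix == null || (m_CostMatrix.size() == 1 && !classifierAlreadyUpdated)) {
            // buildClassifier expects an Instances object, so wrap the single instance
            // (assumes instance.dataset() is not null)
            Instances firstBatch = new Instances(instance.dataset(), 1);
            firstBatch.add(instance);
            buildClassifier(firstBatch); // reuse the initialization process from buildClassifier
            classifierAlreadyUpdated = true;
        }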
        else {
            double factor = 1.0;
            int classValIndex = (int) instance.classValue();
            Object element = (classValIndex == 0) ? m_CostMatrix.getCell(classValIndex, 1) : m_CostMatrix.getCell(classValIndex, 0);

            if (element instanceof Double) {
                factor = ((Double) element).doubleValue();
            } else {
                factor = ((AttributeExpression) element).evaluateExpression(instance);
            }

            double weightOfInstance = instance.weight() * factor;

            if (!m_MinimizeExpectedCost) {
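                // setWeight returns void, so reweight a copy instead of chaining the call
                Instance weighted = (Instance) instance.copy();
                weighted.setWeight(weightOfInstance);
                ((UpdateableClassifier) m_Classifier).updateClassifier(weighted);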
            } else {
                ((UpdateableClassifier)m_Classifier).updateClassifier(instance);
            }
        }
    }
}
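
The reweighting step of the draft above can be tried in isolation. The following is a minimal sketch in plain Java, not Weka code: `StreamingCostWeighting` and `WeightedPrior` are hypothetical stand-ins, the cost matrix is a plain `double[][]` with rows as actual classes and columns as predicted classes, and the "learner" only keeps weighted class counts.

```java
/** Hypothetical stand-ins; none of these types come from Weka. */
final class StreamingCostWeighting {

    /** Cost of misclassifying an instance of trueClass in a two-class problem. */
    static double weightFactor(double[][] costMatrix, int trueClass) {
        int otherClass = (trueClass == 0) ? 1 : 0;
        return costMatrix[trueClass][otherClass];
    }

    /** Minimal incremental learner that only tracks weighted class counts. */
    static final class WeightedPrior {
        private final double[] weightSum = new double[2];

        /** Incorporates one labelled instance with the given weight. */
        void update(int trueClass, double weight) {
            weightSum[trueClass] += weight;
        }

        /** Weighted class frequency, or 0.5 before any update. */
        double probability(int cls) {
            double total = weightSum[0] + weightSum[1];
            return total == 0.0 ? 0.5 : weightSum[cls] / total;
        }
    }

    public static void main(String[] args) {
        double[][] cost = { {0.0, 1.0}, {5.0, 0.0} }; // missing class 1 costs five times more
        WeightedPrior learner = new WeightedPrior();
        int[] stream = {0, 0, 0, 0, 1};               // imbalanced stream of class labels
        for (int label : stream) {
            double instanceWeight = 1.0;              // original weight of the instance
            learner.update(label, instanceWeight * weightFactor(cost, label));
        }
        System.out.println(learner.probability(1));
    }
}
```

Each instance is reweighted on arrival and passed on immediately, so no batch of training data has to be kept, which is the property an updateable wrapper needs.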
/*
   weka.classifiers.meta.ThresholdSelector draft code update and questions to make it compatible with updateable classifiers

   I've got the big picture but I would need some help on findThreshold and the evaluation mode

   findThreshold:
       double low, high, maxValue and Instance maxInst => should become protected class fields so that
       they stay up to date across the build and all updates; they could be reset when buildClassifier is called

   Evaluation mode and getPredictions: should I create a new evaluation mode?
   EVAL_TRAINING_SET does not seem to be a good option, as it would skip the updateClassifier call

   Could I then modify toString and add the code below to getPredictions?
     case EVAL_STREAM:
       return eu.getTrainTestPredictions(m_Classifier, instances, instances);

   For updateClassifier, please find below a draft code
*/

import weka.classifiers.UpdateableClassifier;
// ....
// implements ... UpdateableClassifier
// ...
protected boolean classifierAlreadyUpdated = false;

public void updateClassifier(Instance instance) throws Exception {
    if (!instance.classIsMissing()) {

        if (m_Classifier == null)
            throw new Exception("No base classifier has been set!");

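        // Don't know how to properly check whether the classifier has already been fully initialized, here or elsewhere
        if (!classifierAlreadyUpdated) {
            // buildClassifier expects an Instances object, so wrap the single instance
            // (assumes instance.dataset() is not null)
            Instances firstBatch = new Instances(instance.dataset(), 1);
            firstBatch.add(instance);
            buildClassifier(firstBatch); // reuse the initialization process from buildClassifier
            classifierAlreadyUpdated = true;
        }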
        else {

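            // If examples of both classes have not been seen, the threshold cannot be adjusted.
            // NOTE: "stats" is a local AttributeStats in buildClassifier; this draft assumes it is
            // kept as a field and refreshed on every update.
            if (stats.distinctCount != 2) {
                System.err.println("Couldn't find examples of both classes. No adjustment.");
                ((UpdateableClassifier) m_Classifier).updateClassifier(instance);
            }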
            else {
                // m_DesignatedClass: already initialized via buildClassifier (called if needed during first update)

                if (m_manualThreshold) {
                    ((UpdateableClassifier) m_Classifier).updateClassifier(instance);
                    return;
                }

                // getPredictions expects an Instances object, so wrap the single instance
                Instances current = new Instances(instance.dataset(), 1);
                current.add(instance);

                if (stats.nominalCounts[m_DesignatedClass] == 1) {
                    System.err.println("Only 1 positive found: optimizing on training data");
                    findThreshold(getPredictions(current, EVAL_TRAINING_SET, 0));
                } else {
                    int numFolds = Math.min(m_NumXValFolds, stats.nominalCounts[m_DesignatedClass]);
                    findThreshold(getPredictions(current, m_EvalMode, numFolds));
                    if (m_EvalMode != EVAL_TRAINING_SET) {
                        ((UpdateableClassifier) m_Classifier).updateClassifier(instance);
                    }
                }
            }
        }
    }
}
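
One possible direction for the open evaluation-mode question is to score each instance before the learner is updated with it (test-then-train) and to search for the threshold over a bounded window of those scores. The sketch below is plain Java and only an illustration: `WindowedThreshold` is a hypothetical class, not part of Weka, and it optimizes F1 for the positive class with a simple quadratic scan.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch, not Weka code: F1-optimal threshold over a sliding window. */
final class WindowedThreshold {

    /** One recorded prediction: the score for the positive class and the true label. */
    private static final class Scored {
        final double score;
        final boolean positive;

        Scored(double score, boolean positive) {
            this.score = score;
            this.positive = positive;
        }
    }

    private final int capacity;
    private final Deque<Scored> window = new ArrayDeque<>();

    WindowedThreshold(int capacity) {
        this.capacity = capacity;
    }

    /** Records a score obtained BEFORE the learner was updated with this instance. */
    void add(double score, boolean positive) {
        if (window.size() == capacity) {
            window.removeFirst(); // forget the oldest prediction
        }
        window.addLast(new Scored(score, positive));
    }

    /** Returns the recorded score that gives the best F1 as a threshold, or 0.5 if none applies. */
    double bestThreshold() {
        double best = 0.5;
        double bestF1 = -1.0;
        for (Scored candidate : window) {
            double t = candidate.score;
            int tp = 0, fp = 0, fn = 0;
            for (Scored s : window) {
                boolean predictedPositive = s.score >= t;
                if (predictedPositive && s.positive) tp++;
                else if (predictedPositive) fp++;
                else if (s.positive) fn++;
            }
            if (tp == 0) continue; // F1 is undefined or zero without true positives
            double f1 = 2.0 * tp / (2.0 * tp + fp + fn);
            if (f1 > bestF1) {
                bestF1 = f1;
                best = t;
            }
        }
        return best;
    }
}
```

A caller would invoke `add` with the base classifier's score for each incoming instance, then update the base classifier, and refresh the threshold with `bestThreshold()` as often as needed. The window size is a free parameter of this sketch.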