Weka ThresholdSelector and CostSensitiveClassifier with stream learning


stream machine-learning


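Are Weka's ThresholdSelector and/or CostSensitiveClassifier compatible with stream learning (updateable classifiers)? My goal is to use them with weka.classifiers.meta.MOA so that learning focuses on a specific class and false negatives (FN) are minimized on some imbalanced data.

Thanks a lot.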

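According to the answer below, neither ThresholdSelector nor CostSensitiveClassifier supports updateable classifiers. It is therefore currently not possible to use these meta classifiers for stream learning.
So I have put together a code draft to create updateable versions of these classifiers. Any comments/suggestions are welcome.
weka.classifiers.meta.CostSensitiveClassifier: code update to create an updateable version (this one "seems" to be the simplest)
weka.classifiers.meta.ThresholdSelector: code update to create an updateable version (waiting for your comments/suggestions):
Thanks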

/* weka.classifiers.meta.CostSensitiveClassifier: draft code update and questions
   to make it compatible with updateable classifiers */
import weka.classifiers.UpdateableClassifier;
....
implements ... UpdateableClassifier;
...
protected boolean classifierAlreadyUpdated = false;

public void updateClassifier(Instance instance) throws Exception {
  if (!instance.classIsMissing()) {
    if (m_Classifier == null)
      throw new Exception("No base classifier has been set!");
    // not sure how to properly check whether m_CostMatrix has already been fully
    // initialized here or from elsewhere (i.e. external call to buildClassifier)
    if (m_CostMatrix == null || (m_CostMatrix.size() == 1 && !classifierAlreadyUpdated)) {
      // re-use the initialization process from buildClassifier;
      // buildClassifier expects an Instances object, so wrap the single instance
      Instances data = new Instances(instance.dataset(), 1);
      data.add(instance);
      buildClassifier(data);
      classifierAlreadyUpdated = true;
    } else {
      // binary case: cost of misclassifying the instance's actual class as the other class
      double factor = 1.0;
      int classValIndex = (int) instance.classValue();
      Object element = (classValIndex == 0)
          ? m_CostMatrix.getCell(classValIndex, 1)
          : m_CostMatrix.getCell(classValIndex, 0);
      if (element instanceof Double) {
        factor = ((Double) element).doubleValue();
      } else {
        factor = ((AttributeExpression) element).evaluateExpression(instance);
      }
      double weightOfInstance = instance.weight() * factor;
      if (!m_MinimizeExpectedCost) {
        // setWeight returns void, so reweight a copy and pass the copy on
        Instance weighted = (Instance) instance.copy();
        weighted.setWeight(weightOfInstance);
        ((UpdateableClassifier) m_Classifier).updateClassifier(weighted);
      } else {
        ((UpdateableClassifier) m_Classifier).updateClassifier(instance);
      }
    }
  }
}
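The reweighting step in the draft above can be tried in isolation, without Weka. The following is only a minimal sketch of that one idea for the binary case: the class name `CostWeightSketch`, the method `reweight` and the plain `double[][]` cost matrix are hypothetical stand-ins (not Weka API), and it assumes rows index the actual class and columns the predicted class.

```java
public class CostWeightSketch {

    /**
     * Multiplies an instance weight by the cost of misclassifying its
     * actual class as the other class (binary problems only).
     * costMatrix[actual][predicted] is an assumed layout for this sketch.
     */
    static double reweight(double[][] costMatrix, int actualClass, double weight) {
        int otherClass = (actualClass == 0) ? 1 : 0;
        return weight * costMatrix[actualClass][otherClass];
    }

    public static void main(String[] args) {
        // Missing a class-1 instance (a false negative) costs 5, a false positive costs 1.
        double[][] cost = {
            {0.0, 1.0},
            {5.0, 0.0}
        };
        System.out.println(reweight(cost, 1, 1.0)); // prints 5.0
        System.out.println(reweight(cost, 0, 2.0)); // prints 2.0
    }
}
```

With such a matrix, every class-1 instance reaches the base learner with five times its original weight, which is how the update shifts the learner's focus toward the class whose false negatives matter most.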

/* weka.classifiers.meta.ThresholdSelector: draft code update and questions
   to make it compatible with updateable classifiers

   I've got the big picture, but I would need some help on findThreshold and
   the evaluation mode.

   findThreshold: double low, high, maxValue and Instance maxInst
     => should become protected class properties in order to keep them updated
        across the build and all updates, and could be reset when calling
        buildClassifier

   Evaluation mode and getPredictions: should I create a new evaluation mode?
     EVAL_TRAINING_SET does not seem a good option as it would skip the
     updateClassifier. I could then modify toString and add the code below to
     getPredictions?
       case EVAL_STREAM:
         return eu.getTrainTestPredictions(m_Classifier, instances, instances);

   For updateClassifier, please find below a draft code */
import weka.classifiers.UpdateableClassifier;
....
implements ... UpdateableClassifier;
...
protected boolean classifierAlreadyUpdated = false;

public void updateClassifier(Instance instance) throws Exception {
  if (!instance.classIsMissing()) {
    if (m_Classifier == null)
      throw new Exception("No base classifier has been set!");
    // buildClassifier and getPredictions expect an Instances object,
    // so wrap the single instance
    Instances single = new Instances(instance.dataset(), 1);
    single.add(instance);
    // Don't know how to properly check if the classifier has already been
    // fully initialized here or from elsewhere
    if (!classifierAlreadyUpdated) {
      buildClassifier(single); // re-use the initialization process from buildClassifier
      classifierAlreadyUpdated = true;
    } else {
      // 'stats' is not defined in this draft; presumably the class-attribute
      // AttributeStats computed in buildClassifier, which would have to be
      // kept as a field and updated with each instance
      if (stats.distinctCount != 2) {
        System.err.println("Couldn't find examples of both classes. No adjustment.");
        ((UpdateableClassifier) m_Classifier).updateClassifier(instance);
      } else {
        // m_DesignatedClass: already initialized via buildClassifier
        // (called if needed during the first update)
        if (m_manualThreshold) {
          ((UpdateableClassifier) m_Classifier).updateClassifier(instance);
          return;
        }
        // If the data contains only one instance of positive data,
        // optimize on training data
        if (stats.nominalCounts[m_DesignatedClass] == 1) {
          System.err.println("Only 1 positive found: optimizing on training data");
          findThreshold(getPredictions(single, EVAL_TRAINING_SET, 0));
        } else {
          int numFolds = Math.min(m_NumXValFolds, stats.nominalCounts[m_DesignatedClass]);
          findThreshold(getPredictions(single, m_EvalMode, numFolds));
          if (m_EvalMode != EVAL_TRAINING_SET) {
            ((UpdateableClassifier) m_Classifier).updateClassifier(instance);
          }
        }
      }
    }
  }
}
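On the open question of keeping the threshold search state across updates, one possible direction is to store the scored predictions seen so far and re-select the threshold after each new instance. The following is only a rough, Weka-free sketch of that idea under stated assumptions: `StreamThresholdSketch`, `update` and `getThreshold` are hypothetical names, the score is assumed to be the base classifier's probability for the designated class taken before the instance is used for training, and F1 is used as the selection measure purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class StreamThresholdSketch {

    // Each entry is {score for the designated class, 1.0 if actually positive else 0.0}.
    private final List<double[]> seen = new ArrayList<>();
    private double threshold = 0.5;

    /** Records one scored instance, then re-selects the threshold with the best F1 so far. */
    public void update(double positiveScore, boolean isPositive) {
        seen.add(new double[] {positiveScore, isPositive ? 1.0 : 0.0});
        double bestF1 = -1.0;
        // Only observed scores are tried as candidate thresholds.
        for (double[] candidate : seen) {
            double f1 = f1At(candidate[0]);
            if (f1 > bestF1) {
                bestF1 = f1;
                threshold = candidate[0];
            }
        }
    }

    /** F1 of the rule "predict positive when score >= t" over everything seen so far. */
    private double f1At(double t) {
        int tp = 0, fp = 0, fn = 0;
        for (double[] p : seen) {
            boolean predictedPositive = p[0] >= t;
            boolean actualPositive = p[1] == 1.0;
            if (predictedPositive && actualPositive) tp++;
            else if (predictedPositive) fp++;
            else if (actualPositive) fn++;
        }
        return (tp == 0) ? 0.0 : 2.0 * tp / (2.0 * tp + fp + fn);
    }

    public double getThreshold() {
        return threshold;
    }

    public static void main(String[] args) {
        StreamThresholdSketch selector = new StreamThresholdSketch();
        selector.update(0.9, true);
        selector.update(0.2, false);
        selector.update(0.7, true);
        selector.update(0.4, false);
        System.out.println(selector.getThreshold()); // prints 0.7
    }
}
```

This sketch keeps every prediction and rescans all of them on each update, so the cost grows quadratically with the stream length; a real stream setting would need a bounded window or incremental counts instead.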