Machine learning: Random Cut Forest model evaluation strategy (confusion matrix, accuracy, and precision_recall_fscore)

I am using the AWS SageMaker Random Cut Forest algorithm to detect anomalies.

import boto3
import sagemaker

containers = {
    'us-west-2': '174872318107.dkr.ecr.us-west-2.amazonaws.com/randomcutforest:latest',
    'us-east-1': '382416733822.dkr.ecr.us-east-1.amazonaws.com/randomcutforest:latest',
    'us-east-2': '404615174143.dkr.ecr.us-east-2.amazonaws.com/randomcutforest:latest',
    'eu-west-1': '438346466558.dkr.ecr.eu-west-1.amazonaws.com/randomcutforest:latest',
    'ap-southeast-1': '475088953585.dkr.ecr.ap-southeast-1.amazonaws.com/randomcutforest:latest',
}
region_name = boto3.Session().region_name
container = containers[region_name]

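# Note: bucket, prefix, and s3_train_data are not defined in this snippet;
# they are assumed to be set earlier (S3 bucket name, key prefix, training data URI).
session = sagemaker.Session()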

rcf = sagemaker.estimator.Estimator(
    container,
    sagemaker.get_execution_role(),
    output_path='s3://{}/{}/output'.format(bucket, prefix),
    train_instance_count=1,
    train_instance_type='ml.c5.xlarge',
    sagemaker_session=session)

rcf.set_hyperparameters(
    num_samples_per_tree=200,
    num_trees=250,
    feature_dim=1,
    eval_metrics=["accuracy", "precision_recall_fscore"])

s3_train_input = sagemaker.session.s3_input(
    s3_train_data,
    distribution='ShardedByS3Key',
    content_type='application/x-recordio-protobuf')

rcf.fit({'train': s3_train_input})
(Quoted from -->)

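I train the model with the code above, but I have not found a way to evaluate it.
How can I get the accuracy and F-score after deploying the model?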

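To get evaluation metrics, you need to provide an additional channel named "test" during training. The test channel must contain labeled data. This is explained in the official documentation: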

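Amazon SageMaker Random Cut Forest supports the train and test data channels. The optional test channel is used to compute accuracy, precision, recall, and F1-score metrics on labeled data. Train and test data content types can be either application/x-recordio-protobuf or text/csv formats. For the test data, when using text/csv format, the content must be specified as text/csv;label_size=1 where the first column of each row represents the anomaly label: "1" for an anomalous data point and "0" for a normal data point. You can use either File mode or Pipe mode to train RCF models on data that is formatted as recordIO-wrapped-protobuf or as CSV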
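To make the CSV layout described above concrete, here is a minimal sketch that builds labeled test rows with the anomaly label in the first column. The function name labeled_csv_text and the sample values are hypothetical, chosen only for illustration; the question uses feature_dim=1, so each row carries one feature after the label.

```python
import csv
import io


def labeled_csv_text(values, labels):
    """Return header-less CSV text with the label first: label,feature_1,...

    values: list of feature lists (one list per data point)
    labels: list of 0 (normal) or 1 (anomaly), same length as values
    """
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for features, label in zip(values, labels):
        if label not in (0, 1):
            raise ValueError("label must be 0 (normal) or 1 (anomaly)")
        # Label goes in the first column, as text/csv;label_size=1 expects.
        writer.writerow([label, *features])
    return buffer.getvalue()


# Hypothetical sample: two normal points and one anomaly.
text = labeled_csv_text([[10.2], [9.8], [250.0]], [0, 0, 1])
print(text)
```

The resulting text would then be saved to a file and uploaded to S3 so that it can serve as the test channel.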

Also note… the test channel only supports S3DataDistributionType=FullyReplicated
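Putting the two constraints together, the channel settings could be sketched as follows. This is an illustration rather than the official recipe: rcf_channel_settings is a hypothetical helper, s3_test_data is an assumed S3 URI for the labeled CSV, and the commented-out calls reuse the sagemaker.session.s3_input API and the rcf and s3_train_input names from the question.

```python
def rcf_channel_settings(test_content_type="text/csv;label_size=1"):
    """Return per-channel keyword arguments for the RCF input channels."""
    return {
        # Same settings as the train channel in the question.
        "train": {
            "distribution": "ShardedByS3Key",
            "content_type": "application/x-recordio-protobuf",
        },
        # The test channel only supports FullyReplicated.
        "test": {
            "distribution": "FullyReplicated",
            "content_type": test_content_type,
        },
    }


settings = rcf_channel_settings()

# With the SDK version used in the question (requires AWS access, so it is
# left commented out; s3_test_data is an assumed S3 URI for the labeled CSV):
# s3_test_input = sagemaker.session.s3_input(s3_test_data, **settings["test"])
# rcf.fit({"train": s3_train_input, "test": s3_test_input})
```

With a test channel supplied this way, the metrics requested through eval_metrics are computed on the labeled data during the training job, rather than after deployment.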

Thanks,

Julio