Is there a way to map Weka J48 decision tree output to RDF format?

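I want to create an ontology with Jena, based on the output of a Weka J48 decision tree. But before this output can be fed into Jena, it has to be mapped to RDF format. Is there any way to do this mapping?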

EDIT1:

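Sample part of the J48 decision tree output before the mapping:

Corresponding sample part of the RDF for this decision tree output:

Both screenshots are taken from this research paper (slide 4):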


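There is probably no built-in way to achieve this.

Disclaimer: I have never worked with Jena and RDF before. So this answer may be incomplete, or may miss the point of the intended conversion.

But anyway, first a short rant: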


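The code snippets published in the paper (namely, the output of the Weka classifier and the RDF) are incomplete and obviously inconsistent. The conversion process is not described at all. Instead, they only mention:

The challenge we faced was mainly converting the J48 classification output to RDF and passing it on to Jena

(sic!)

Now, they managed to solve this. They could have provided the conversion code in a public, open-source repository. This would have allowed others to contribute improvements, and would have increased the visibility and verifiability of their approach. But instead, they wasted their own time and the time of their readers with screenshots of various websites that serve as page fillers, in a pitiful attempt to squeeze another publication out of their approach.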


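The following is my best effort at providing some of the building blocks that may be necessary for the conversion. It has to be taken with a grain of salt, because I am not familiar with the underlying methods and libraries. Nevertheless, I hope that it can be considered "helpful".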

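The Weka Classifier implementations usually do not offer structures that expose their internal workings. So it is not possible to access the internal tree structure directly. However, there is a method (prefix() in the code below) that returns a string representation of the tree.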
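To make the shape of that string more concrete, here is a small stand-alone sketch. The sample string is hand-written from the Iris tree shown in the output of this answer, in the bracketed form that the parser in this answer expects; it is an assumption, not output captured from Weka. The class PrefixStringSketch and its helpers findMatchingBracket and childSubtrees are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixStringSketch
{
    // Hand-written sample (assumption): every subtree is enclosed in square
    // brackets, inner nodes start with "attribute: condition, condition"
    static final String SAMPLE =
        "[petalwidth: <= 0.6, > 0.6[Iris-setosa (50.0)]"
        + "[petalwidth: <= 1.7, > 1.7"
        + "[petallength: <= 4.9, > 4.9[Iris-versicolor (48.0/1.0)]"
        + "[petalwidth: <= 1.5, > 1.5[Iris-virginica (3.0)][Iris-versicolor (3.0/1.0)]]]"
        + "[Iris-virginica (46.0/1.0)]]]";

    // Returns the index of the ']' that closes the '[' at openIndex,
    // or -1 if the brackets are not balanced
    static int findMatchingBracket(String s, int openIndex)
    {
        int depth = 0;
        for (int i = openIndex; i < s.length(); i++)
        {
            char c = s.charAt(i);
            if (c == '[')
            {
                depth++;
            }
            else if (c == ']')
            {
                depth--;
                if (depth == 0)
                {
                    return i;
                }
            }
        }
        return -1;
    }

    // Returns the direct child subtrees (including their brackets) of the
    // outermost node of a well-formed prefix string
    static List<String> childSubtrees(String prefix)
    {
        List<String> result = new ArrayList<String>();
        String inner = prefix.substring(1, findMatchingBracket(prefix, 0));
        int open = inner.indexOf('[');
        while (open != -1)
        {
            int close = findMatchingBracket(inner, open);
            result.add(inner.substring(open, close + 1));
            open = inner.indexOf('[', close + 1);
        }
        return result;
    }

    public static void main(String[] args)
    {
        // Prints the two subtrees below the root, one per line
        for (String child : childSubtrees(SAMPLE))
        {
            System.out.println(child);
        }
    }
}
```

For the sample string, the first subtree is the leaf [Iris-setosa (50.0)], and the second one is the whole bracketed subtree for petalwidth > 0.6. Applying the same splitting recursively is the idea behind the parser in this answer.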

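The code below contains a very pragmatic (and therefore somewhat brittle) method that parses this string and builds a tree structure containing the relevant information. This structure consists of TreeNode objects: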

static class TreeNode
{
    String label;
    String attribute;
    String relation;
    String value;
    ...
}
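  • label
    is the class label used by the classifier. It is non-null only for leaf nodes. For the example from the paper, this would be "0" or "1", indicating whether an email is spam or not.

  • attribute
    is the attribute that the decision is based on. For the example from the paper, such an attribute could be word_freq_remove.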

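  • relation
    is a string representing the decision criterion, for example "<=" or ">".

  • value
    is the value that the attribute is compared against in the decision criterion.

Comments:

Could you provide an example of what your expected output RDF should look like?

@Marco13 Please check the edit.

1) You need an ontology, i.e. a schema. 2) Write your own exporter for Weka; obviously, there is nothing built in. Or 3) write a converter from the decision tree string to RDF. Or export/convert to XML or JSON first.

Sorry for the late response. Thank you very much! I really appreciate your effort.

@MohamedELTair You don't have to accept the answer. Maybe someone can provide a more complete solution. Or have you fully solved it by now? I did not have the chance to try out the RDF output in Jena, so I still doubt whether it is a really sensible format...

I am new to RDF. But as far as I can tell, the format of the output RDF looks correct to me. I ran the whole code and it works as expected. Now I need to run queries on the RDF to classify the test data. It should be easy for me to create a query function that runs queries on the RDF; after all, it is just like a tree. Thanks again.

By the way, I have checked several RDF outputs generated by the code using the following link: [link]. All of them are valid, and the triples and graphs were generated successfully.

@MohamedELTair Thanks for the feedback. I went out on a limb by posting this. So it is good to hear that you did find it useful.

The complete program from the answer, including the conversion into a Jena Model: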
    
    import java.io.FileInputStream;
    import java.util.ArrayList;
    import java.util.List;
    
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.rdf.model.Property;
    import org.apache.jena.rdf.model.Resource;
    import org.apache.jena.rdf.model.Statement;
    
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ArffLoader;
    
    public class WekaClassifierToRdf
    {
        public static void main(String[] args) throws Exception
        {
            String fileName = "./data/iris.arff";
            ArffLoader arffLoader = new ArffLoader();
            arffLoader.setSource(new FileInputStream(fileName));
            Instances instances = arffLoader.getDataSet();
            instances.setClassIndex(4);
            //System.out.println(instances);
    
            J48 classifier = new J48();
            classifier.buildClassifier(instances);
    
            System.out.println(classifier);
    
            String prefixTreeString = classifier.prefix();
            TreeNode node = processPrefixTreeString(prefixTreeString);
    
            System.out.println("Tree:");
            System.out.println(node.createString());
    
            Model model = createModel(node);
    
            System.out.println("Model:");
            model.write(System.out, "RDF/XML-ABBREV");
        }
    
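        // Pragmatic recursive parser for the string returned by J48.prefix().
        // It assumes binary splits of the form
        // "[attribute: relation value, relation value[left][right]]"
        // and leaves of the form "[label (counts)]"; other shapes are not handled.
        private static TreeNode processPrefixTreeString(String inputString)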
        {
            String string = inputString.replaceAll("\\n", "");
    
            //System.out.println("Input is " + string);
    
            int open = string.indexOf("[");
            int close = string.lastIndexOf("]");
            String part = string.substring(open + 1, close);
    
            //System.out.println("Part " + part);
    
            int colon = part.indexOf(":");
            if (colon == -1)
            {
                TreeNode node = new TreeNode();
    
                int openAfterLabel = part.lastIndexOf("(");
                String label = part.substring(0, openAfterLabel).trim();
                node.label = label;
                return node;
            }
    
            String attributeName = part.substring(0, colon);
    
            //System.out.println("attributeName " + attributeName);
    
            int comma = part.indexOf(",", colon);
    
            int leftOpen = part.indexOf("[", comma);
    
            String leftCondition = part.substring(colon + 1, comma).trim();
            String rightCondition = part.substring(comma + 1, leftOpen).trim();
    
            int leftSpace = leftCondition.indexOf(" ");
            String leftRelation = leftCondition.substring(0, leftSpace).trim();
            String leftValue = leftCondition.substring(leftSpace + 1).trim();
    
            int rightSpace = rightCondition.indexOf(" ");
            String rightRelation = rightCondition.substring(0, rightSpace).trim();
            String rightValue = rightCondition.substring(rightSpace + 1).trim();
    
            //System.out.println("leftCondition " + leftCondition);
            //System.out.println("rightCondition " + rightCondition);
    
            int leftClose = findClosing(part, leftOpen + 1);
            String left = part.substring(leftOpen, leftClose + 1);
    
            //System.out.println("left " + left);
    
            int rightOpen = part.indexOf("[", leftClose);
            int rightClose = findClosing(part, rightOpen + 1);
            String right = part.substring(rightOpen, rightClose + 1);
    
            //System.out.println("right " + right);
    
            TreeNode leftNode = processPrefixTreeString(left);
            leftNode.relation = leftRelation;
            leftNode.value = leftValue;
    
            TreeNode rightNode = processPrefixTreeString(right);
            rightNode.relation = rightRelation;
            rightNode.value = rightValue;
    
            TreeNode result = new TreeNode();
            result.attribute = attributeName;
            result.children.add(leftNode);
            result.children.add(rightNode);
            return result;
    
        }
    
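        // Returns the index of the ']' that closes the bracket opened just
        // before startIndex, or -1 if there is no such bracket
        private static int findClosing(String string, int startIndex)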
        {
            int stack = 0;
            for (int i=startIndex; i<string.length(); i++)
            {
                char c = string.charAt(i);
                if (c == '[')
                {
                    stack++;
                }
                if (c == ']')
                {
                    if (stack == 0)
                    {
                        return i;
                    }
                    stack--;
                }
            }
            return -1;
        }
    
        static class TreeNode
        {
            String label;
            String attribute;
            String relation;
            String value;
            List<TreeNode> children = new ArrayList<TreeNode>();
    
            String createString()
            {
                StringBuilder sb = new StringBuilder();
                createString("", sb);
                return sb.toString();
            }
    
            private void createString(String indent, StringBuilder sb)
            {
                if (children.isEmpty())
                {
                    sb.append(indent + label);
                }
                sb.append("\n");
                for (TreeNode child : children)
                {
                    sb.append(indent + "if " + attribute + " " + child.relation
                        + " " + child.value + ": ");
                    child.createString(indent + "  ", sb);
                }
            }
    
            @Override
            public String toString()
            {
                return "TreeNode [label=" + label + ", attribute=" + attribute
                    + ", relation=" + relation + ", value=" + value + "]";
            }
        }    
    
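        // Encodes the relation and value of a node in a property name,
        // for example "<=" and "0.6" become "lte_0.6"
        private static String createPropertyString(TreeNode node)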
        {
            if ("<".equals(node.relation))
            {
                return "lt_" + node.value;
            }
            if ("<=".equals(node.relation))
            {
                return "lte_" + node.value;
            }
            if (">".equals(node.relation))
            {
                return "gt_" + node.value;
            }
            if (">=".equals(node.relation))
            {
                return "gte_" + node.value;
            }
            System.err.println("Unknown relation: " + node.relation);
            return "UNKNOWN";
        }    
    
        static Model createModel(TreeNode node)
        {
            Model model = ModelFactory.createDefaultModel();
    
            String baseUri = "http://www.example.com/example#";
            model.createResource(baseUri);
            model.setNsPrefix("base", baseUri);
            populateModel(model, baseUri, node, node.attribute);
            return model;
        }
    
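        // Adds one statement per child: the subject is the resource for the
        // current path, the property encodes the condition, and the object is
        // the class label (leaf) or the child resource (inner node)
        private static void populateModel(Model model, String baseUri,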
            TreeNode node, String resourceName)
        {
            //System.out.println("Populate with " + resourceName);
    
            for (TreeNode child : node.children)
            {
                if (child.label != null)
                {
                    Resource resource =
                        model.createResource(baseUri + resourceName);
                    String propertyString = createPropertyString(child);
                    Property property =
                        model.createProperty(baseUri, propertyString);
                    Statement statement = model.createLiteralStatement(resource,
                        property, child.label);
                    model.add(statement);
                }
                else
                {
                    Resource resource =
                        model.createResource(baseUri + resourceName);
                    String propertyString = createPropertyString(child);
                    Property property =
                        model.createProperty(baseUri, propertyString);
    
                    String nextResourceName = resourceName + "_" + child.attribute;
                    Resource childResource =
                        model.createResource(baseUri + nextResourceName);
                    Statement statement =
                        model.createStatement(resource, property, childResource);
                    model.add(statement);
                }
            }
            for (TreeNode child : node.children)
            {
                String nextResourceName = resourceName + "_" + child.attribute;
                populateModel(model, baseUri, child, nextResourceName);
            }
        }
    
    }
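As a side note to the program: the TreeNode fields are already sufficient to classify an instance without going through RDF, which may help when checking the result of a later RDF query. The following is a minimal sketch under stated assumptions. Node is a reduced stand-in for TreeNode, the Iris tree is built by hand from the tree that the program prints, and the names TreeClassifySketch, node, matches and classify are hypothetical. Only numeric comparisons are handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TreeClassifySketch
{
    // Reduced stand-in for TreeNode: relation and value describe the
    // condition on the parent's attribute that leads to this node
    static class Node
    {
        String label, attribute, relation, value;
        List<Node> children = new ArrayList<Node>();
    }

    static Node node(String relation, String value, String attribute,
        String label, Node... children)
    {
        Node n = new Node();
        n.relation = relation;
        n.value = value;
        n.attribute = attribute;
        n.label = label;
        for (Node c : children) n.children.add(c);
        return n;
    }

    // Checks whether "x relation threshold" holds
    static boolean matches(double x, String relation, double threshold)
    {
        switch (relation)
        {
            case "<":  return x < threshold;
            case "<=": return x <= threshold;
            case ">":  return x > threshold;
            case ">=": return x >= threshold;
            default:
                throw new IllegalArgumentException("Unknown relation: " + relation);
        }
    }

    // Follows the first child whose condition holds until a leaf is reached.
    // The instance must contain a value for every attribute on the path.
    static String classify(Node node, Map<String, Double> instance)
    {
        if (node.label != null) return node.label;
        double x = instance.get(node.attribute);
        for (Node child : node.children)
        {
            if (matches(x, child.relation, Double.parseDouble(child.value)))
            {
                return classify(child, instance);
            }
        }
        throw new IllegalStateException("No branch matches " + node.attribute);
    }

    // The Iris tree printed by the program, written down by hand
    static Node irisTree()
    {
        Node deepest = node(">", "4.9", "petalwidth", null,
            node("<=", "1.5", null, "Iris-virginica"),
            node(">", "1.5", null, "Iris-versicolor"));
        Node byLength = node("<=", "1.7", "petallength", null,
            node("<=", "4.9", null, "Iris-versicolor"), deepest);
        Node byWidth = node(">", "0.6", "petalwidth", null,
            byLength, node(">", "1.7", null, "Iris-virginica"));
        return node(null, null, "petalwidth", null,
            node("<=", "0.6", null, "Iris-setosa"), byWidth);
    }

    public static void main(String[] args)
    {
        Map<String, Double> instance = new HashMap<String, Double>();
        instance.put("petalwidth", 1.6);
        instance.put("petallength", 5.0);
        System.out.println(classify(irisTree(), instance)); // Iris-versicolor
    }
}
```

Nominal attributes, which Weka would print with a different relation, are deliberately left out of this sketch, just as they are not handled by createPropertyString.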
    
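Output of the WekaClassifierToRdf program for the Iris data set (the classifier, the parsed tree, and the RDF model):

    J48 pruned tree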
    ------------------
    
    petalwidth <= 0.6: Iris-setosa (50.0)
    petalwidth > 0.6
    |   petalwidth <= 1.7
    |   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
    |   |   petallength > 4.9
    |   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
    |   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
    |   petalwidth > 1.7: Iris-virginica (46.0/1.0)
    
    Number of Leaves  :     5
    
    Size of the tree :     9
    
    Tree:
    
    if petalwidth <= 0.6:   Iris-setosa
    if petalwidth > 0.6: 
      if petalwidth <= 1.7: 
        if petallength <= 4.9:       Iris-versicolor
        if petallength > 4.9: 
          if petalwidth <= 1.5:         Iris-virginica
          if petalwidth > 1.5:         Iris-versicolor
      if petalwidth > 1.7:     Iris-virginica
    
    Model:
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:base="http://www.example.com/example#">
      <rdf:Description rdf:about="http://www.example.com/example#petalwidth">
        <base:gt_0.6>
          <rdf:Description rdf:about="http://www.example.com/example#petalwidth_petalwidth">
            <base:gt_1.7>Iris-virginica</base:gt_1.7>
            <base:lte_1.7>
              <rdf:Description rdf:about="http://www.example.com/example#petalwidth_petalwidth_petallength">
                <base:gt_4.9>
                  <rdf:Description rdf:about="http://www.example.com/example#petalwidth_petalwidth_petallength_petalwidth">
                    <base:gt_1.5>Iris-versicolor</base:gt_1.5>
                    <base:lte_1.5>Iris-virginica</base:lte_1.5>
                  </rdf:Description>
                </base:gt_4.9>
                <base:lte_4.9>Iris-versicolor</base:lte_4.9>
              </rdf:Description>
            </base:lte_1.7>
          </rdf:Description>
        </base:gt_0.6>
        <base:lte_0.6>Iris-setosa</base:lte_0.6>
      </rdf:Description>
    </rdf:RDF>
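The nested RDF/XML above is a compact way of writing eight statements, one per branch of the tree. The following sketch repeats the naming scheme of createPropertyString and populateModel without Jena and prints those statements in a Turtle-like notation. It is only meant to make the mapping visible: Node is a reduced stand-in for TreeNode, the tree is written down by hand from the output above, and the names TreeToTriplesSketch, node, predicate, collect and toTriples are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TreeToTriplesSketch
{
    // Reduced stand-in for the TreeNode class of the program above
    static class Node
    {
        String label, attribute, relation, value;
        List<Node> children = new ArrayList<Node>();
    }

    static Node node(String relation, String value, String attribute,
        String label, Node... children)
    {
        Node n = new Node();
        n.relation = relation;
        n.value = value;
        n.attribute = attribute;
        n.label = label;
        for (Node c : children) n.children.add(c);
        return n;
    }

    // Same naming scheme as createPropertyString: "<=" and "0.6" give "lte_0.6"
    static String predicate(Node n)
    {
        switch (n.relation)
        {
            case "<":  return "lt_" + n.value;
            case "<=": return "lte_" + n.value;
            case ">":  return "gt_" + n.value;
            case ">=": return "gte_" + n.value;
            default:   return "UNKNOWN";
        }
    }

    // One statement per child: the subject is the path so far, the object is
    // the class label for leaves and the extended path for inner nodes
    static void collect(Node parent, String subject, List<String> out)
    {
        for (Node child : parent.children)
        {
            boolean isLeaf = child.label != null;
            String childName = subject + "_" + child.attribute;
            String object = isLeaf
                ? "\"" + child.label + "\"" : "base:" + childName;
            out.add("base:" + subject + " base:" + predicate(child)
                + " " + object + " .");
            if (!isLeaf) collect(child, childName, out);
        }
    }

    static List<String> toTriples(Node root)
    {
        List<String> out = new ArrayList<String>();
        collect(root, root.attribute, out);
        return out;
    }

    // The Iris tree from the output above, written down by hand
    static Node irisTree()
    {
        Node deepest = node(">", "4.9", "petalwidth", null,
            node("<=", "1.5", null, "Iris-virginica"),
            node(">", "1.5", null, "Iris-versicolor"));
        Node byLength = node("<=", "1.7", "petallength", null,
            node("<=", "4.9", null, "Iris-versicolor"), deepest);
        Node byWidth = node(">", "0.6", "petalwidth", null,
            byLength, node(">", "1.7", null, "Iris-virginica"));
        return node(null, null, "petalwidth", null,
            node("<=", "0.6", null, "Iris-setosa"), byWidth);
    }

    public static void main(String[] args)
    {
        for (String triple : toTriples(irisTree()))
        {
            System.out.println(triple);
        }
    }
}
```

The first printed line is base:petalwidth base:lte_0.6 "Iris-setosa" . and corresponds to the base:lte_0.6 element of the RDF/XML. Whether encoding the threshold in the property name is a sensible modelling choice is a separate question, and one I remain unsure about.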