Exporting a Python scikit-learn model to PMML

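I want to export Python scikit-learn models to PMML.

Which Python package is best suited for this?

I have read up on it, but could not find any examples that use scikit-learn models.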

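SkLearn2PMML is a thin wrapper around the JPMML-SkLearn command-line application. For a list of supported scikit-learn estimator and transformer types, see the documentation of the JPMML-SkLearn project.

As @user1808924 points out, it supports Python 2.7 or 3.4+. It also requires Java 1.7+.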
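
Since the conversion depends on a Java runtime, it can help to check which Java version is on the PATH before installing anything. The following is a minimal sketch using only the Python standard library; the helper names `parse_java_major` and `installed_java_major` are made up for this illustration and are not part of sklearn2pmml.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major, minor = match.group(1), match.group(2)
    # Legacy scheme: "1.7.0_80" means Java 7. Newer scheme: "11.0.2" means Java 11.
    if major == "1" and minor is not None:
        return int(minor)
    return int(major)


def installed_java_major():
    """Return the major version of the `java` found on PATH, or None if absent."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # `java -version` conventionally writes its banner to stderr.
    return parse_java_major(result.stderr or result.stdout)
```

Calling `installed_java_major()` returns an integer such as 7 or 11 when Java is found, and `None` otherwise; anything below 7 would not meet the requirement stated above.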

Installed via: (requires)

An example of how to export a classifier tree to PMML. First, grow the tree:

# example tree & viz from http://scikit-learn.org/stable/modules/tree.html
from sklearn import datasets, tree
iris = datasets.load_iris()
clf = tree.DecisionTreeClassifier() 
clf = clf.fit(iris.data, iris.target)
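
A SkLearn2PMML conversion takes two parts: an estimator (our clf) and a mapper (for preprocessing steps such as discretization or PCA). Our mapper is very basic, since we are not doing any transformations.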

from sklearn_pandas import DataFrameMapper
default_mapper = DataFrameMapper([(i, None) for i in iris.feature_names + ['Species']])

from sklearn2pmml import sklearn2pmml
sklearn2pmml(estimator=clf, 
             mapper=default_mapper, 
             pmml="D:/workspace/IrisClassificationTree.pmml")

It is possible (though not documented) to pass mapper=None, but you will see that the predictor names get lost (it returns x1 rather than sepal length, and so on).
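
The comments on this answer note that sklearn2pmml has since moved to PMMLPipeline. Below is a rough sketch of what the same export might look like with that newer interface. It assumes a sklearn2pmml release that ships `sklearn2pmml.pipeline.PMMLPipeline`, plus pandas and a Java runtime; the function name `export_iris_tree` and the default output path are placeholders invented for this example. Fitting on a DataFrame and a named Series should let the pipeline pick up the column names, so the predictor names are expected to survive without a mapper.

```python
def export_iris_tree(pmml_path="IrisClassificationTree.pmml"):
    """Fit a decision tree on iris and write it to `pmml_path` as PMML."""
    # Imported lazily so that defining this function does not require the
    # optional dependencies (an assumption made for this sketch).
    import pandas as pd
    from sklearn import datasets, tree
    from sklearn2pmml import sklearn2pmml
    from sklearn2pmml.pipeline import PMMLPipeline

    iris = datasets.load_iris()
    # Named columns and a named target instead of bare NumPy arrays.
    X = pd.DataFrame(iris.data, columns=iris.feature_names)
    y = pd.Series(iris.target_names[iris.target], name="Species")

    pipeline = PMMLPipeline([("classifier", tree.DecisionTreeClassifier())])
    pipeline.fit(X, y)

    # Runs the JPMML-SkLearn converter, which needs Java on the PATH.
    sklearn2pmml(pipeline, pmml_path)
    return pipeline
```

Calling `export_iris_tree("some/path.pmml")` would fit the tree and write the file in one step.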

Let's look at the .pmml file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
    <Header>
        <Application name="JPMML-SkLearn" version="1.1.1"/>
        <Timestamp>2016-09-26T19:21:43Z</Timestamp>
    </Header>
    <DataDictionary>
        <DataField name="sepal length (cm)" optype="continuous" dataType="float"/>
        <DataField name="sepal width (cm)" optype="continuous" dataType="float"/>
        <DataField name="petal length (cm)" optype="continuous" dataType="float"/>
        <DataField name="petal width (cm)" optype="continuous" dataType="float"/>
        <DataField name="Species" optype="categorical" dataType="string">
            <Value value="setosa"/>
            <Value value="versicolor"/>
            <Value value="virginica"/>
        </DataField>
    </DataDictionary>
    <TreeModel functionName="classification" splitCharacteristic="binarySplit">
        <MiningSchema>
            <MiningField name="Species" usageType="target"/>
            <MiningField name="sepal length (cm)"/>
            <MiningField name="sepal width (cm)"/>
            <MiningField name="petal length (cm)"/>
            <MiningField name="petal width (cm)"/>
        </MiningSchema>
        <Output>
            <OutputField name="probability_setosa" dataType="double" feature="probability" value="setosa"/>
            <OutputField name="probability_versicolor" dataType="double" feature="probability" value="versicolor"/>
            <OutputField name="probability_virginica" dataType="double" feature="probability" value="virginica"/>
        </Output>
        <Node id="1">
            <True/>
            <Node id="2" score="setosa" recordCount="50.0">
                <SimplePredicate field="petal width (cm)" operator="lessOrEqual" value="0.8"/>
                <ScoreDistribution value="setosa" recordCount="50.0"/>
                <ScoreDistribution value="versicolor" recordCount="0.0"/>
                <ScoreDistribution value="virginica" recordCount="0.0"/>
            </Node>
            <Node id="3">
                <SimplePredicate field="petal width (cm)" operator="greaterThan" value="0.8"/>
                <Node id="4">
                    <SimplePredicate field="petal width (cm)" operator="lessOrEqual" value="1.75"/>
                    <Node id="5">
                        <SimplePredicate field="petal length (cm)" operator="lessOrEqual" value="4.95"/>
                        <Node id="6" score="versicolor" recordCount="47.0">
                            <SimplePredicate field="petal width (cm)" operator="lessOrEqual" value="1.6500001"/>
                            <ScoreDistribution value="setosa" recordCount="0.0"/>
                            <ScoreDistribution value="versicolor" recordCount="47.0"/>
                            <ScoreDistribution value="virginica" recordCount="0.0"/>
                        </Node>
                        <Node id="7" score="virginica" recordCount="1.0">
                            <SimplePredicate field="petal width (cm)" operator="greaterThan" value="1.6500001"/>
                            <ScoreDistribution value="setosa" recordCount="0.0"/>
                            <ScoreDistribution value="versicolor" recordCount="0.0"/>
                            <ScoreDistribution value="virginica" recordCount="1.0"/>
                        </Node>
                    </Node>
                    <Node id="8">
                        <SimplePredicate field="petal length (cm)" operator="greaterThan" value="4.95"/>
                        <Node id="9" score="virginica" recordCount="3.0">
                            <SimplePredicate field="petal width (cm)" operator="lessOrEqual" value="1.55"/>
                            <ScoreDistribution value="setosa" recordCount="0.0"/>
                            <ScoreDistribution value="versicolor" recordCount="0.0"/>
                            <ScoreDistribution value="virginica" recordCount="3.0"/>
                        </Node>
                        <Node id="10">
                            <SimplePredicate field="petal width (cm)" operator="greaterThan" value="1.55"/>
                            <Node id="11" score="versicolor" recordCount="2.0">
                                <SimplePredicate field="sepal length (cm)" operator="lessOrEqual" value="6.95"/>
                                <ScoreDistribution value="setosa" recordCount="0.0"/>
                                <ScoreDistribution value="versicolor" recordCount="2.0"/>
                                <ScoreDistribution value="virginica" recordCount="0.0"/>
                            </Node>
                            <Node id="12" score="virginica" recordCount="1.0">
                                <SimplePredicate field="sepal length (cm)" operator="greaterThan" value="6.95"/>
                                <ScoreDistribution value="setosa" recordCount="0.0"/>
                                <ScoreDistribution value="versicolor" recordCount="0.0"/>
                                <ScoreDistribution value="virginica" recordCount="1.0"/>
                            </Node>
                        </Node>
                    </Node>
                </Node>
                <Node id="13">
                    <SimplePredicate field="petal width (cm)" operator="greaterThan" value="1.75"/>
                    <Node id="14">
                        <SimplePredicate field="petal length (cm)" operator="lessOrEqual" value="4.8500004"/>
                        <Node id="15" score="virginica" recordCount="2.0">
                            <SimplePredicate field="sepal width (cm)" operator="lessOrEqual" value="3.1"/>
                            <ScoreDistribution value="setosa" recordCount="0.0"/>
                            <ScoreDistribution value="versicolor" recordCount="0.0"/>
                            <ScoreDistribution value="virginica" recordCount="2.0"/>
                        </Node>
                        <Node id="16" score="versicolor" recordCount="1.0">
                            <SimplePredicate field="sepal width (cm)" operator="greaterThan" value="3.1"/>
                            <ScoreDistribution value="setosa" recordCount="0.0"/>
                            <ScoreDistribution value="versicolor" recordCount="1.0"/>
                            <ScoreDistribution value="virginica" recordCount="0.0"/>
                        </Node>
                    </Node>
                    <Node id="17" score="virginica" recordCount="43.0">
                        <SimplePredicate field="petal length (cm)" operator="greaterThan" value="4.8500004"/>
                        <ScoreDistribution value="setosa" recordCount="0.0"/>
                        <ScoreDistribution value="versicolor" recordCount="0.0"/>
                        <ScoreDistribution value="virginica" recordCount="43.0"/>
                    </Node>
                </Node>
            </Node>
        </Node>
    </TreeModel>
</PMML>
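
To see how this structure encodes the decision tree, here is a small sketch that reads a PMML tree with the standard library and scores one record by walking the nested Node elements. `PMML_TEXT` is an abridged stand-in for the file above: only the first split is kept, and node 3 is collapsed into a leaf with an assumed score, purely to keep the example short. The helper names `matches` and `score` are made up for this illustration, and only the two operators that appear in the file are handled.

```python
import operator
import xml.etree.ElementTree as ET

NS = {"pmml": "http://www.dmg.org/PMML-4_3"}

# Abridged stand-in for the exported file (node 3 is a leaf here by assumption).
PMML_TEXT = """
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary>
    <DataField name="petal width (cm)" optype="continuous" dataType="float"/>
    <DataField name="Species" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel functionName="classification" splitCharacteristic="binarySplit">
    <Node id="1">
      <True/>
      <Node id="2" score="setosa">
        <SimplePredicate field="petal width (cm)" operator="lessOrEqual" value="0.8"/>
      </Node>
      <Node id="3" score="versicolor">
        <SimplePredicate field="petal width (cm)" operator="greaterThan" value="0.8"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>
"""

# Only the comparison operators used in the file above.
OPERATORS = {"lessOrEqual": operator.le, "greaterThan": operator.gt}


def matches(node, record):
    """Return True if the node's predicate holds for the record."""
    if node.find("pmml:True", NS) is not None:
        return True
    pred = node.find("pmml:SimplePredicate", NS)
    compare = OPERATORS[pred.get("operator")]
    return compare(record[pred.get("field")], float(pred.get("value")))


def score(node, record):
    """Descend into the first matching child until no child matches."""
    for child in node.findall("pmml:Node", NS):
        if matches(child, record):
            return score(child, record)
    return node.get("score")


root = ET.fromstring(PMML_TEXT.strip())
fields = [f.get("name") for f in root.findall("pmml:DataDictionary/pmml:DataField", NS)]
top = root.find("pmml:TreeModel/pmml:Node", NS)
prediction = score(top, {"petal width (cm)": 0.2})
print(fields, prediction)
```

The same walk should work on the full file if it is loaded with ET.parse instead, since every split there uses lessOrEqual or greaterThan.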

Feel free to try Nyoka. It exports scikit-learn models, and then some.

Nyoka is a Python library with support for:
scikit-learn
XGBoost
LightGBM
Keras
Statsmodels

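Besides about 500 Python classes, each of which covers a PMML tag and all constructor parameters/attributes as defined in the standard, Nyoka also provides a growing number of convenience classes and functions that make the data scientist's life easier, for example by reading or writing any PMML file in one line of code from within your favorite Python environment.

It can be installed from PyPI with: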

pip install nyoka
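
Example code

Example 1

import pandas as pd
from sklearn import datasets
from sklearn.pipeline import Pipeline
# Note: Imputer was removed in scikit-learn 0.22; newer releases provide
# sklearn.impute.SimpleImputer instead.
from sklearn.preprocessing import StandardScaler, Imputer
from sklearn_pandas import DataFrameMapper
from sklearn.ensemble import RandomForestClassifier

iris = datasets.load_iris()
irisd = pd.DataFrame(iris.data, columns=iris.feature_names)
irisd['Species'] = iris.target

features = irisd.columns.drop('Species')
target = 'Species'

# preprocessing (scaling and imputation) followed by a random forest
pipeline_obj = Pipeline([
    ("mapping", DataFrameMapper([
        (['sepal length (cm)', 'sepal width (cm)'], StandardScaler()),
        (['petal length (cm)', 'petal width (cm)'], Imputer())
    ])),
    ("rfc", RandomForestClassifier(n_estimators=100))
])

pipeline_obj.fit(irisd[features], irisd[target])

from nyoka import skl_to_pmml

skl_to_pmml(pipeline_obj, features, target, "rf_pmml.pmml")

Example 2

from keras import applications
from keras.layers import Flatten, Dense
from keras.models import Model

model = applications.MobileNet(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# add a small two-class head on top of the pretrained MobileNet
activType = 'sigmoid'
x = model.output
x = Flatten()(x)
x = Dense(1024, activation="relu")(x)
predictions = Dense(2, activation=activType)(x)
model_final = Model(inputs=model.input, outputs=predictions, name='predictions')

from nyoka import KerasToPmml

cnn_pmml = KerasToPmml(model_final, dataSet='image', predictedClasses=['cats', 'dogs'])

with open('2classMBNet.pmml', "w") as pmml_file:
    cnn_pmml.export(pmml_file, 0)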

More examples can be found in the Nyoka project.

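Is there a way to keep the predictor names when no mapper is used? I really need them on the evaluator side, but building a mapper just for that is overkill.

@K Without a mapper, I can't think of a way to preserve the predictor names. You could try posting that as a question.

The answer seems to be outdated: sklearn2pmml now uses PMMLPipeline.

You can use the JPMML-SkLearn package to convert scikit-learn models and transformers to PMML.

The jpmml-sklearn package supports Python 3.4. Is there an alternative that supports Python 2.7?

JPMML-SkLearn supports Python 2.7 as well, but this is not advertised at the moment.
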
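# visualize the fitted tree clf from the example above
# (sklearn.externals.six is gone from newer scikit-learn releases; on Python 3, io.StringIO works here)
from io import StringIO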
import pydotplus # this might be pydot for python 2.7
dot_data = StringIO() 
tree.export_graphviz(clf, 
                     out_file=dot_data,  
                     feature_names=iris.feature_names,  
                     class_names=iris.target_names,  
                     filled=True, rounded=True,  
                     special_characters=True) 
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.write_pdf("D:/workspace/iris.pdf") 
# for in-line display, you can also do:
# from IPython.display import Image  
# Image(graph.create_png())  