Can I use Augustus (Python) to apply a PMML model that contains DefineFunction?
I am using Augustus as a PMML model consumer. I modified an example to include a DefineFunction element, as follows:
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
<Header/>
<DataDictionary>
<DataField name="x" dataType="double" optype="continuous"/>
<DataField name="y" dataType="double" optype="continuous"/>
</DataDictionary>
<TransformationDictionary>
<DefineFunction dataType="float" optype="continuous" name="add">
<ParameterField optype="continuous" name="first"></ParameterField>
<ParameterField optype="continuous" name="second"></ParameterField>
<Apply function="+" invalidValueTreatment="returnInvalid">
<FieldRef field="first"></FieldRef>
<FieldRef field="second"></FieldRef>
</Apply>
</DefineFunction>
<DerivedField name="z" dataType="double" optype="continuous">
<Apply function="add">
<FieldRef field="x"/>
<FieldRef field="y"/>
</Apply>
</DerivedField>
</TransformationDictionary>
</PMML>
I run the model with the following script:
from augustus.strict import modelLoader
# Load model
add_two_numbers_file = 'addTwoNumbers.pmml'
with open(add_two_numbers_file, 'r') as model_file:
    model_str = model_file.read()
model = modelLoader.loadXml(model_str)
# Run model
print model.calc({'x':[1,2,3],'y':[4,5,6]}).look()
However, I get an error:
AttributeError: 'DefineFunction' object has no attribute '_setupCalculate'
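I am using the latest trunk (revision 794) and can run the unmodified example (without DefineFunction) with no problems. Is DefineFunction supported by Augustus?

I was able to work around this by making two changes. After looking through the Augustus source and confirming that _setupCalculate really is not defined anywhere, I monkey-patched it. My script now looks like this: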
# Monkey-patch augustus
import augustus.pmml.DefineFunction
def _setupCalculate(self, dataTable, functionTable, performanceTable):
    # No-op: pass the three tables through unchanged
    return (dataTable, functionTable, performanceTable)
augustus.pmml.DefineFunction.DefineFunction._setupCalculate = _setupCalculate
# Now the actual script
from augustus.strict import modelLoader
# Load model
add_two_numbers_file = 'addTwoNumbers.pmml'
with open(add_two_numbers_file, 'r') as model_file:
    model_str = model_file.read()
model = modelLoader.loadXml(model_str)
# Run model
print model.calc({'x':[1,2,3],'y':[4,5,6]}).look()
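For readers unfamiliar with the technique, here is a minimal sketch of monkey-patching a missing method onto a class, independent of Augustus. The Transformer class and its _setup hook are hypothetical stand-ins for a library class that calls a method it never defines; they are not part of any real API.

```python
# Hypothetical stand-in for a library class that calls an undefined hook.
class Transformer(object):
    def calculate(self, data):
        # _setup is never defined on the class, so this call fails until patched.
        data = self._setup(data)
        return [value * 2 for value in data]


def try_calculate(values):
    """Return the result, or the AttributeError message if the hook is missing."""
    try:
        return Transformer().calculate(values)
    except AttributeError as err:
        return "AttributeError: %s" % err


def _setup(self, data):
    # No-op replacement: hand the input back unchanged.
    return data


before = try_calculate([1, 2, 3])  # hook missing, so this is an error message
Transformer._setup = _setup        # the monkey patch: attach the method to the class
after = try_calculate([1, 2, 3])   # [2, 4, 6]
```

Because the patch is applied to the class rather than to one instance, every instance created afterwards picks it up, which is also why it has to run before the model is loaded.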
I naively assumed that _setupCalculate did not need to do anything important. I then got a different, more cryptic error:
ValueError: assignment destination is read-only
on the line
mask[mask2] = defs.MISSING
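in FieldType.py. After some debugging, I found that this line is executed only during type conversion, and I noticed that I was using both float and double types in my PMML. Removing the unnecessary dataType attribute from DefineFunction gave me the following, which works: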
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
<Header/>
<DataDictionary>
<DataField name="x" dataType="double" optype="continuous"/>
<DataField name="y" dataType="double" optype="continuous"/>
</DataDictionary>
<TransformationDictionary>
<DefineFunction optype="continuous" name="add">
<ParameterField optype="continuous" name="first"></ParameterField>
<ParameterField optype="continuous" name="second"></ParameterField>
<Apply function="+" invalidValueTreatment="returnInvalid">
<FieldRef field="first"></FieldRef>
<FieldRef field="second"></FieldRef>
</Apply>
</DefineFunction>
<DerivedField name="z" dataType="double" optype="continuous">
<Apply function="add">
<FieldRef field="x"/>
<FieldRef field="y"/>
</Apply>
</DerivedField>
</TransformationDictionary>
</PMML>
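A mismatch like the float/double one above is easy to overlook. As a rough sketch, the dataType attributes declared in a PMML document can be listed with the standard library before the model is loaded. The helper name collect_data_types is hypothetical and not part of Augustus, and the sample document is a trimmed-down fragment used only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefix ElementTree puts in front of every PMML 4.1 tag name.
PMML_NS = "{http://www.dmg.org/PMML-4_1}"


def collect_data_types(pmml_text):
    """Return (tag, name, dataType) for every element that declares a dataType."""
    root = ET.fromstring(pmml_text)
    found = []
    for element in root.iter():
        data_type = element.get("dataType")
        if data_type is not None:
            tag = element.tag.replace(PMML_NS, "")
            found.append((tag, element.get("name"), data_type))
    return found


# Trimmed illustration only, not a complete model.
SAMPLE = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <DataDictionary>
    <DataField name="x" dataType="double" optype="continuous"/>
  </DataDictionary>
  <TransformationDictionary>
    <DefineFunction dataType="float" optype="continuous" name="add"/>
  </TransformationDictionary>
</PMML>"""

declared = collect_data_types(SAMPLE)
# More than one distinct type means a conversion may happen somewhere.
distinct = sorted(set(data_type for _, _, data_type in declared))
```

Mixing types is legal PMML, so more than one entry in distinct is only a hint about where to look, not an error in itself.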
The trunk version of Augustus I am using is equivalent to the 0.6-beta3 release. The problems I ran into look like plain bugs, so the tricks used in this answer may become unnecessary in the near future.

jcrudy, you are right: this is a bug. (The API changed, but DefineFunction was not updated.) It is now fixed: with Augustus >= r795 you can run the example as originally intended.

By the way, your PMML comes from an external file, but you read it into a string and then load that string into the PMML DOM. You can skip the intermediate step by passing loadXml the file name directly:
model = modelLoader.loadXml(add_two_numbers_file)
(This can matter for very large PMML files; note also that they can be gzipped.)
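If you still want the PMML text as a string, for example to inspect it before loading, a gzip-aware reader could look like the sketch below. The helper name read_pmml_text is hypothetical and not part of Augustus, and UTF-8 encoding is an assumption about the file.

```python
import gzip


def read_pmml_text(path):
    """Return the file contents as text, decompressing if it is gzipped."""
    # Detect gzip by its two magic bytes rather than trusting the extension.
    with open(path, "rb") as raw:
        magic = raw.read(2)
    opener = gzip.open if magic == b"\x1f\x8b" else open
    with opener(path, "rb") as handle:
        # Assumes the PMML file is UTF-8 encoded.
        return handle.read().decode("utf-8")
```

Checking the magic bytes means the same call works for both addTwoNumbers.pmml and a compressed copy of it, whatever the file is named.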