Python: reading a file to determine rules
I have an Excel file that contains user-defined business rules, as shown below:
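Column_Name | Operator | Column_Value1 | Operand | RuleID | Result
ABC | Equal | 12 | and | 1 | 1
CDE | Equal | 10 | and | 1 | 1
XYZ | Equal | AD |     | 1 | 1.5
ABC | Equal | 11 | and | 2 | 1
CDE | Equal | 10 |     | 2 | 1.2
and so on. (The | symbols are only there for formatting purposes.)
The input file (CSV) looks like this:
ABC,CDE,XYZ
12,10,AD
11,10,AD
The goal here is to derive an output column named Result, which requires looking up the user-defined business rules in the Excel file.
Expected output: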
ABC,CDE,XYZ,Result
12,10,AD,1.5
11,10,AD,1.2
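So far I have tried to generate an if statement and to assign the whole if/elif statement to a function, so that I can pass it to the statement below to apply the rules:
output_df['result'] = input_df.apply(result_func, axis=1)
When I hand-code the rules in the function, it works, as shown below:
def result_func(input_df):
    if (input_df['ABC'] == 12):
        return '1.25'
    elif (input_df['ABC'] == 11):
        return '0.25'
    else:
        return '1'
Is this the right way to handle this scenario? If so, how do I pass the whole dynamically generated if/elif to the function?
Code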
import pandas as pd
import csv

# Load rules table (pipe-delimited; the first row is the header)
rules_table = []
with open('rules.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile, delimiter='|')
    for row in reader:
        rules_table.append([x.strip() for x in row.values()])

# Load CSV file into DataFrame
df = pd.read_csv('data.csv', sep=",")

def rules_eval(row, rules):
    """Steps through rules table for appropriate value."""
    def operator_eval(op, col, value):
        if op == 'Equal':
            # Compare as strings so 12 (int from the CSV) matches '12' (rule text)
            return str(row[col]) == str(value)
        else:
            # Currently only Equal supported
            raise ValueError(f"Unsupported Operator Value {op}, only Equal allowed")

    prev_rule = '~'
    for col, op, val, operand, rule, res in rules:
        # loop through rows of rule table
        if prev_rule != rule:
            # rule ID changed so we can follow rule chains again
            ignore_rule = False

        if not ignore_rule:
            if operator_eval(op, col, val):
                if operand != 'and':
                    # Last condition of the chain matched: return its result
                    return res
            else:
                # Rule didn't work for an item in group
                # ignore subsequent rules with this id
                ignore_rule = True
        prev_rule = rule
    return None

df['results'] = df.apply(lambda row: rules_eval(row, rules_table), axis=1)
print(df)
Output
   ABC  CDE XYZ results
0   12   10  AD     1.5
1   11   10  AD     1.2
Explanation
df.apply applies the rules_eval function to each row of the DataFrame.
The output is placed in the 'results' column via
df['results'] = ...
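To make the row-wise behavior concrete, here is a minimal sketch. It assumes a small in-memory DataFrame with the same columns as data.csv, and `describe_row` is a hypothetical helper that is not part of the answer. With axis=1, each call receives one row as a Series indexed by column name.

```python
import pandas as pd

# Hypothetical in-memory stand-in for data.csv
df = pd.DataFrame({"ABC": [12, 11], "CDE": [10, 10], "XYZ": ["AD", "AD"]})

def describe_row(row):
    # row is a pandas Series; values are looked up by column name
    return f"{row['ABC']}-{row['CDE']}-{row['XYZ']}"

# axis=1 calls describe_row once per row and collects the return values
df["described"] = df.apply(describe_row, axis=1)
print(df["described"].tolist())
```

The same mechanism is what lets rules_eval read row[col] for whichever column a rule names.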
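Handling rule priority
Changes
Added a Priority column to the rules table so that rules with the same RuleID are processed in priority order.
The priority order is determined by the ordering of the tuples pushed onto the heap, currently: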
Priority, Column_Name, Operator, Column_Value, Operand, RuleID, Result
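The heap ordering can be seen in isolation with a short sketch. This is not the answer's code: the (priority, name) tuples are made-up placeholders. heapq compares tuples element by element, so the first element (the priority) decides the pop order, and lower numbers come out first.

```python
from heapq import heappush, heappop

queue = []
# (priority, column name) pairs pushed in arbitrary order
for item in [(2, "ABC"), (1, "CDE"), (3, "XYZ")]:
    heappush(queue, item)

order = []
while queue:
    priority, name = heappop(queue)  # smallest priority is popped first
    order.append(name)

print(order)  # ['CDE', 'ABC', 'XYZ']
```

If two entries share a priority, the comparison falls through to the next tuple element, so that element must be comparable as well.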
Code
Rules table
Output
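Thanks for your reply, DarrylG. My problem here is reading the Excel file to work out the rules and turn them into an if statement, and after that, how to pass it to the function. For example, if you look at the Excel business rules in my first post, the dynamically derived if condition would look like this: if ('ABC' == 12 and 'CDE' == 10 and XYZ = 'AD'): return '1.25' elif ('ABC' == 11 and 'CDE' == 10): return '1.2'. If I am able to generate this dynamically by reading the Excel file, how do I pass this statement to the function?

@pythoner -- thanks for the explanation. Updated my answer with a parser, rules_eval, that steps through the rules table to determine the appropriate value.

Thank you very much, you are awesome. I have a couple of questions: 1. If my rules table is xlsx instead of csv, is there an equivalent excel.DictReader? 2. If I introduce a new field, a sub-rule, to identify the order within a rule, is there a way to exit the rule based on the sub-rule instead of checking for "and"?

@pythoner -- 1. Suggest trying it with an Excel workbook (rather than CSV). 2. Sure, but I would need to know more about the desired input and behavior.

@pythoner are you asking how to have a sub-rule column, e.g. sub-rule values 3, 1, 2, 2, 1, alongside a RuleID column with values 1, 1, 1, 2, 2? For RuleID 1 we have the sub-rule sequence 3, 1, 2, and for RuleID 2 we have the sub-rule sequence 2, 1. We want to apply the rules of RuleID 1 in sub-rule order 1, 2, 3. Is that what you mean?

You could take a look at the operator module, then create a mapping between the business rules and Python operators: ops = {'Equal': operator.eq}, and then apply that function.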
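The operator mapping suggested in the last comment could look roughly like the sketch below. Only 'Equal' appears in the rules table; the 'NotEqual' and 'GreaterThan' names, the OPS dictionary, and this standalone operator_eval signature are assumptions made for illustration, not part of the posted answer.

```python
import operator

# Map operator names from the rules table to Python functions.
# Only 'Equal' comes from the rules table; the other two entries
# are hypothetical additions for illustration.
OPS = {
    "Equal": operator.eq,
    "NotEqual": operator.ne,
    "GreaterThan": operator.gt,
}

def operator_eval(op_name, left, right):
    """Apply the named operator, raising ValueError on unknown names."""
    try:
        func = OPS[op_name]
    except KeyError:
        raise ValueError(f"Unsupported operator {op_name!r}") from None
    return func(left, right)

print(operator_eval("Equal", 12, 12))
print(operator_eval("GreaterThan", 11, 12))
```

Note that this sketch compares the values as given, whereas the answer's code converts both sides to strings first; which is appropriate depends on how the rule values are parsed.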
import pandas as pd
import csv
from collections import namedtuple
from heapq import (heappush, heappop)

# Load CSV file into DataFrame
df = pd.read_csv('data.csv', sep=",")

class RulesEngine():
    ###########################################
    # Static members
    ###########################################
    # Named tuple for rules
    fieldnames = 'Column_Name|Operator|Column_Value1|Operand|RuleID|Priority|Result'
    Rule = namedtuple('Rule', fieldnames.replace('|', ' '))
    number_fields = fieldnames.count('|') + 1

    ###########################################
    # members
    ###########################################
    def __init__(self, table_file):
        # Load rules table
        rules_table = []
        with open(table_file, newline='') as csvfile:
            reader = csv.DictReader(csvfile, delimiter='|')
            for row in reader:
                fields = [self.convert(x.strip()) for x in row.values() if x is not None]
                if len(fields) != self.number_fields:
                    # Incorrect number of values
                    error = f"Rules require {self.number_fields} fields per row, was given {len(fields)}"
                    raise ValueError(error)
                # Reuse the validated fields instead of re-reading row.values()
                rules_table.append(fields)
        self.rules_table = rules_table

    def convert(self, s):
        """Convert string to int or float, or leave the current value."""
        try:
            return int(s)
        except ValueError:
            try:
                return float(s)
            except ValueError:
                return s

    def operator_eval(self, row, rule):
        """Determines value for a rule."""
        if rule.Operator == 'Equal':
            return str(row[rule.Column_Name]) == str(rule.Column_Value1)
        else:
            # Currently only Equal supported
            error = f"Unsupported Operator {rule.Operator}, only Equal allowed"
            raise ValueError(error)

    def get_rule_value(self, row, rule_queue):
        """Value of a rule group, or None if any rule fails or the queue is empty."""
        result = None
        while rule_queue:
            priority, rule_to_process = heappop(rule_queue)
            if not self.operator_eval(row, rule_to_process):
                return None
            # Result of the last (lowest-priority) rule that matched
            result = rule_to_process.Result
        return result

    def rules_eval(self, row):
        """Steps through rules table for appropriate value."""
        rule_queue = []
        for r in self.rules_table:
            # Create named tuple with current rule values
            current_rule = self.Rule(*r)
            if rule_queue and rule_queue[0][1].RuleID != current_rule.RuleID:
                # A new rule group starts: process the completed group first
                rule_value = self.get_rule_value(row, rule_queue)
                if rule_value is not None:
                    return rule_value
                # Starting over with new rule group
                rule_queue = []
            # heap orders rules by priority
            # (lowest numbers are processed first)
            heappush(rule_queue, (current_rule.Priority, current_rule))

        # Process final queue (get_rule_value returns None if it is empty)
        return self.get_rule_value(row, rule_queue)

# Init rules engine with rules from CSV file
rules_engine = RulesEngine('rules.csv')
df['results'] = df.apply(rules_engine.rules_eval, axis=1)
print(df)
Data file (data.csv)
ABC,CDE,XYZ
12,10,AD
11,10,AD
12,12,AA
Rules table (rules.csv)
Column_Name|Operator|Column_Value1|Operand|RuleID|Priority|Result
ABC | Equal| 12| and| 1| 2|1
CDE | Equal| 10| and| 1| 1|1
XYZ | Equal| AD| and| 1| 3|1.5
ABC | Equal| 11| and| 2| 1|1
CDE | Equal| 10| foo| 2| 2|1.2
ABC | Equal| 12| foo| 3| 1|1.8
Output
   ABC  CDE XYZ  results
0   12   10  AD      1.5
1   11   10  AD      1.2
2   12   12  AA      1.8