使用Python将AST形式的Coq术语转换为波兰符号
假设我有一个任意的Coq术语(使用s表达式/sexp的AST格式),例如:使用Python将AST形式的Coq术语转换为波兰符号,python,tree,abstract-syntax-tree,coq,Python,Tree,Abstract Syntax Tree,Coq,假设我有一个任意的Coq术语(使用s表达式/sexp的AST格式),例如: n=n+n 我想自动将其转换为: =n+n n 通过遍历AST树(由于sexp,它是一个简单的嵌套列表)。Python中是否有一个标准库可以做到这一点 现在,如果我写下算法/伪代码,我会这样做(假设我可以将sexp转换为实际的树对象): 我认为这很接近,但我认为某个地方有一个小虫子 示例AST: (ObjList ((CoqGoal ((fg_goals (((name 4)
n=n+n
我想自动将其转换为:
=n+n n
通过遍历AST树(由于sexp,它是一个简单的嵌套列表)。Python中是否有一个标准库可以做到这一点
现在,如果我写下算法/伪代码,我会这样做(假设我可以将sexp转换为实际的树对象):
我认为这很接近,但我认为某个地方有一个小虫子
示例AST:
(ObjList
((CoqGoal
((fg_goals
(((name 4)
(ty
(App
(Ind
(((Mutind (MPfile (DirPath ((Id Logic) (Id Init) (Id Coq))))
(DirPath ()) (Id eq))
0)
(Instance ())))
((Ind
(((Mutind (MPfile (DirPath ((Id Datatypes) (Id Init) (Id Coq))))
(DirPath ()) (Id nat))
0)
(Instance ())))
(App
(Const
((Constant (MPfile (DirPath ((Id Nat) (Id Init) (Id Coq))))
(DirPath ()) (Id add))
(Instance ())))
((Construct
((((Mutind
(MPfile (DirPath ((Id Datatypes) (Id Init) (Id Coq))))
(DirPath ()) (Id nat))
0)
1)
(Instance ())))
(Var (Id n))))
(Var (Id n)))))
(hyp
((((Id n)) ()
(Ind
(((Mutind (MPfile (DirPath ((Id Datatypes) (Id Init) (Id Coq))))
(DirPath ()) (Id nat))
0)
(Instance ())))))))))
(bg_goals ()) (shelved_goals ()) (given_up_goals ())))))
以上只是:
(ObjList
((CoqString "\
\n n : nat\
\n============================\
\n0 + n = n"))))
或
使用sexp解析器:
[Symbol('Answer'), 2, [Symbol('ObjList'), [[Symbol('CoqGoal'), [[Symbol('fg_goals'), [[[Symbol('name'), 4], [Symbol('ty'), [Symbol('App'), [Symbol('Ind'), [[[Symbol('Mutind'), [Symbol('MPfile'), [Symbol('DirPath'), [[Symbol('Id'), Symbol('Logic')], [Symbol('Id'), Symbol('Init')], [Symbol('Id'), Symbol('Coq')]]]], [Symbol('DirPath'), []], [Symbol('Id'), Symbol('eq')]], 0], [Symbol('Instance'), []]]], [[Symbol('Ind'), [[[Symbol('Mutind'), [Symbol('MPfile'), [Symbol('DirPath'), [[Symbol('Id'), Symbol('Datatypes')], [Symbol('Id'), Symbol('Init')], [Symbol('Id'), Symbol('Coq')]]]], [Symbol('DirPath'), []], [Symbol('Id'), Symbol('nat')]], 0], [Symbol('Instance'), []]]], [Symbol('App'), [Symbol('Const'), [[Symbol('Constant'), [Symbol('MPfile'), [Symbol('DirPath'), [[Symbol('Id'), Symbol('Nat')], [Symbol('Id'), Symbol('Init')], [Symbol('Id'), Symbol('Coq')]]]], [Symbol('DirPath'), []], [Symbol('Id'), Symbol('add')]], [Symbol('Instance'), []]]], [[Symbol('Construct'), [[[[Symbol('Mutind'), [Symbol('MPfile'), [Symbol('DirPath'), [[Symbol('Id'), Symbol('Datatypes')], [Symbol('Id'), Symbol('Init')], [Symbol('Id'), Symbol('Coq')]]]], [Symbol('DirPath'), []], [Symbol('Id'), Symbol('nat')]], 0], 1], [Symbol('Instance'), []]]], [Symbol('Var'), [Symbol('Id'), Symbol('n')]]]], [Symbol('Var'), [Symbol('Id'), Symbol('n')]]]]], [Symbol('hyp'), [[[[Symbol('Id'), Symbol('n')]], [], [Symbol('Ind'), [[[Symbol('Mutind'), [Symbol('MPfile'), [Symbol('DirPath'), [[Symbol('Id'), Symbol('Datatypes')], [Symbol('Id'), Symbol('Init')], [Symbol('Id'), Symbol('Coq')]]]], [Symbol('DirPath'), []], [Symbol('Id'), Symbol('nat')]], 0], [Symbol('Instance'), []]]]]]]]]], [Symbol('bg_goals'), []], [Symbol('shelved_goals'), []], [Symbol('given_up_goals'), []]]]]]]
恐怕我不知道有哪个模块可以做你想做的,但是如果我已经正确地理解了你想做的事情,正如你所确定的,你只需要遍历树。您的递归实现只是稍微有点偏差。这个怎么样
class Node:
def __init__(self, text, children=None):
self.text = text
self.children = children or []
def is_atomic(self):
return not self.children
def to_polish(root):
''' postfix tree traversal '''
return root.text + ' ' + ''.join(to_polish(n) for n in root.children)
print(to_polish(
Node('=', [
Node('a', [
Node('+', [
Node('b'),
Node('c')
])
])
])
))
我已经看过你的大单子了 在顶部/底部,我们有:
`['Answer', 2, nested_list]`
TEST_DATA1 = ["+", 2, ["+", 1, 1, 1] ]
# you can put **ALL** of the inputs to update_args
# inside of the constructor call. However, it is
# then very difficult to read.
loligag = Lol2Str(open_delim="(", close_delim=")")
loligag.update_args(comma_delim =",")
loligag.update_args(is_operaND= lambda x: isinstance(x, int))
loligag.update_args(is_operaTOR = lambda x: x == "+")
loligag(TEST_DATA1, file = sys.stdout)
好的,看起来'Answer'
是接线员'Answer'
有两个参数,2
和nested\u list
因此,在英语中,它表示“嵌套列表的答案是2
”
如果我们再深入一点呢
['Answer', 2, ['ObjList', 'nested_list']]
现在我们有:
Answer(2, ObjList(nested_list))
更深入地说,我们看到
['Answer',
2,
['ObjList',
[
['CoqGoal',
['list', 'list', 'list', 'list']
]
]
]
]
在我看来,您的列表已按预定顺序排列。如果没有,请务必给我留下一些评论。也许剩下的就是把它转换成字符串?再说一次,我不确定你想要什么
在中缀表示法中,运算符(+
,*
等)也是分隔符
然而,在前缀表示法(又名“波兰语表示法”)中,更需要分隔符
输出=n+n
非常不明确。这与非常清楚的,=(n,+(n,n))
相比较
我编写了一个程序,它使用分隔符(
,)
,和,
。但是,您可以通过设置以下内容来消除这些问题:
open_delim = ""
close_delim = ""
comma_delim = ""
在小型试运行中,我们有:
`['Answer', 2, nested_list]`
TEST_DATA1 = ["+", 2, ["+", 1, 1, 1] ]
# you can put **ALL** of the inputs to update_args
# inside of the constructor call. However, it is
# then very difficult to read.
loligag = Lol2Str(open_delim="(", close_delim=")")
loligag.update_args(comma_delim =",")
loligag.update_args(is_operaND= lambda x: isinstance(x, int))
loligag.update_args(is_operaTOR = lambda x: x == "+")
loligag(TEST_DATA1, file = sys.stdout)
输出为:
+(2+(1,1,1))
可以将分隔符设置为空格字符。这更像是原始示例中的=n+nn
,而不是=(n,+(n,n))
输出:
+ 2+ 1 1 1
所有列表元素要么是符号,如Symbol('ty')
,要么是整数,如4
。如果我要完全回答你的问题,那么我需要知道像Symbol('ty')
这样的东西哪些是运算符,哪些是操作数
如果我在你的数据上运行代码
loligag = Lol2Str(open_delim="(", close_delim=")")
loligag.update_args(comma_delim =",")
loligag.update_args(is_operaND= lambda x: isinstance(x, type(99)))
loligag.update_args(is_operaTOR = lambda x: isinstance(x, Symbol))
r = loligag(L, file = sys.stdout)
print("\nreturned: ", r)
输出是
Answer(2ObjList((CoqGoal(fg_goals(name(4)ty(App(Ind((Mutind(MPfile(DirPath((IdLogicId(Init)Id(Coq))))DirPath([])Id(eq)),0Instance([])))(Ind(Mutind(MPfile(DirPath((IdDatatypesId(Init)Id(Coq))))DirPath([])Id(nat)),0Instance([]))App(Const((ConstantMPfile(DirPath((IdNatId(Init)Id(Coq))))DirPath([])Id(add)Instance([])))(Construct((MutindMPfile(DirPath((IdDatatypesId(Init)Id(Coq))))DirPath([])Id(nat),0),1Instance([]))Var(Id(n))))Var(Id(n)))))hyp(((Idn)[]Ind((Mutind(MPfile(DirPath((IdDatatypesId(Init)Id(Coq))))DirPath([])Id(nat)),0Instance([]))))))bg_goals([])shelved_goals([])given_up_goals([])))))
。。。这显然是错误的
你有很多空名单。显然,[符号(“+”)1、2、3]
是有意义的。[Symbol(“+”)1、[Symbol(“+”)1,1],3]
也是如此。然而,[Symbol(“+”),1,[],3]
应该是什么意思?[]
是否应该是操作数
代码如下所示。随便玩玩吧。您可以忽略FileDescriptor
类。相反,我鼓励您在上执行edit>find
呼叫
尝试写入运算符
写入操作数
您想专门使用Python有什么原因吗?我这样问是因为在OCaml中这样做要容易得多,因为您可以访问Coq源代码中的AST操作函数。@ArthurAzevedoDeAmorim我可以很容易地从SerAPI()获得s表达式,并将它们作为列表列表。我不懂ocaml,我也懂python。如果你对使用lisp语法的python感兴趣,你应该看看[Hy][.他们搞乱了python AST模块,为python创建了一个新的lisp语法,这可能会引导您走向正确的方向。您的伪代码看起来几乎像python,但python中的赋值运算符由=
表示,而不是:=
在您的伪代码中,您编写了“修复后树遍历”。但是,波兰语表示法是前缀表示法。波兰语表示法和反向波兰语表示法不一样。您想要哪一种?作为参考,前缀表示法是+(n,n)
,在fix表示法中是(n+n)
,而后缀表示法是(n,n)+
。前缀表示法是波兰语表示法。后缀表示法是反向波兰语表示法。
Answer(2ObjList((CoqGoal(fg_goals(name(4)ty(App(Ind((Mutind(MPfile(DirPath((IdLogicId(Init)Id(Coq))))DirPath([])Id(eq)),0Instance([])))(Ind(Mutind(MPfile(DirPath((IdDatatypesId(Init)Id(Coq))))DirPath([])Id(nat)),0Instance([]))App(Const((ConstantMPfile(DirPath((IdNatId(Init)Id(Coq))))DirPath([])Id(add)Instance([])))(Construct((MutindMPfile(DirPath((IdDatatypesId(Init)Id(Coq))))DirPath([])Id(nat),0),1Instance([]))Var(Id(n))))Var(Id(n)))))hyp(((Idn)[]Ind((Mutind(MPfile(DirPath((IdDatatypesId(Init)Id(Coq))))DirPath([])Id(nat)),0Instance([]))))))bg_goals([])shelved_goals([])given_up_goals([])))))
class Symbol:
def __init__(i, x):
if isinstance(x, type(i)):
raise NotImplementedError()
elif isinstance(x, str):
i._stryng = x.strip()
else:
raise NotImplementedError()
def __str__(i):
return i._stryng
def __repr__(i):
return repr(i._stryng)
import io
import itertools
import sys
import random
import string
import inspect
class FileDescriptor:
"""
There is one descriptor instance
for the one Lol2Str class.
However, there are many instances
of the Lol2Str class.
when a new `Lol2Str` comes into existence
i._master_table[id(instance)] is created
Lol2Str has a custom `__del__` method
this deletes id(instance) from the master
table.
`i._master_table[id(instance)]`
is `True` if the Lol2Str instance
owns the file and is responsible
for closing it. The value is `False`
if something else closes the file.
"""
def __init__(i):
i._master_table = dict()
@classmethod
def xkey2ikey(i, xkey):
ikey = xkey
if not isinstance(xkey, type(id(i))):
ikey = id(xkey)
return ikey
def close(i, xkey):
# master_table(True, file)
# ...... there is a file, we close it.
#
# master_table(False, file)
# ..... there is a file, we do **NOT** close it
#
# master_table(False, None)
# ..... there is no file
#
# master_table(True, None)
# ...... error. we close non-existant file.
ikey = type(i).xkey2ikey(xkey)
file = i._master_table[ikey][1]
we_own = i._master_table[ikey][1]
if we_own:
file.close()
i._master_table[ikey][0] = False
i._master_table[ikey][1] = None
else:
msg = ""
raise i.FileDescriptorException(msg)
def _update_dead_kin(i, xkey):
ikey = i.xkey2ikey(xkey)
we_own_file = i._master_table[ikey][0]
if we_own_file:
file = i._master_table[ikey][0][1]
file.close()
del i._master_table[ikey]
return
@classmethod
def verify_filestream(i, new_file):
if not (hasattr(new_file, "write")):
with io.StringIO() as string_stream:
print(
object.__repr__(new_file)
,
"has no `write` method"
,
file=string_stream
)
msg = string_stream.getvalue()
raise type(i).Lol2StrException(msg)
if not callable(new_file.write):
with io.StringIO() as string_stream:
print(
object.__repr__(new_file)
,
"has no `write` method"
,
file=string_stream
)
msg = string_stream.getvalue()
raise type(i).Lol2StrException(msg)
def __get__(i, xkey, xkey_klass):
if xkey:
ikey = i.xkey2ikey(xkey)
file_is_present = i._master_table[ikey][1]
if not file_is_present:
i.set_file_to_new_string_stream(ikey)
return i.Shell(i, ikey)
else:
# called as class method instead of
# as instance method
return i
class Shell:
def __init__(i, descriptor, ikey):
i._descriptor = descriptor
i._ikey = ikey
def write(i, material):
return i._descriptor.write(i._ikey, material)
def close(i):
return i._descriptor.close(i._ikey)
def getvalue(i):
return i._descriptor.getvalue(i._ikey)
def getvalue(i, xkey):
ikey = i.xkey2ikey(xkey)
file_is_str_stream = i._master_table[ikey][0]
if file_is_str_stream:
file = i._master_table[ikey][1]
r = file.getvalue()
else:
with io.StringIO() as string_stream:
print(
"`getvalue` is only allowed if",
"user did not input their own file",
file=string_stream
)
msg = string_stream.getvalue()
raise i.FileDescriptorException(msg)
return r
def write(i, xkey, material):
ikey = i.xkey2ikey(xkey)
file = i._master_table[ikey][1]
file.write(material)
def set_file_to_new_string_stream(i, ikey):
ikey = i.xkey2ikey(loligag)
file = i._master_table[ikey][1]
# if there is a file
if file:
msg = ''.join((
"\n\nCannot create string stream when target file",
" is already present.",
"\nTarget file already set to be: ",
object.__repr__(file)
))
raise i.FileDescriptorException(msg)
# end if
file = io.StringIO()
i._master_table[ikey][0] = True
i._master_table[ikey][1] = file
return
def __set__(i, xkey, new_file):
"""
`new_file = None` is allowed
That means a different function
was called with default arguments.
The default argument for file is `None`
We cannot simply set file to a new string
stream as a default argument...
func(lol, file=io.StringIO())
... then we don't know if owner of the
file stream is the caller or callee.
"""
ikey = i.xkey2ikey(xkey)
if new_file:
# new_file is not `None`
type(i).verify_filestream(new_file)
we_own_file = i._master_table[ikey][0]
if we_own_file:
old_file = i._master_table[ikey][1]
stryng = old_file.getvalue()
new_file.write(stryng)
old_file.close()
# end-if
i._master_table[ikey][1] = new_file
i._master_table[ikey][0] = False
else:
# `new file` is `None` or `False` or `0`
# or some other such non-sense
i.set_file_to_new_string_stream(xkey)
return
class FileDescriptorException(Exception):
pass
def register(i, xkey):
ikey = i.xkey2ikey(xkey)
i._master_table[ikey] = [False, None]
return
# END FileDescriptorClass
class Lol2Str:
"""
`lol`..............`list of lists`
See doc string for `__call__` method
"""
ALLOWED_ARGS = frozenset((
"open_delim",
"close_delim",
"comma_delim",
"is_operaND",
"is_operaTOR",
"file"
))
file = FileDescriptor()
def __init__(i, **kwargs):
"""
"""
type(i).file.register(i)
i.update_args(**kwargs)
# looks weird without a return statement,
# but __init__ not allowed to return
# END OF __init__
class Lol2StrException(Exception):
pass
def update_args(i, **kwargs):
for kw in kwargs.keys():
if kw in type(i).ALLOWED_ARGS:
kwarg = kwargs[kw]
if not isinstance(kwarg, type(None)):
# if not default argument:
setattr(i, kw, kwargs[kw])
# else:
# leave arg as it's previous value
else:
with io.StringIO() as ss:
print(
kw, "is not an allowed argument.",
file=ss
)
msg = ss.getvalue()
raise type(i).Lol2StrException(msg)
def verify_ready_all(i):
one_name = "something more"
try:
for one_name in type(i).ALLOWED_ARGS:
i.verify_ready_one(one_name)
finally:
pass
return
def verify_ready_one(i, one_name):
try:
one = getattr(i, one_name)
except AttributeError:
with io.StringIO() as ss:
print(
"You must set", repr(one_name), "before executing",
file=ss
)
msg = ss.getvalue()
raise type(i).Lol2StrException(msg)
except RecursionError:
msg = "Infinite loop on " + one_name
raise RecursionError(msg)
if isinstance(one, type(None)):
with io.StringIO() as string_stream:
print(
object.__repr__(new_file), "is",
"is missing input argument", one_name,
file=string_stream
)
msg = string_stream.getvalue()
raise type(i).Lol2StrException(msg)
return
def send_out(i, *args, sep=" ", end=""):
return print(*args, end=end,file= i.file)
def attempt_write_operaTOR(i, lol):
try:
elem = next(lol)
elem_status = (
True if i.is_operaTOR(elem) else False,
True if i.is_operaND(elem) else False
)
if elem_status == (True, True):
with io.StringIO() as ss:
print(
"A parsed item was determined to be both",
"an operAND and an operATOR. That is not allowed.",
file=ss
)
s = ss.getvalue()
raise type(i).Lol2StrException(s)
elif elem_status == (True, False):
# Inside of this if-statement
# `elem` is an operator, such as the plus sign in `2 + 3`
i.send_out(elem, end="")
elif elem_status == (False, True):
# `elem` is an argument/operand
# such as the `2` in `2 + 3`
lol = itertools.chain((elem,), lol)
else:
# (operATOR, operAND) = (False, False)
# Not operator and not operand either.
if hasattr(elem, "__iter__"):
###################################################
# Strings are a problem in python
# iter('a') == iter(iter(iter(iter(iter('a')))))
# 'a'[0][0][0][0][0][0] = 'a'[0]
##################################################
#
# redundantly nested args, such as `1 + (((2 + 3)))`
lol = itertools.chain(elem, lol)
else:
# Not operator and not operand either.
# Not redundantly nested args
with io.StringIO() as ss:
print(
type(i).__name__,
"found something which is not an operator, operand,",
"or anything else it knows how to handle.",
"\nThe object was ", object.__repr__(elem),
file=ss
)
s = ss.getvalue()
raise type(i).Lol2StrException(s)
# end if-else block
# end if-else block
finally:
pass
return lol
def write_operaNDs(i, lol):
try:
comma_state = False
i.send_out(i.open_delim)
for elem in lol:
if i.is_operaND(elem):
pass # You're golden. move on to the next step
else:
# element contains nested elements
try:
x = i(elem)
if isinstance(x, type(None)):
elem = ""
comma_state = False
else:
elem = x
except StopIteration:
# element was an EMPTY LIST!!
# WHY AN EMPTY LIST?!? Who only knows...
elem = ""
comma_state = False
# end if-else
if comma_state:
i.send_out(i.comma_delim)
i.send_out(elem)
if not comma_state:
comma_state = True
# end for-loop
i.send_out(i.close_delim)
finally:
pass
return lol
def __call__(i, lol, **kwargs):
"""
NOTES ON NOMENCLATURE:
`delim` ..........`delimiter`
`lol`.............`list of lists`
INPUT ARGUMENTS:
`file` ............ `file` has some sort of `write` method
Any outside file is NOT closed for you.
You must close it youri.
loligag = Lol2Str()
with open(p , mode = "w") as outf:
loligag(lol, file = outf)
loligag(lol, file = outf)
loligag(lol, file = outf)
# `with` statement closes the file for you
Example1: # send output to the console
import sys
loligag(lol, file=sys.stdout)
Example 2: # also send output to the console
file = type("_", tuple(), {"write": lambda i, x: print(x)})()
loligag(lol, file=sys.stdout)
Example 3:
with open("myfile.dat"), mode="w") as outf:
lol_to_prefix_str(lol, file=outf)
If `file = None` then `instance.__call__`
returns a string
`is_operaTOR`... the plus sign in `2 + 3` is an example of an operator
`is_operaTOR` is a callable receiving one argument.
`is_operaTOR(x)` is to return True if and only if `x`
is an operator. `is_operaTOR(x)` is to return False otherwise.
###################################################
# Strings are a problem in python
# iter('a') == iter(iter(iter(iter(iter('a')))))
# 'a'[0][0][0][0][0][0] = 'a'[0]
#
# do **NOT** set is_operaTOR = lambda x: hasattr(x, "__iter__")
####################################################
`lol` ............. list of lists
`open_delim` ...... opening delimeter, such as left parenthesis "("
TODO: documentation for
close_delim
comma_delim
is_operaND
"""
return_val = None
try:
i.update_args(**kwargs)
############################################
i.verify_ready_all() #
############################################
# `open_delim` is not `None`
# `is_operaTOR` is not `None`
# `close_delim` is not `None`
# `comma_delim` is not `None`
# `is_operaND` is not `None`
# `file` is not `None`
############################################
if hasattr(lol, "__iter__"):
try:
lol = type(i).sanitize_lol(lol)
done = False
except type(i).EmptyLol as exc:
# `lol` is empty
# assume empty list, empty tuple
# empty whatever it is, is an operand
i.send_out(lol)
done = True
if not done:
# At this point, lol contains more than one item
lol = i.attempt_write_operaTOR(lol)
lol = i.write_operaNDs(lol)
else:
# ONLY ONE ITEM
# must be an operand?
i.send_out(lol)
# if user originally failed to input a file:
# get string from string stream `i.file`
# close the string stream.
# return the string
we_own_the_file = type(i).file._master_table[id(i)][0]
if we_own_the_file:
return_val = i.file.getvalue()
i.file.close()
finally:
pass
return return_val
@classmethod
def sanitize_lol(cls, lol):
try:
lol = iter(lol)
try:
elem = next(lol)
lol = itertools.chain([elem], lol)
except StopIteration:
with io.StringIO() as err_stream:
print(
"Unable to convert empty container",
"into prefix notation",
file=err_stream
)
msg = err_stream.getvalue()
raise cls.EmptyLol(msg)
finally:
pass
finally:
pass
return lol
class EmptyLol(Exception):
pass
def __del__(i, *args, **kwargs):
type(i).file._update_dead_kin(i)
super_class = inspect.getmro(type(i))[1]
if super_class != object:
super_class.__del__(i, *args, **kwargs)