Python pycparser: how to get the end of a function in a C file


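I am using pycparser to parse C files. I would like to get the start and the end of every function definition in a C file, but all I actually get is the start of each definition: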

memmgr_init at examples/c_files/memmgr.c:46
get_mem_from_pool at examples/c_files/memmgr.c:55

I would like to get something like this:

memmgr_init at examples/c_files/memmgr.c: start :46 end : 52

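You can't do this with pycparser, because it does not record the end position of a function while parsing.

You can regenerate the function body from the AST: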

from pycparser import c_ast, c_generator, parse_file


class FuncDefVisitor(c_ast.NodeVisitor):
    """Collect the regenerated C source of every function definition."""

    def __init__(self, bodies):
        self.bodies = bodies
        self.generator = c_generator.CGenerator()

    def visit_FuncDef(self, node):
        # CGenerator.visit() turns the AST node back into C source text.
        self.bodies.append(self.generator.visit(node))


def show_func_defs(filename):
    # Preprocess with cpp first, using the fake libc headers
    # from the pycparser source tree.
    ast = parse_file(filename, use_cpp=True,
                     cpp_args=r'-Iutils/fake_libc_include')
    bodies = []
    v = FuncDefVisitor(bodies)
    v.visit(ast)
    for body in bodies:
        print(body)

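However, the regenerated text may be formatted slightly differently from the original, so you cannot work out the end line of a function by counting lines from its start.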
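If an approximate end line is good enough, one option is to take the largest line number recorded on any node inside the function's subtree. The sketch below is not part of pycparser: `last_line` is a hypothetical helper, and it assumes only that each node exposes `coord` (an object with a `line` attribute, or `None`) and `children()` returning `(name, child)` pairs, which is how pycparser's `c_ast` nodes behave as far as I know. The result is the line of the last AST node in the body, which normally sits one or more lines above the closing `}`.

```python
def last_line(node):
    """Return the largest coord.line found in node's subtree, or None."""
    best = None
    stack = [node]
    while stack:
        current = stack.pop()
        coord = getattr(current, "coord", None)
        line = getattr(coord, "line", None)
        if line is not None and (best is None or line > best):
            best = line
        # children() yields (name, child) pairs; only the child is needed.
        stack.extend(child for _name, child in current.children())
    return best
```

Inside `visit_FuncDef`, `node.coord.line` would give the start line and `last_line(node.body)` the approximate end.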


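I have a quick and dirty solution to your problem. What you need to do is get the nearest line from the AST. I don't like modifying a library unless I have to. I assume you are familiar with parsing and data manipulation; if not, I can add more details. gcc_or_cpp_output is the preprocessed intermediate code produced by gcc or cpp.

ast = parser.parse(gcc_or_cpp_output, filename)

The AST has a show method with default parameters. For your problem you need to set showcoord to True:

ast.show(buf=fb, attrnames=True, nodenames=True, showcoord=True)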

        buf:
            Open IO buffer into which the Node is printed.

        offset:
            Initial offset (amount of leading spaces)

        attrnames:
            True if you want to see the attribute names in
            name=value pairs. False to only see the values.

        nodenames:
            True if you want to see the actual node names
            within their parents.

        showcoord:
            Do you want the coordinates of each Node to be
            displayed

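Then you need to change buf from its default, sys.stdout, to your own buffer class so that you can capture the AST dump. You could also walk the tree, but I'll save the tree-walking solution for another day. I wrote a simple fake buffer below:

class fake_buffer():
    def __init__(self):
        self.buffer = []
    def write(self, string):
        # show() calls write() once per text fragment; keep them all.
        self.buffer.append(string)
    def get_buffer(self):
        return self.buffer

So all you need to do now is pass the fake buffer to the ast.show() method to capture the AST:

fb = fake_buffer()
ast.show(buf=fb, attrnames=True, nodenames=True, showcoord=True)

At this point you have the AST dump as a list. The function declarations will be near the bottom. Now you just need to parse out all the extra stuff and take the largest coordinate within each function declaration: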

  FuncCall <block_items[12]>:  (at ...blah_path_stuff.../year.c:48)
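The parsing step could look roughly like the sketch below. It assumes the dump was produced with attrnames=True and showcoord=True, that child nodes are indented deeper than their parent, and that each coordinate ends the line as `(at file:line)` or `(at file:line:column)`. `function_spans` and both regular expressions are hypothetical names, not part of pycparser. The end value is the last line that carries an AST node, so the closing `}` usually lies a line or more below it.

```python
import re

# Trailing "(at file:line)" or "(at file:line:column)" coordinate.
COORD_RE = re.compile(r"\(at .*?:(\d+)(?::\d+)?\)\s*$")
# "name=..." attribute printed on the Decl line under each FuncDef.
NAME_RE = re.compile(r"\bname=(\w+)")


def function_spans(dump_text):
    """Map each function name to (start_line, last_node_line)."""
    spans = {}
    active = None  # record for the FuncDef subtree being scanned
    for raw in dump_text.splitlines():
        text = raw.strip()
        if not text:
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        if active is not None and indent <= active["indent"]:
            active = None  # indentation dropped back: subtree is over
        coord = COORD_RE.search(text)
        line = int(coord.group(1)) if coord else None
        if active is None:
            if text.startswith("FuncDef"):
                active = {"indent": indent, "name": None,
                          "start": line, "end": line}
            continue
        if active["name"] is None:
            name = NAME_RE.search(text)
            if name:
                active["name"] = name.group(1)
                spans[active["name"]] = active
        if line is not None and (active["end"] is None or line > active["end"]):
            active["end"] = line
    return {name: (rec["start"], rec["end"]) for name, rec in spans.items()}
```

With the fake buffer above, the input would be `"".join(fb.get_buffer())`, since show() writes the dump in fragments rather than whole lines.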


ABC

Always be coding


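Do you need to know the position of the end, or do you just want to extract the function body?

I only need the line number of the end of the function.

Thank you very much, Martin! Should I write my own parsing mechanism? I am thinking of implementing a stack to detect the matching closing brace "}" that marks the end of the function. Or is there a simpler and prettier way? Thanks.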
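For what it's worth, the brace-matching idea from the last comment can be sketched in a few lines. This is only an illustration under stated assumptions: `find_function_end` is a hypothetical helper, it runs on the original source text (not on preprocessor output), it takes the start line reported by pycparser, and it assumes braces are balanced in every `#if` branch and never hidden inside macros. A depth counter is enough here; an explicit stack is not needed because only one kind of bracket is tracked.

```python
def find_function_end(source, start_line):
    """Return the 1-based line of the '}' that closes the first '{'
    found at or after start_line, or None if there is no such brace."""
    depth = 0
    line = 1
    state = "code"  # code, line_comment, block_comment, string or char
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if ch == "\n":
            line += 1
            if state == "line_comment":
                state = "code"
        elif state == "code":
            if ch == "/" and nxt == "/":
                state, i = "line_comment", i + 1
            elif ch == "/" and nxt == "*":
                state, i = "block_comment", i + 1
            elif ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
            elif line >= start_line:
                if ch == "{":
                    depth += 1
                elif ch == "}" and depth > 0:
                    depth -= 1
                    if depth == 0:
                        return line
        elif state == "block_comment":
            if ch == "*" and nxt == "/":
                state, i = "code", i + 1
        elif state in ("string", "char"):
            if ch == "\\":
                if nxt == "\n":
                    line += 1  # escaped newline still advances the line count
                i += 1  # skip the escaped character
            elif (state == "string" and ch == '"') or (state == "char" and ch == "'"):
                state = "code"
        i += 1
    return None
```

Braces inside comments, string literals and character literals are skipped, which is the main thing a naive character count would get wrong.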