Python: how to keep a string unchanged in one block but change it in another

python regex logic

I have a file that contains two blocks which differ slightly from each other. Here is the content of the file:

 Other codes in the file
 function void one(int x)
     message_same
     rest of the code
  endfunction
 Other codes in the file
  function void othercheck ::two(int x)
      message_same
      rest of the code
   endfunction

   Different codes in the file

I read this file into a list, make some changes, and want to write the result to another file.

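If "message_same" appears under function one, it should be written as is. If it appears under function two, that line should be removed, i.e. not written to the output file. All other lines of code should stay as they are.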

Expected output:

 Other codes in the file
 virtual function void one(int x)
 message_same
 rest of the code
 endfunction
 Other codes in the file

  function void two:: othercheck(int x)
  rest of the code
  endfunction  
  Different codes in the file

I tried the following code:

for word in words:
     found_one_function=re.search('virtual function',word)
     if found_in_function :
        found_in_end=re.search('endfunction',word)
        if not found_in_end:
            found_in_function=True

   while(found_in_function):
           fw.write(word)
           continue

   if re.search('message_same', word):
        continue

   fw.write(word)

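I understand that this is logically wrong, but I am not sure how to keep iterating after finding the virtual function until I reach endfunction.

Any help would be great.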

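Here is one way to remove the "message_same" line from every function whose signature contains "function" and "::". This assumes the input file has a very consistent structure: blocks are separated by a blank line, the signature is the first line of its block, and "message_same" is the second.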

# read file into list of lists (each inner list is a block)
with open('code_blocks.txt', 'r') as f:
    blocks = [block.split('\n') for block in f.read().split('\n\n')]

# iterate over blocks
for block in blocks:
    # if the first line contains 'function' and '::' and the second line contains 'message_same'
    if len(block) > 1 and 'function' in block[0] and '::' in block[0] and 'message_same' in block[1]:
        # remove the message_same line by position
        del block[1]

# combine list of lists back into single string and write it out
with open('code_blocks_out.txt', 'w') as f:
    f.write('\n\n'.join(['\n'.join(block) for block in blocks]))
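To make the structural assumption concrete, here is a minimal in-memory sketch of the same block-splitting idea. The function name `strip_message` and the sample text are hypothetical, chosen only for illustration; the sample deliberately puts a blank line between the two functions, because that is what `split('\n\n')` relies on.

```python
def strip_message(text):
    """Drop the second line of any blank-line-separated block whose first
    line contains both 'function' and '::' (hypothetical helper)."""
    blocks = [block.split('\n') for block in text.split('\n\n')]
    for block in blocks:
        if (len(block) > 1 and 'function' in block[0]
                and '::' in block[0] and 'message_same' in block[1]):
            del block[1]
    return '\n\n'.join('\n'.join(block) for block in blocks)


# Assumed sample: one blank line between the two functions.
sample = (
    "function void one(int x)\n"
    "    message_same\n"
    "    rest of the code\n"
    "endfunction\n"
    "\n"
    "function void othercheck ::two(int x)\n"
    "    message_same\n"
    "    rest of the code\n"
    "endfunction"
)
result = strip_message(sample)
print(result)
```

If the signature is not the first line of its block, as in the file shown in the question, nothing is removed, so this approach only suits input laid out as described.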

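This is relatively easy to do. What you want is to iterate over your `words` list (assuming each element contains one line of the sample data), check for the start of the second "type" of function, and then drop the lines containing `message_same` until you reach the matching `endfunction`, like: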

# assuming `words` list with each line of your data
# if not it's as easy as: with open("input.txt") as f: words = [line for line in f]
with open("output.txt", "w") as f:  # open output.txt for writing
    in_function = False  # an identifier to tell us we are within a `::` function
    for line in words:  # iterate over words
        if in_function:  # we are inside of a `::` function...
            if line.strip() == "endfunction":  # end of the function
                in_function = False
            elif "message_same" in line:  # skip this line
                continue
        # detect function begin if there is "function" in the line followed with ::
        elif "function" in line and line.find("function") < line.find("::"):
            in_function = True
        f.write(line)  # write the line to the output file
        # f.write("\n")  # uncomment if the lines in your `words` are not terminated

It will produce an `output.txt` containing:

 Other codes in the file
 function void one(int x)
     message_same
     rest of the code
  endfunction
 Other codes in the file
  function void othercheck ::two(int x)
      rest of the code
   endfunction

   Different codes in the file

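You can have any number of functions, and they do not need to be in any particular order; the processing is applied only to those that have `::` in their signature.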
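The flag-based loop can also be sketched as a function that works on a list of lines and returns the kept lines, which makes it easy to try without touching any files. The name `drop_message_in_scoped_functions`, the `marker` parameter and the sample list are assumptions made for this illustration only.

```python
def drop_message_in_scoped_functions(lines, marker="message_same"):
    """Return the lines, skipping `marker` lines inside functions whose
    signature has 'function' followed by '::' (hypothetical helper)."""
    kept = []
    in_function = False  # True while inside a `::` function
    for line in lines:
        if in_function:
            if line.strip() == "endfunction":
                in_function = False  # leaving the function; keep this line
            elif marker in line:
                continue  # drop the marker line
        elif "function" in line and line.find("function") < line.find("::"):
            in_function = True
        kept.append(line)
    return kept


# Assumed sample data, one element per line as in the `words` list.
sample = [
    "Other codes in the file\n",
    "function void one(int x)\n",
    "    message_same\n",
    "    rest of the code\n",
    "endfunction\n",
    "function void othercheck ::two(int x)\n",
    "    message_same\n",
    "    rest of the code\n",
    "endfunction\n",
    "Different codes in the file\n",
]
result = drop_message_in_scoped_functions(sample)
print("".join(result), end="")
```

Returning a list instead of writing directly keeps the filtering logic separate from file handling, so the result can be passed to `writelines` once the output looks right.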

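Iterate over each line in the file; use a flag to track whether the process is inside a `::` function; use that flag to discard the `message_same` lines; modify lines as needed; write each line to the new file: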
import re

special = re.compile(r'function.*?::')
in_special_func = False
with open(in_filepath) as in_file, open(out_filepath, 'w') as out_file:
    for line in in_file:
        if special.search(line):
            in_special_func = True
        if 'endfunction' in line:
            in_special_func = False
        if in_special_func and 'message_same' in line:
            #skip
            continue
        # make line modifications here if needed
        # line = modify(line)
        # line = some_variation_of(line)
        # print(line)
        out_file.write(line)


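Alternatively, process the file as a single string.
Construct a regular expression that captures complete functions:
f_re = re.compile(r'function.*?endfunction', flags = re.DOTALL)

Construct a regular expression that identifies the special functions:
special = re.compile(r'function.*?::')

Construct a regular expression that matches the line to be removed:
message_same = re.compile(r'^\s*message_same\s*\n', flags = re.MULTILINE)

Read the file into a string: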
with open(in_filepath) as in_file:
   s = in_file.read()

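Iterate over all functions; if a function is special, remove the line; make any other changes to the function; write the result to the file. Using `f_re.sub` with a callback keeps the text between functions in the output:
def process(match):
    f = match.group(0)
    if special.search(f):
        f = message_same.sub('', f)
    # make other changes here
    # assuming the result is a single string
    return f

with open(out_filepath, 'w') as out_file:
    # every function match is replaced by its processed version
    out_file.write(f_re.sub(process, s))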
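A minimal in-memory sketch of this whole-string approach, assuming a shortened version of the sample text from the question. It applies the three patterns through `re.sub` with a callback, so that text outside the functions is carried over unchanged; the names `clean_function`, `sample` and `cleaned` are hypothetical.

```python
import re

# The three patterns described above.
f_re = re.compile(r'function.*?endfunction', flags=re.DOTALL)
special = re.compile(r'function.*?::')
message_same = re.compile(r'^\s*message_same\s*\n', flags=re.MULTILINE)


def clean_function(match):
    """Callback for f_re.sub: strip message_same only from `::` functions."""
    body = match.group(0)
    if special.search(body):
        body = message_same.sub('', body)
    return body


# Assumed sample text, standing in for the contents of the input file.
sample = (
    "Other codes in the file\n"
    "function void one(int x)\n"
    "    message_same\n"
    "    rest of the code\n"
    "endfunction\n"
    "Other codes in the file\n"
    "function void othercheck ::two(int x)\n"
    "    message_same\n"
    "    rest of the code\n"
    "endfunction\n"
    "Different codes in the file\n"
)
cleaned = f_re.sub(clean_function, sample)
print(cleaned, end="")
```

Because `special` is compiled without `re.DOTALL`, its `.` does not cross line breaks, so a function counts as special only when `::` appears on the same line as the word `function`.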

Does the file contain only these two functions, or should this run in an iterative mode, removing message_same from every second function it encounters?

It contains several such functions; one and two are just the function names. The main difference between the two function types is the use of "::".

So only functions that contain :: in their signature should have message_same stripped, if encountered?

@zwer Yes, that is correct.

Hi, I actually have words=[]; my list already contains everything, but I need to remove the "same_message" line in the functions with "::" and keep the message in functions that start with virtual function.

I updated the code to handle functions with "::" in the signature instead of the word "two".

From what you describe, you have only one requirement: remove "message_same" from any function with "::". In all other cases, do nothing. I believe my answer does that. Is there anything else that needs to be done?