Python: how to keep a line unchanged in one block but change it in another
I have a file containing two blocks that differ slightly from each other. Here is the content of the file:
Other codes in the file
function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void othercheck ::two(int x)
message_same
rest of the code
endfunction
Different codes in the file
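I read this file into a list, made some changes, and want to write the result to another file.
However, if `message_same` appears under function one, it should be written as-is; if it appears under function two, that line should be removed, i.e. not written to the output file. All other lines should stay as they are.
Expected output: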
Other codes in the file
virtual function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void two:: othercheck(int x)
rest of the code
endfunction
Different codes in the file
I tried the following code:
for word in words:
    found_one_function=re.search('virtual function',word)
    if found_in_function :
        found_in_end=re.search('endfunction',word)
        if not found_in_end:
            found_in_function=True
            while(found_in_function):
                fw.write(word)
                continue
    if re.search('message_same', word):
        continue
    fw.write(word)
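I understand that this is logically wrong, but I am not sure how to keep iterating after finding the virtual function until I reach endfunction.
Any help would be great.

Here is one way to remove the `message_same` line from every function whose signature contains `function` and `::`. It assumes that the input file has a very consistent structure: blocks are separated by blank lines, and `message_same` is the second line of its block.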
# read the file into a list of lists (each inner list is one block)
with open('code_blocks.txt', 'r') as f:
    blocks = [block.split('\n') for block in f.read().split('\n\n')]
# iterate over blocks
for block in blocks:
    # if the first line contains 'function' and '::' and the second line contains 'message_same'
    if len(block) > 1 and 'function' in block[0] and '::' in block[0] and 'message_same' in block[1]:
        # remove the message_same line (the second line of the block)
        del block[1]
# combine the list of lists back into a single string and write it out
with open('code_blocks_out.txt', 'w') as f:
    f.write('\n\n'.join('\n'.join(block) for block in blocks))
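The snippet above depends on `message_same` being exactly the second line of a block. A minimal sketch that relaxes that one assumption, run on an in-memory string instead of a file, might look like the following; `strip_message` and the `sample` text are illustrative names and data, not part of the original answer, and blocks are still assumed to be separated by blank lines.

```python
def strip_message(text, marker="message_same"):
    """Drop `marker` lines from blank-line-separated blocks whose first line has '::'."""
    out_blocks = []
    for block in text.split("\n\n"):
        lines = block.split("\n")
        # Only blocks whose signature line contains both 'function' and '::' are edited.
        if "function" in lines[0] and "::" in lines[0]:
            lines = [ln for ln in lines if marker not in ln]
        out_blocks.append("\n".join(lines))
    return "\n\n".join(out_blocks)


# Hypothetical sample input with a blank line between the two blocks.
sample = (
    "function void one(int x)\n"
    "message_same\n"
    "endfunction\n"
    "\n"
    "function void othercheck ::two(int x)\n"
    "message_same\n"
    "rest of the code\n"
    "endfunction"
)
result = strip_message(sample)
print(result)
```

Filtering by content rather than by position means the marker is removed wherever it sits inside a `::` block, at the cost of also removing any other line in that block that happens to contain the marker text.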
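
This is relatively easy to do: iterate over your `words` list (assuming each element holds one line of your sample data), watch for the start of the second "type" of function, and then drop lines containing `message_same` until you reach the matching `endfunction`, for example: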
# assuming `words` list with each line of your data
# if not it's as easy as: with open("input.txt") as f: words = [line for line in f]
with open("output.txt", "w") as f:  # open output.txt for writing
    in_function = False  # an identifier to tell us we are within a `::` function
    for line in words:  # iterate over words
        if in_function:  # we are inside of a `::` function...
            if line.strip() == "endfunction":  # end of the function
                in_function = False
            elif "message_same" in line:  # skip this line
                continue
        # detect function begin if there is "function" in the line followed with ::
        elif "function" in line and line.find("function") < line.find("::"):
            in_function = True
        f.write(line)  # write the line to the output file
        # f.write("\n")  # uncomment if the lines in your `words` are not terminated
For an input file containing:
Other codes in the file
function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void othercheck ::two(int x)
message_same
rest of the code
endfunction
Different codes in the file
it will produce output.txt containing:
Other codes in the file
function void one(int x)
message_same
rest of the code
endfunction
Other codes in the file
function void othercheck ::two(int x)
rest of the code
endfunction
Different codes in the file
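You can have any number of functions, and they do not need to be in any particular order; the processing is applied only to those with `::` in their signature.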
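The same flag-based loop can be wrapped in a small generator so it can be tried without touching any files. This is only a sketch: `filter_lines` and its `marker` parameter are hypothetical names, and `words` here is a hand-built list mirroring the sample data from the question.

```python
def filter_lines(lines, marker="message_same"):
    """Yield lines, skipping `marker` lines inside functions whose signature has '::'."""
    in_function = False  # True while we are inside a `::` function
    for line in lines:
        if in_function:
            if line.strip() == "endfunction":
                in_function = False
            elif marker in line:
                continue  # drop the marker line inside a `::` function
        elif "function" in line and line.find("function") < line.find("::"):
            in_function = True
        yield line


words = [
    "Other codes in the file\n",
    "function void one(int x)\n",
    "message_same\n",
    "rest of the code\n",
    "endfunction\n",
    "Other codes in the file\n",
    "function void othercheck ::two(int x)\n",
    "message_same\n",
    "rest of the code\n",
    "endfunction\n",
    "Different codes in the file\n",
]
cleaned = list(filter_lines(words))
print("".join(cleaned), end="")
```

Because it is a generator, the same function can be fed an open file object directly, for example `out_file.writelines(filter_lines(in_file))`.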
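
Iterate over each line in the file; use a flag to track whether the process is inside a `::` function; use that flag to discard the `message_same` line; modify lines as needed; write each line to the new file.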
import re
# in_filepath and out_filepath are placeholders for your input and output paths
special = re.compile(r'function.*?::')
in_special_func = False
with open(in_filepath) as in_file, open(out_filepath, 'w') as out_file:
    for line in in_file:
        if special.search(line):
            in_special_func = True
        if 'endfunction' in line:
            in_special_func = False
        if in_special_func and 'message_same' in line:
            # skip this line
            continue
        # make line modifications here if needed
        # line = modify(line)
        # line = some_variation_of(line)
        # print(line)
        out_file.write(line)
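Alternatively, process the whole file at once with regular expressions.
Construct a regular expression that captures a complete function:
f_re = re.compile(r'function.*?endfunction', flags=re.DOTALL)
Construct a regular expression that identifies the special functions:
special = re.compile(r'function.*?::')
Construct a regular expression that matches the line to be removed:
message_same = re.compile(r'^\s*message_same\s*\n', flags=re.MULTILINE)
Read the file into a string:
with open(in_filepath) as in_file:
    s = in_file.read()
Process every function: if it is special, remove the line, then make any other changes. Substituting with a callback, rather than writing out the `findall` results, keeps the text between functions in the output:
def process(match):
    f = match.group(0)
    if special.search(f):
        f = message_same.sub('', f)
    # make other changes here
    # assuming the result is a single string
    return f

with open(out_filepath, 'w') as out_file:
    out_file.write(f_re.sub(process, s))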
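One detail of the `message_same` pattern above may be worth checking against your data: `\s` also matches newlines, so blank lines directly around the removed line are swallowed along with it. The sketch below compares that pattern with a stricter variant that only allows spaces and tabs; `loose`, `strict`, and the `text` sample are illustrative names and data, not part of the original answer.

```python
import re

# Pattern as written in the answer: \s may also consume neighbouring newlines.
loose = re.compile(r'^\s*message_same\s*\n', flags=re.MULTILINE)
# Assumed alternative: restrict the surrounding whitespace to spaces and tabs.
strict = re.compile(r'^[ \t]*message_same[ \t]*\n', flags=re.MULTILINE)

# Hypothetical function body with blank lines around the marker line.
text = "function a ::b()\n\n  message_same\n\nrest\nendfunction\n"

loose_result = loose.sub('', text)
strict_result = strict.sub('', text)

print(repr(loose_result))
print(repr(strict_result))
```

If the blank lines in your file are meaningful, the stricter variant removes only the marker line itself; if they are not, the original pattern is fine as it stands.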

Does the file contain only these two functions, or should this run in an iterative mode, removing `message_same` from every second function it encounters?
It contains several such functions; `one` and `two` are just the names of the functions. The main difference between the two kinds of function is the use of `::`.
So only functions with `::` in their signature should have `message_same` stripped, if it is present?
@zwer Yes, that's correct.
Hi, I actually have words = [], a list that already contains everything, but I need to strip the "same_message" line from functions with `::` and keep the message in functions that start with virtual function.
I updated the code to key on `::` in the function signature instead of the word "two".
From what you describe, you have a single requirement: remove `message_same` from any function with `::`, and do nothing in all other cases. I believe my answer does that. Is there anything else that needs to be done?