Extracting text between [- and -] with Python (regex, Python 2.7)


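I am writing a script that extracts data from one file and splits it into multiple files. The content for each output file is delimited by five "@" characters.

For example: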

@@@@@

hello

@@@@@

world

@@@@@

In this case, "hello" should end up in one file and "world" in another.

I am using Python. The following does the job:

with open('a.txt') as r:  # open the source file
    parts = r.read().split('@@@@@')  # read the contents and split them on '@@@@@'
    new = [item.strip() for item in parts if item.strip()]  # strip whitespace and drop empty pieces

for i, item in enumerate(new):  # number each piece, starting at 0 (the default)
    with open('a%s.txt' % (i + 1), 'w') as w:  # one file per piece: a1.txt, a2.txt, ...
        w.write(item)  # write the current piece to its file
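
To see why empty pieces have to be filtered out, it helps to look at what str.split returns for the sample input. This is a minimal sketch, assuming the file content is exactly the example above and ends with a newline; the variable names are illustrative only.

```python
# Assumed file content: the example from the question, ending with a newline.
text = "@@@@@\nhello\n@@@@@\nworld\n@@@@@\n"

# Splitting on the delimiter leaves an empty string before the first
# delimiter and a lone newline after the last one.
parts = text.split("@@@@@")
print(parts)  # ['', '\nhello\n', '\nworld\n', '\n']

# Filtering on the stripped value drops both the empty string and the
# whitespace-only piece, so no empty output file would be created.
blocks = [p.strip() for p in parts if p.strip()]
print(blocks)  # ['hello', 'world']

# enumerate(..., 1) starts numbering at 1, giving a1.txt, a2.txt, ...
names = ["a%d.txt" % n for n, _ in enumerate(blocks, 1)]
print(names)  # ['a1.txt', 'a2.txt']
```

Filtering on `p` alone would keep the trailing '\n' piece, which becomes an empty string after stripping.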

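This reads the file I called "a.txt" and generates files named a1.txt, a2.txt, and so on.

If I understand your requirement correctly, you want to take input from a file that uses @@@@@ as a delimiter: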
@@@@@
hello
@@@@@
world
@@@@@

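This should produce one file for each block between the delimiters: one containing
hello

and one containing
world

You can use re.split to get the pieces: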
splits = re.split("[@]{5}\n", input_buffer)

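Note that this pattern assumes each delimiter is followed by a newline, so the newline is consumed as part of the split. The resulting pieces can still be empty or carry a trailing newline. To keep only the actual text, strip each piece and skip the empty ones.
The output file names were not specified either, so I used: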
for index, val in enumerate([i.strip() for i in splits if i]):
    with open("output%d" % index, "w+") as f:
        f.write(val)

This creates files named output0 through outputN. The complete script:
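import re
import StringIO  # Python 2 module; on Python 3 use io.StringIO instead

# Sample input standing in for the contents of the source file
input_text = '''@@@@@
hello
@@@@@
world
@@@@@
'''
string_file = StringIO.StringIO(input_text)
input_buffer = string_file.read()

# Split on five '@' characters followed by a newline
splits = re.split("[@]{5}\n", input_buffer)
# Strip each piece, skip the empty ones, and write each to output<index>
for index, val in enumerate([i.strip() for i in splits if i]):
    with open("output%d" % index, "w+") as f:
        f.write(val)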
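
As a variation on the script above, the same logic can be wrapped in a function that writes into a chosen directory and reports which files it created. This is only a sketch: `write_blocks`, `out_dir`, and `prefix` are hypothetical names that do not appear in the original answer, and the temporary directory is used purely so the example leaves nothing behind in the working directory.

```python
import os
import re
import tempfile


def write_blocks(text, out_dir, prefix="output"):
    """Write each non-empty block of `text` to its own file; return the paths."""
    # Same pattern as in the answer: five '@' followed by a newline.
    blocks = [p.strip() for p in re.split("[@]{5}\n", text) if p.strip()]
    paths = []
    for index, block in enumerate(blocks):
        path = os.path.join(out_dir, "%s%d" % (prefix, index))
        with open(path, "w") as f:
            f.write(block)
        paths.append(path)
    return paths


# Usage with the sample input from the question.
out_dir = tempfile.mkdtemp()
paths = write_blocks("@@@@@\nhello\n@@@@@\nworld\n@@@@@\n", out_dir)
print([os.path.basename(p) for p in paths])  # ['output0', 'output1']
```

Returning the paths makes it easy to check the result or pass the files on to a later step.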

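This is only a starting point: you can obviously split on a different regular expression, change the output names to something more suitable, and so on.
Also, if the text actually sits between [- and -], as the title of this question says, re.findall can be used instead: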
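# Continues the script above (re and StringIO are already imported)
input_text = '''[-hello-]
[-world-]
'''
string_file = StringIO.StringIO(input_text)

input_buffer = string_file.read()
# Non-greedy (.*?) stops at the first "-]", so several blocks on one line stay separate
splits = re.findall(r"\[-(.*?)-\]", input_buffer)
for index, val in enumerate(splits):
    with open("output%d" % index, "w+") as f:
        f.write(val)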
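
How the capture group is quantified matters once the input is less regular than the two-line sample. The sketch below compares a greedy group, a non-greedy group, and a non-greedy group with re.DOTALL. The sample string is an assumption made up for illustration, with two blocks on one line and one block spanning two lines.

```python
import re

# Hypothetical input: two blocks on the first line, then a block that
# spans two lines.
sample = "[-hello-] [-world-]\n[-multi\nline-]\n"

# Greedy: (.*) runs to the last "-]" on the line and merges both blocks.
greedy = re.findall(r"\[-(.*)-\]", sample)
print(greedy)  # ['hello-] [-world']

# Non-greedy: (.*?) stops at the first "-]", but '.' still does not
# match a newline, so the two-line block is skipped.
lazy = re.findall(r"\[-(.*?)-\]", sample)
print(lazy)  # ['hello', 'world']

# Non-greedy with DOTALL: '.' also matches newlines, so the block that
# spans lines is captured as well.
lazy_dotall = re.findall(r"\[-(.*?)-\]", sample, re.DOTALL)
print(lazy_dotall)  # ['hello', 'world', 'multi\nline']
```

Which variant fits depends on whether a block may contain line breaks in the real data, which the question does not say.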

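Please show us the code you have now. Which part of the program are you having trouble with?
Pretty sure [@]{5}\n will not match the final "@@@@@". Perhaps better: [@]{5}\n?
Or simply drop the newline from the pattern and let strip() do the work.
@brianpck is right, I assumed the file ends with a newline.
@nijeeshjosh I added a comment to every line. I hope that clarifies things.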
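
The point raised in the comments can be checked directly. The following sketch compares the strict pattern with one where the newline is optional, on input that does not end with a newline. `split_blocks` is a hypothetical helper name, and both sample strings are assumptions based on the example in the question.

```python
import re


def split_blocks(text):
    """Return the stripped, non-empty blocks delimited by five '@' characters."""
    # The trailing newline is optional, so a final delimiter at the very
    # end of the input is still recognised.
    pieces = re.split(r"@{5}\n?", text)
    return [p.strip() for p in pieces if p.strip()]


with_newline = "@@@@@\nhello\n@@@@@\nworld\n@@@@@\n"
without_newline = "@@@@@\nhello\n@@@@@\nworld\n@@@@@"

# The strict pattern leaves the last delimiter inside the final piece
# when the input does not end with a newline.
strict = re.split("[@]{5}\n", without_newline)
print(strict)  # ['', 'hello\n', 'world\n@@@@@']

# With the optional newline, both inputs give the same blocks.
print(split_blocks(with_newline))     # ['hello', 'world']
print(split_blocks(without_newline))  # ['hello', 'world']
```

Dropping the newline from the pattern entirely and relying on strip(), as the comment also suggests, gives the same blocks for these two inputs.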