A better way to split a file with separate position markers in Python

I have a file of the following form:

--- part0 ---
some
strings
--- part1 ---
some other
strings
--- part2 ---
...
I would like to get any part of the file as a Python list:

x = get_part_of_file(part=0)
print(x) # => should print ['some', 'strings']
x = get_part_of_file(part=1)
print(x) # => should print ['some other', 'strings']
So my question is: what is the simplest way to implement the get_part_of_file function used above?

My (ugly) solution is as follows:

import re

def get_part_of_file(part, separate_str="part"):
    def does_match_to_separate(line, part_num):
        # True if the line is the marker of the given part, e.g. "--- part0 ---"
        return re.search("{}.*{}".format(separate_str, part_num), line) is not None

    def get_first_line_num_appearing_separate_str(lines, part_num):
        # index of the first marker line for part_num, or len(lines) if none
        for line_num, line in enumerate(lines):
            if does_match_to_separate(line, part_num):
                return line_num
        return len(lines)

    with open("my_file.txt") as f:
        lines = f.read().splitlines()

    # get first line number of the required part (the line after its marker)
    first_line_num = get_first_line_num_appearing_separate_str(lines, part) + 1
    # the part ends just before the marker of the next part
    last_line_num = get_first_line_num_appearing_separate_str(lines, part + 1)
    return lines[first_line_num:last_line_num]

You can use a regular expression to parse the string. See the example below.

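The only problem you might run into is that the regex pattern currently only covers word characters, spaces and newlines (\w\s). If the values of your parts contain other characters, you will have to extend the pattern to match them.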
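To make that limitation concrete, here is a minimal sketch. The names NARROW, WIDER and covers, as well as the sample strings, are assumptions made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical value pattern limited to word characters and whitespace,
# as described above.
NARROW = re.compile(r"[\w\s]+")
# One possible extension that also allows some punctuation. Which
# characters to add depends on what the file actually contains.
WIDER = re.compile(r"[\w\s.,;:!?']+")

def covers(pattern, value):
    """Return True if the pattern matches the whole value."""
    return pattern.fullmatch(value) is not None

print(covers(NARROW, "some other\nstrings\n"))    # True
print(covers(NARROW, "some, other strings!\n"))   # False
print(covers(WIDER, "some, other strings!\n"))    # True
```

Be careful when widening the class: if it also admits "-", a value could swallow the "--- partN ---" marker lines themselves.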

Using re.split, you can write something like the following:

>>> import re
>>> input_file = open('input', 'r')
>>> content = input_file.read()
>>> content_parts = re.split(r'.+?part\d+.+?\n', content)

>>> content_parts
['', 'some\nstrings\n', 'some other\nstrings\n', '']

>>> [ part.splitlines() for part in content_parts if part ]
[['some', 'strings'], ['some other', 'strings']]
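If you need to look parts up by number instead of by position, one possible variation is to put a capturing group around the digits, so that re.split keeps the part numbers in its result. This is only a sketch: the function name split_into_parts and the sample string are assumptions, and the pattern is deliberately as loose as the one above.

```python
import re

def split_into_parts(content):
    """Map each part number to the list of lines that follow its marker."""
    # Because of the capturing group, re.split returns
    # [text_before_first_marker, number, body, number, body, ...]
    pieces = re.split(r"^.*?part(\d+).*(?:\n|\Z)", content, flags=re.MULTILINE)
    numbers, bodies = pieces[1::2], pieces[2::2]
    return {int(n): body.splitlines() for n, body in zip(numbers, bodies)}

sample = (
    "--- part0 ---\nsome\nstrings\n"
    "--- part1 ---\nsome other\nstrings\n"
)
parts = split_into_parts(sample)
print(parts[0])  # ['some', 'strings']
print(parts[1])  # ['some other', 'strings']
```

Any content line that happens to contain "part" followed by digits would be treated as a marker, so a stricter pattern may be preferable for real files.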
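import re

# your_regex_pattern must define the named groups 'part_number' and 'part_value';
# text is the content of the file as a single string.
parts = re.finditer(your_regex_pattern, text)

for p in parts:
    print("Part %s: %s" % (p.group('part_number'), p.group('part_value')))
    # or return the element with the part-number you want.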
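The snippet above leaves the pattern itself open. As a rough sketch of what it could look like for the file format in the question, the following uses a hypothetical PART_PATTERN with the two named groups. It matches the body lazily up to the next marker rather than restricting it to \w\s, and get_part_of_file takes the text as an argument; both choices are assumptions and not necessarily what the original answer used.

```python
import re

# Hypothetical pattern: one named group for the marker's number and one
# for everything up to the next marker (or the end of the text).
PART_PATTERN = re.compile(
    r"^--- part(?P<part_number>\d+) ---\n"   # marker line
    r"(?P<part_value>.*?)"                    # body, matched lazily
    r"(?=^--- part\d+ ---$|\Z)",              # stop before next marker or at end
    flags=re.MULTILINE | re.DOTALL,
)

def get_part_of_file(part, text):
    """Return the lines of the requested part, or None if it is absent."""
    for match in PART_PATTERN.finditer(text):
        if int(match.group("part_number")) == part:
            return match.group("part_value").splitlines()
    return None

text = (
    "--- part0 ---\nsome\nstrings\n"
    "--- part1 ---\nsome other\nstrings\n"
    "--- part2 ---\n"
)
print(get_part_of_file(0, text))  # ['some', 'strings']
print(get_part_of_file(1, text))  # ['some other', 'strings']
```

To read from disk as in the question, the text would first be loaded with something like open("my_file.txt").read().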