Python: decrementing the counter of an iterable obtained with enumerate after calling seek

I am using Python to read a file that contains sections introduced by lines starting with the "#" character:

#HEADER1, SOME EXTRA INFO
data first section
1 2
1 233 
...
// THIS IS A COMMENT
#HEADER2, SECOND SECTION
452
134
// ANOTHER COMMENT
...
#HEADER3, THIRD SECTION
Now, I have written the following code to read the file:

with open(filename) as fh:

    enumerated = enumerate(iter(fh.readline, ''), start=1)

    for lino, line in enumerated:

        # handle special section
        if line.startswith('#'):

            print("="*40)
            print(line)

            while True:

                start = fh.tell()
                lino, line = next(enumerated)

                if line.startswith('#'):
                    fh.seek(start)
                    break

                print("[{}] {}".format(lino,line))
The output is:

========================================
#HEADER1, SOME EXTRA INFO

[2] data first section

[3] 1 2

[4] 1 233 

[5] ...

[6] // THIS IS A COMMENT

========================================
#HEADER2, SECOND SECTION

[9] 452

[10] 134

[11] // ANOTHER COMMENT

[12] ...

========================================
#HEADER3, THIRD SECTION
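As you can see, the line counter lino is no longer valid, because I am using seek. Decrementing it before breaking out of the loop does not help either, because the counter is incremented on every call to next. So, is there an elegant way to solve this in Python 3.x? Also, is there a better way to handle StopIteration without putting a pass statement in an except block?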

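Update

So far I have adopted an implementation based on the suggestion from @Dunes. I had to change it a little so that I can look ahead and see whether a new section is starting. I don't know if there is a better way to do this, so please comment:

class EnumeratedFile: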

    def __init__(self, fh, lineno_start=1):
        self.fh = fh
        self.lineno = lineno_start

    def __iter__(self):
        return self

    def __next__(self):
        result = self.lineno, self.fh.readline()
        if result[1] == '':
            raise StopIteration

        self.lineno += 1
        return result

    def mark(self):
        self.marked_lineno = self.lineno
        self.marked_file_position = self.fh.tell()

    def recall(self):
        self.lineno = self.marked_lineno
        self.fh.seek(self.marked_file_position)

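    def section(self):
        # Peek at the next character without consuming it.
        pos = self.fh.tell()
        char = self.fh.read(1)
        self.fh.seek(pos)
        # False at the start of a new '#' header and at end of file ('').
        return char not in ('', '#')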
Then I read the file and process each section as follows:

# create enumerated object
e = EnumeratedFile(fh)

header = ""
for lineno, line in e:

    print("[{}] {}".format(lineno, line))

    header = line.rstrip()

    # HEADER1
    if header.startswith("#HEADER1"):

        # process header 1 lines
        while e.section():

            # get node line
            lineno, line = next(e)
            # do whatever needs to be done with the line

    elif header.startswith("#HEADER2"):

        # etc.
        pass

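You cannot alter the counter of an enumerate() iterable, no.

You don't need to here at all, and you don't need to seek either. Instead, use nested loops and buffer the section header: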

with open(filename) as fh:
    enumerated = enumerate(fh, start=1)
    header = None
    for lineno, line in enumerated:
        # seek to first section
        if header is None:
            if not line.startswith('#'):
                continue
            header = line

        print("=" * 40)
        print(header.rstrip())
        for lineno, line in enumerated:
            if line.startswith('#'):
                # new section
                header = line
                break

            # section line, handle as such
            print("[{}] {}".format(lineno, line.rstrip()))
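This buffers only the header line; each time we come across a new header, it is stored and the current section loop ends.

The third section remains unprocessed because it contains no lines, but had there been any, the header variable would already have been set.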
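
One caveat with the nested-loop version above: when the inner loop breaks on a new header, the outer for statement fetches the next line itself, so the first data line of every section after the first is consumed without being handled. The following is a minimal single-loop sketch that avoids this, assuming the same "#" header convention. The name iter_sections and the (header line number, header, lines) grouping are illustrative choices, not part of the original answer. It keeps enumerate() as the only counter and also yields a trailing section that has no lines.

```python
from io import StringIO


def iter_sections(fh, start=1):
    """Yield (header_lineno, header, [(lineno, line), ...]) per '#' section.

    Hypothetical helper: lines before the first header are skipped.
    """
    header = None
    header_lineno = None
    body = []
    for lineno, line in enumerate(fh, start=start):
        line = line.rstrip("\n")
        if line.startswith("#"):
            # A new header closes the previous section, if there was one.
            if header is not None:
                yield header_lineno, header, body
            header, header_lineno, body = line, lineno, []
        elif header is not None:
            body.append((lineno, line))
    # Emit the last section, even when it has no lines.
    if header is not None:
        yield header_lineno, header, body


demo = StringIO(
    "#HEADER1, SOME EXTRA INFO\n"
    "data first section\n"
    "1 2\n"
    "#HEADER2, SECOND SECTION\n"
    "452\n"
    "134\n"
    "#HEADER3, THIRD SECTION\n"
)
sections = list(iter_sections(demo))
for header_lineno, header, body in sections:
    print("=" * 40)
    print(header)
    for lineno, line in body:
        print("[{}] {}".format(lineno, line))
```

The trade-off is that each section is collected in memory before it is yielded, which is fine for small sections but may not suit very large ones.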

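You can copy an iterator and then restore it from that copy. However, you cannot copy file objects. You could take a shallow copy of the enumerator and then seek to the corresponding part of the file when you start using the copied enumerator.

However, the best approach is to write a generator class with a __next__ method that produces the line numbers and lines, plus mark and recall methods that record a state and return to it: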

class EnumeratedFile:

    def __init__(self, fh, lineno_start=1):
        self.fh = fh
        self.lineno = lineno_start

    def __iter__(self):
        return self

    def __next__(self):
        result = self.lineno, next(self.fh)
        self.lineno += 1
        return result

    def mark(self):
        self.marked_lineno = self.lineno
        self.marked_file_position = self.fh.tell()

    def recall(self):
        self.lineno = self.marked_lineno
        self.fh.seek(self.marked_file_position)
You can use it like this:

from io import StringIO
demo = StringIO('''\
#HEADER1, SOME EXTRA INFO
data first section
1 2
1 233 
...
// THIS IS A COMMENT
#HEADER2, SECOND SECTION
452
134
// ANOTHER COMMENT
...
#HEADER3, THIRD SECTION
''')

e = EnumeratedFile(demo)
seen_header2 = False
for lineno, line, in e:
    if seen_header2:
        print(lineno, line)
        assert (lineno, line) == (2, "data first section\n")
        break
    elif line.startswith("#HEADER1"):
        e.mark()
    elif line.startswith("#HEADER2"):
        e.recall()
        seen_header2 = True
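
Note that the demo above runs on io.StringIO. With a text file returned by open(), CPython disables tell() once the file has been advanced with next() and raises OSError, which is presumably why the updated class in the question reads with readline() instead. The sketch below illustrates the difference on a temporary file; the file contents and the variable names are made up for illustration.

```python
import os
import tempfile

# Write a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("#HEADER1\nline a\nline b\n")
    path = tmp.name

try:
    with open(path) as fh:
        next(fh)  # advance with next(), as EnumeratedFile.__next__ above does
        try:
            fh.tell()
            tell_after_next_ok = True
        except OSError:
            # CPython reports "telling position disabled by next() call"
            tell_after_next_ok = False

    with open(path) as fh:
        fh.readline()          # read the header with readline() instead
        pos = fh.tell()        # tell() stays usable
        first = fh.readline()
        fh.seek(pos)           # rewind to the remembered position
        again = fh.readline()  # the same line is read a second time
finally:
    os.remove(path)

print(tell_after_next_ok, first == again)
```

So if mark() and recall() are meant to work on real files, reading lines with readline() inside __next__ and raising StopIteration on an empty string is one way to keep tell() and seek() available.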

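You cannot reset the enumerate() count, no. Mixing seeking and iteration is not a good idea in any case. What is the goal here? To number the lines in each section, with each new section starting at 1?

The goal is to tell the user which line numbers in the input file are problematic if an error occurs while reading. I could replace enumerate with a counter that I increment on every call to next and decrement whenever I call seek after finding a new section.

I don't see why you need to seek. Why not store the lines you have read in a buffer?

I don't want to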
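
The buffering idea raised in the comments can be sketched as a small iterator that lets the caller push a line back instead of seeking. This is only an illustration under stated assumptions: the class name PushbackLines and its push_back method are hypothetical and do not come from the thread. Because a pushed-back line keeps its original number, the counter never has to be decremented.

```python
class PushbackLines:
    """Hypothetical wrapper: numbered line iterator with push-back support."""

    def __init__(self, lines, start=1):
        self._it = iter(lines)
        self._next_lineno = start
        self._pushed = []  # stack of (lineno, line) pairs handed back by the caller

    def __iter__(self):
        return self

    def __next__(self):
        if self._pushed:
            return self._pushed.pop()
        line = next(self._it)  # StopIteration propagates at end of input
        lineno = self._next_lineno
        self._next_lineno += 1
        return lineno, line

    def push_back(self, lineno, line):
        """Return a line to the iterator so the next call yields it again."""
        self._pushed.append((lineno, line))


lines = PushbackLines(["#A\n", "1\n", "2\n", "#B\n", "3\n"])
seen = []
for lineno, line in lines:
    if line.startswith("#"):
        seen.append(("header", lineno, line.strip()))
        for lineno, line in lines:
            if line.startswith("#"):
                # Hand the header back so the outer loop sees it next.
                lines.push_back(lineno, line)
                break
            seen.append(("data", lineno, line.strip()))

print(seen)
```

Since the inner loop ends normally when the input runs out, no StopIteration handling is needed in the calling code.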