Python: wrap a generator so it needs a single "next" call instead of two steps (`__iter__` + `__next__`)
I receive an unknown number of records for background processing from a generator. If more important work comes along, I have to stop and release the process.
The `main` process is best described as:
def main():
    generator_source = generator_for_test_data()  # 1. contact server to get data.
    uw = UploadWrapper(generator_source)  # 2. wrap the data.
    while not interrupt():  # 3. check for interrupts.
        row = next(uw)
        if row is None:
            return
        print(long_running_job(row))  # 4. do the work.
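Is there a way to avoid having to plug `__iter__` into `__next__`?
Having two steps, (1) create an iterator and then (2) iterate over it, just seems clumsy.
In many cases I would prefer to submit a function to a function manager (mapreduce style), but in this case I need an instantiated class with some settings. Submitting a single function would therefore only work if that function is `__next__`.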
class UploadWrapper(object):
    def __init__(self, generator):
        self.generator = generator
        self._iterator = None

    def __iter__(self):
        for page in self.generator:
            yield from page.data

    def __next__(self):
        if self._iterator is None:  # ugly bit.
            self._iterator = self.__iter__()  #
        try:
            return next(self._iterator)
        except StopIteration:
            return None
Q: Is there a simpler way?
A working sample, added for completeness:
import time
import random


class Page(object):
    def __init__(self, data):
        self.data = data


def generator_for_test_data():
    for t in range(10):
        page = Page(data=[(t, i) for i in range(100, 110)])
        yield page


def long_running_job(row):
    time.sleep(random.randint(1, 10) / 100)
    assert len(row) == 2
    assert row[0] in range(10)
    assert row[1] in range(100, 110)
    return row


def interrupt():  # interrupt check
    if random.randint(1, 50) == 1:
        print("INTERRUPT SIGNAL!")
        return True
    return False


class UploadWrapper(object):
    def __init__(self, generator):
        self.generator = generator
        self._iterator = None

    def __iter__(self):
        for ft in self.generator:
            yield from ft.data

    def __next__(self):
        if self._iterator is None:
            self._iterator = self.__iter__()
        try:
            return next(self._iterator)
        except StopIteration:
            return None


def main():
    gen = generator_for_test_data()
    uw = UploadWrapper(gen)
    while not interrupt():  # check for job interrupt.
        row = next(uw)
        if row is None:
            return
        print(long_running_job(row))


if __name__ == "__main__":
    main()
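Your `UploadWrapper` seems overly complicated; there is more than one simpler solution.
My first thought is to ditch the class entirely and just use a function: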
def uploadwrapper(page_gen):
    for page in page_gen:
        yield from page.data
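Just replace `uw = UploadWrapper(gen)` with `uw = uploadwrapper(gen)` and it should work.
If you insist on keeping the class, you can get rid of `__next__()` and replace `uw = UploadWrapper(gen)` with `uw = iter(UploadWrapper(gen))`, and it will work.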
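In both cases you also have to catch `StopIteration` in the caller.
`__next__()` should raise `StopIteration` when it is done, not return `None` as yours does. Otherwise it will not work with anything that expects a well-behaved iterator, such as a `for` loop.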
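A minimal sketch of what catching `StopIteration` in the caller could look like with the generator-function version. `Page` here is a trimmed copy of the question's class, while `pages()`, `drain_with_try()` and `drain_with_default()` are hypothetical names made up for this illustration. The second loop uses the optional default argument of the built-in `next()`. That is not something the answer proposes, just a standard alternative that keeps the "returns `None` at the end" behavior without a custom `__next__()`.

```python
class Page:
    """Trimmed stand-in for the question's Page class."""
    def __init__(self, data):
        self.data = data


def pages():
    # Hypothetical test source: 2 pages with 3 rows each.
    for t in range(2):
        yield Page([(t, i) for i in range(3)])


def uploadwrapper(page_gen):
    # Flatten the pages into a single stream of rows.
    for page in page_gen:
        yield from page.data


def drain_with_try(rows):
    """Pull rows one at a time and stop cleanly on StopIteration."""
    seen = []
    while True:
        try:
            row = next(rows)
        except StopIteration:
            break
        seen.append(row)
    return seen


def drain_with_default(rows):
    """Same loop, using next()'s optional default as an end marker."""
    seen = []
    while True:
        row = next(rows, None)  # safe only because no real row is None
        if row is None:
            break
        seen.append(row)
    return seen
```

In the question's `main()`, the interrupt check would sit in the `while` condition in place of `True`. The default-value form is only appropriate when `None` can never be a legitimate row.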
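I think you may have some misunderstandings about how all of this is supposed to fit together, so I will do my best to explain how it is meant to work, as far as I understand it:
The point of `__iter__()` is that if you have, say, a list, you can get multiple independent iterators by calling `iter()` on it. When you write a `for` loop, you basically first get an iterator with `iter()` and then call `next()` on it at each loop iteration. If two nested loops use the same list, the iterators and their positions are still separate, so there is no conflict. `__iter__()` should return an iterator for the container it is defined on or, if it is called on an iterator, it should just return `self`. In that sense it is somewhat wrong for `UploadWrapper` not to return `self` in `__iter__()`, since it wraps a generator and therefore cannot really provide independent iterators.
As for why omitting `__next__()` works: a function that uses `yield` is a generator function, and the generator object it returns already has an `__iter__()` (which returns `self`, as it should) and a `__next__()` that do what you would expect. In your original code you are not really using `__iter__()` for its intended purpose at all: the code works even if you rename it to something else! That is because you never call `iter()` on the instance and instead call `next()` on it directly.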
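The iterator protocol described above can be checked with a few lines. This is only an illustrative sketch. `numbers` and `count_up()` are names invented for the example and do not appear in the question.

```python
numbers = [1, 2, 3]

# A list is a container: every iter() call returns a fresh, independent iterator.
first = iter(numbers)
second = iter(numbers)
iterators_are_distinct = first is not second

next(first)                  # advance only the first iterator
first_value = next(first)    # first is now one step ahead
second_value = next(second)  # second still starts at the beginning

# Nested loops over the same list do not interfere with each other,
# because each loop obtains its own iterator via iter().
nested_pairs = [(a, b) for a in numbers for b in numbers]


def count_up(n):
    # Using yield makes this a generator function.
    for i in range(n):
        yield i


gen = count_up(3)

# A generator is already an iterator: iter() hands back the same object,
# and next() works on it directly.
gen_is_own_iterator = iter(gen) is gen
first_from_gen = next(gen)
```

Because `iter(gen)` returns the generator itself, two loops over the same generator share one position, which is why a wrapper around a generator cannot hand out independent iterators.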
If you want to do it "properly" as a class, I think something like this might be enough:
class UploadWrapper(object):
    def __init__(self, generator):
        self.generator = generator
        self.subgen = iter(next(generator).data)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                return next(self.subgen)
            except StopIteration:
                self.subgen = iter(next(self.generator).data)
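A short usage sketch for the class-based version, showing that it supports both a direct `next()` call and ordinary iteration. The class is restated so the snippet runs on its own. `Page` is a trimmed copy of the question's class, and `pages()` is hypothetical test data.

```python
class Page:
    """Trimmed stand-in for the question's Page class."""
    def __init__(self, data):
        self.data = data


def pages():
    # Hypothetical test source: 2 pages with 2 rows each.
    for t in range(2):
        yield Page([(t, i) for i in range(2)])


class UploadWrapper:
    def __init__(self, generator):
        self.generator = generator
        # Eagerly load the first page's rows.
        self.subgen = iter(next(generator).data)

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):
        while True:
            try:
                return next(self.subgen)
            except StopIteration:
                # Current page is exhausted; move on to the next page.
                # When no pages remain, this next() raises StopIteration,
                # which propagates to the caller and ends the iteration.
                self.subgen = iter(next(self.generator).data)


uw = UploadWrapper(pages())
first_row = next(uw)  # a single next() call, no separate iter() step
remaining = list(uw)  # list() and for loops work as well
```

One caveat with this design: because `__init__` calls `next(generator)` right away, constructing the wrapper around an empty generator raises `StopIteration` at construction time.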
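You should get rid of `__next__` entirely. I wish I could have found that in the Python docs.
@root-11 I think it is also covered briefly in a couple of places in the docs; finding the right documentation can sometimes be surprisingly hard. Personally, I like to re-read, or at least skim, the official Python tutorial every few years. Each time I learn something new, either because I missed a detail the previous time or because new features were introduced in the meantime. (The tutorial may look a bit basic and slow at first, since it covers simple things like addition and `for` loops, but it gives a fairly thorough introduction to Python as a whole. Even the simple parts are worth reading, because you may be surprised to learn something new about things you could swear you already knew. For example, not everyone knows that the `for` loop mentioned earlier can have an `else` clause in Python! I bet most people would think "of course I know everything about `for` loops" even if they have never heard of `for`/`else`.)
@root-11: Speaking of not always finding the documentation you are looking for, I think I also wanted to link you to another page, but I could not remember its name and could not find it at the time. Anyway, I just stumbled on it again while looking for something else, so I might as well link it now! It is really handy because it documents many of the special (double-underscore) methods in one place. (It also states explicitly what the first link only implies, for example, that an iterator's `__iter__()`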