Python CouchDB pager reaches the recursion depth limit
I'm writing a pager that returns documents from an Apache CouchDB map function. This generator works fine until it hits the maximum recursion depth. How can I improve it to use iteration instead of recursion?
def page(db, view_name, limit, include_docs=True, **opts):
    """
    `page` returns all documents of a CouchDB map function. It accepts
    all options that `couchdb.Database.view` does; however, `include_docs`
    should be omitted, because this will interfere with things.

    >>> import couchdb
    >>> db = couchdb.Server()['database']
    >>> for doc in page(db, '_all_docs', 100):
    ...     doc
    # etc. etc.
    >>> del db['database']

    Notes on implementation:

    - `last_doc` is assigned on every loop, because there doesn't seem to
      be an easy way to know if something is the last item in the iteration.
    """
    last_doc = None
    for row in db.view(view_name,
                       limit=limit + 1,
                       include_docs=include_docs,
                       **opts):
        last_doc = row.key, row.id
        yield row.doc
    if last_doc:
        for doc in page(db, view_name, limit,
                        include_docs=include_docs,
                        startkey=last_doc[0],
                        startkey_docid=last_doc[1]):
            yield doc
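To see why the recursive shape blows up, here is a CouchDB-free sketch (the `docs` list and both function names are illustrative stand-ins, not part of the couchdb package): each page delegates the remaining pages to a nested generator, so one frame stays live per page, while the iterative version keeps a constant stack depth.

```python
def pages_recursive(docs, start, limit):
    # Mirrors the pager above: yield one batch, then recurse for the rest.
    batch = docs[start:start + limit]
    for d in batch:
        yield d
    if batch:
        # One extra generator frame per page -> RecursionError on deep data.
        for d in pages_recursive(docs, start + limit, limit):
            yield d

def pages_iterative(docs, limit):
    # The same traversal as a while loop: stack depth stays constant.
    start = 0
    while True:
        batch = docs[start:start + limit]
        if not batch:
            return
        for d in batch:
            yield d
        start += limit

docs = list(range(5000))
assert list(pages_iterative(docs, 1)) == docs   # fine at any depth
try:
    list(pages_recursive(docs, 0, 1))           # ~5000 nested frames
except RecursionError:
    print("recursive pager hit the recursion limit")
```

The answers below apply the same recursion-to-loop transformation to the CouchDB pager itself.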
Here is something to get you started. You didn't specify what `*opts` might be; if only `startkey` and `startkey_docid` are needed to restart the paging, and no other fields, you can drop the extra function.
Obviously, not tested:
def page_key(db, view_name, limit, startkey, startkey_docid, inc_docs=True):
    queue = [(startkey, startkey_docid)]
    while queue:
        key = queue.pop()
        next_key = None
        for i, row in enumerate(db.view(view_name,
                                        limit=limit + 1,
                                        include_docs=inc_docs,
                                        startkey=key[0],
                                        startkey_docid=key[1])):
            if i < limit:
                yield row.doc
            else:
                # The extra (limit+1)th row just marks where the next batch
                # starts; it is not yielded here, because `startkey` is
                # inclusive and it would otherwise be emitted twice.
                next_key = row.key, row.id
        if next_key:
            queue.append(next_key)


def page(db, view_name, limit, inc_docs=True, **opts):
    next_key = None
    for i, row in enumerate(db.view(view_name,
                                    limit=limit + 1,
                                    include_docs=inc_docs,
                                    **opts)):
        if i < limit:
            yield row.doc
        else:
            next_key = row.key, row.id
    if next_key:
        for doc in page_key(db, view_name, limit,
                            next_key[0], next_key[1], inc_docs):
            yield doc
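The limit+1 convention is easiest to check away from a live server. Below is a self-contained sketch (`FakeRow`, `ROWS`, and `fake_view` are stand-ins invented for this example, not couchdb-python APIs): fetch `limit + 1` rows, yield only the first `limit`, and use the extra row, if any, as the next `startkey`, so no document is emitted twice and the loop stops when no extra row comes back.

```python
from collections import namedtuple

# Stand-ins for a CouchDB view result; not part of the couchdb package.
FakeRow = namedtuple('FakeRow', 'key id doc')
ROWS = [FakeRow(k, 'doc%03d' % k, {'_id': 'doc%03d' % k}) for k in range(25)]

def fake_view(limit, startkey=None, startkey_docid=None):
    # CouchDB treats startkey as inclusive, so the fake does too.
    start = 0
    if startkey is not None:
        start = next(i for i, r in enumerate(ROWS)
                     if (r.key, r.id) == (startkey, startkey_docid))
    return ROWS[start:start + limit]

def page_all(limit):
    startkey = startkey_docid = None
    while True:
        rows = fake_view(limit + 1, startkey=startkey,
                         startkey_docid=startkey_docid)
        for row in rows[:limit]:        # yield at most `limit` docs
            yield row.doc
        if len(rows) <= limit:          # no extra row: this was the last page
            return
        # The (limit+1)th row is not yielded; it seeds the next request.
        startkey, startkey_docid = rows[limit].key, rows[limit].id

assert [d['_id'] for d in page_all(10)] == ['doc%03d' % k for k in range(25)]
```

The assertion at the end checks that every document comes back exactly once, in order, across three requests.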
Here is an alternative approach that I have (manually) tested on a database with >800k documents. It seems to work:
def page2(db, view_name, limit, inc_docs=True, **opts):
    def get_batch(**kw):
        return db.view(view_name, limit=limit + 1, include_docs=inc_docs, **kw)

    last_doc = None
    total_rows = db.view(view_name, limit=1).total_rows
    batches = (total_rows // limit) + 1  # `//`: plain `/` is float division on Python 3
    for _ in range(batches):
        if not last_doc:
            for row in get_batch():
                last_doc = row.key, row.id
                yield row.doc or row  # if include_docs is False, row.doc is None
        else:
            for i, row in enumerate(get_batch(startkey=last_doc[0],
                                              startkey_docid=last_doc[1])):
                last_doc = row.key, row.id
                if i:  # skip the first row: startkey is inclusive, so it was
                       # already yielded at the end of the previous batch
                    yield row.doc or row
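One detail worth flagging in the batch arithmetic: `(total_rows / limit) + 1` was integer division under Python 2, and it schedules one extra (empty) batch whenever `total_rows` divides evenly. A small sketch of the usual ceiling-division form (`num_batches` is a name made up for this example):

```python
def num_batches(total_rows, limit):
    # Ceiling division without floats: the smallest n with n * limit >= total_rows.
    return -(-total_rows // limit)

assert num_batches(25, 10) == 3
assert num_batches(30, 10) == 3   # no extra empty batch on even division
assert num_batches(0, 10) == 0
```

The extra empty batch in the original is harmless, just one wasted request.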
I don't use CouchDB, so I had some trouble following the example code. Here is a stripped-down version that I believe does what you're asking:
all_docs = range(0, 100)

def view(limit, offset):
    print("view: returning", limit, "rows starting at", offset)
    return all_docs[offset:offset + limit]

def generate_by_pages(page_size):
    offset = 0
    while True:
        rowcount = 0
        for row in generate_page(page_size, offset):
            rowcount += 1
            yield row
        if rowcount == 0:
            break
        else:
            offset += rowcount

def generate_page(page_size, offset):
    for row in view(page_size, offset):
        yield row

for r in generate_by_pages(10):
    print(r)
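One other standard way to turn recursion into iteration is a trampoline: the recursive step returns a zero-argument thunk instead of calling itself, and a driver loop keeps invoking thunks until a plain value comes back, so the stack never grows. A minimal sketch (`trampoline` and `count_pages` are illustrative names):

```python
def trampoline(step, *args):
    # Keep calling thunks until the step returns a non-callable final value.
    result = step(*args)
    while callable(result):
        result = result()
    return result

def count_pages(n, acc=0):
    # A stand-in for "process one page, then continue with the rest".
    if n == 0:
        return acc
    return lambda: count_pages(n - 1, acc + 1)

assert trampoline(count_pages, 100000) == 100000  # far past the recursion limit
```

Note that a trampoline computes a final value; combining it with a generator that yields per page takes extra plumbing, which is why the plain while loop above is usually the simpler choice here.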
The key is to replace recursion with iteration. There are plenty of ways to do that (I like trampolines in Python), but the above is simple.

I can't read this code. I'm not a big fan of parroting PEP 8, but please use at least 4-space indents.

This doesn't really answer the question, but a useful tip is that you can raise the limit with sys.setrecursionlimit().
Thanks @Rafe, I know, but since I'm returning results with hundreds of thousands of rows, I didn't want to kill the machine. @Glenn, I've updated the code formatting and added a doctest to explain how the function runs. Yes, I really love the sys module. We're getting married in the spring.