Python: node count mismatch in graph
Tags: python, mongodb, social-networking, data-visualization, networkx

I have a MongoDB (MDB) database that stores the following attributes for each forum post:
thread
author (posted in the thread)
children (a list of authors who replied to the post)
child_count (number of children in the list)
I am trying to build a graph containing the following nodes:
thread
author
child authors
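My database has more than 30,000 distinct authors, but the generated graph contains only about 3,000 author nodes. Put another way, out of roughly 33,000 expected nodes in total, the following code produces only about 5,000. What is going on?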
for doc in coll.find():
    thread = doc['thread'].encode('utf-8')
    author_parent = doc['author'].encode('utf-8')
    children = doc['children']
    children_count = len(children)
    #print G.nodes()
    #print post_parent, author, doc['thread']
    try:
        if thread in G:
            continue
        else:
            G.add_node(thread, color='red')
            thread_count+=1
        if author_parent in G:
            G.add_edge(author_parent, thread)
        else:
            G.add_node(author_parent, color='green')
            G.add_edge(author_parent, thread, weight=0)
            author_count+=1
        if doc['child_count']!=0:
            for doc in children:
                if doc['author'].encode("utf-8") in G:
                    print doc['author'].encode("utf-8"), 'in G'
                    G.add_edge(doc['author'].encode("utf-8"), author_parent)
                else:
                    G.add_node(doc['author'].encode("utf-8"),color='green')
                    G.add_edge(doc['author'].encode("utf-8"), author_parent, weight=0)
                    author_count+=1
    except:
        print "failed"
nx.write_dot(G,PATH)
print thread_count, author_count, children_count
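I found the answer. The continue statement was skipping to the next iteration, so I was losing many nodes: whenever a post's thread was already in G, the rest of the loop body never ran, and the author and children of that post were never added.

Are you sure coll.find() returns 30,000 results?

@brice coll.find() returns more than 500,000 results. There are about 30,000 distinct authors.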
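
The fix described in the answer could look roughly like the sketch below. It is not the original poster's final code. It is written for Python 3, so the .encode('utf-8') calls are dropped. The function name build_graph and the helper ensure_author are hypothetical, and graph is only assumed to behave like a networkx.Graph (membership test, add_node, add_edge). As a simplification, every edge gets weight=0, whereas the question's code set the weight only for newly added authors.

```python
def build_graph(docs, graph):
    """Add thread and author nodes for every post in docs to graph.

    graph is assumed to behave like a networkx.Graph: it supports
    `node in graph`, add_node(node, **attrs) and add_edge(u, v, **attrs).
    Returns (thread_count, author_count) for the newly added nodes.
    """
    thread_count = 0
    author_count = 0

    def ensure_author(name):
        # Add an author node once; return 1 if it was new, else 0.
        if name in graph:
            return 0
        graph.add_node(name, color='green')
        return 1

    for doc in docs:
        thread = doc['thread']
        author = doc['author']

        # Only the creation of the thread node is skipped for a known
        # thread. There is no `continue`, so the rest of the post is
        # still processed.
        if thread not in graph:
            graph.add_node(thread, color='red')
            thread_count += 1

        author_count += ensure_author(author)
        graph.add_edge(author, thread, weight=0)

        # A separate loop variable avoids shadowing `doc`.
        for child in doc.get('children', []):
            child_author = child['author']
            author_count += ensure_author(child_author)
            graph.add_edge(child_author, author, weight=0)

    return thread_count, author_count
```

Under these assumptions it would be called as build_graph(coll.find(), nx.Graph()). The sketch also leaves out the bare except from the question, which would otherwise hide any KeyError raised by a malformed document.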