Optimizing Cypher queries with many optional relationships
I'm using Cypher over the batch REST API of Neo4j 2.0.1. I'm trying to optimize my queries, which contain a lot of optional relationships. I'd like to retrieve all of the data in one shot to limit the number of round trips to the database. Even though I only have about 12000 nodes in my database, the queries are already starting to crawl (some take 1.5 seconds to return 1000 nodes). I've set up a graph gist that goes into more detail. My queries generally take the following form:
MATCH (u:user { id: "u1" })
WITH u
MATCH u-[:CONTACT]->(c:contact)
WITH u, c
OPTIONAL MATCH (c)-[:CREATED]->(xca:activity)<-[:USERACTIVITY]-(xcc:contact)
OPTIONAL MATCH (c)-[:HISTORY]->(xcu:activity)<-[:USERACTIVITY]-(xuc:contact)
OPTIONAL MATCH (c)-[:PHONE]->(xp:phone)
OPTIONAL MATCH (c)-[:ADDRESS]->(xa:address)
OPTIONAL MATCH (u)-[:PHONE]->(xup:phone)
OPTIONAL MATCH (u)-[:ADDRESS]->(xua:address)
WITH DISTINCT c AS x, u,
COLLECT(DISTINCT xp) AS xps,
COLLECT(DISTINCT xa) AS xas,
COLLECT(DISTINCT xup) AS xups,
COLLECT(DISTINCT xua) AS xuas,
xca.createdat AS createdat,
xcu.createdat AS updatedat,
{id: xcc.id} AS createdby,
{id: xuc.id} AS updatedby
RETURN COLLECT({
id: x.id,
name: COALESCE(u.name, x.name),
createdat: createdat,
createdby: createdby,
updatedat: updatedat,
updatedby: updatedby,
phones: (CASE WHEN size(xps)= 0
THEN NULL
ELSE [xp IN xps | { id: xp.id, number: xp.number}]
END),
userphones: (CASE WHEN size(xups)= 0
THEN NULL
ELSE [xup IN xups | { id: xup.id, number: xup.number }]
END),
addresses: (CASE WHEN size(xas)= 0
THEN NULL
ELSE [xa IN xas | { id: xa.id, city: xa.city}]
END),
useraddresses: (CASE WHEN size(xuas)= 0
THEN NULL
ELSE [xua IN xuas | { id: xua.id, city: xua.city}]
END)
}) AS r
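My database has 6000 nodes and 12000 relationships, and this query returns 1000 contacts (the whole database is 7 MB). Should this type of query take close to 400 ms?

I really appreciate the offer to look at my database, but what I'd really like to know is how to diagnose these problems myself. When I use the web UI I don't see an explosion (each result returns only one row), and when I use the PROFILE command I don't see numbers in the millions as you would expect.

Are there other tools for diagnosing performance problems? Is there some kind of debugger for tracking the issue down?

The problem is that you are creating a cross product between all of your matches. If you can identify the matches that have at most one connection, you can pull them up to the front. Otherwise, you can collect the matched information to get back to a cardinality of 1 (or the number of contacts, for that matter), e.g. as in the two rewritten queries at the end of this page.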
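To make the cross-product effect concrete, here is a minimal sketch in Python rather than Cypher, so it runs without a database. It models each OPTIONAL MATCH as extending every existing row with each matching neighbour, and compares stacking the matches against collecting after every step. The neighbour counts and the helper names (`optional_match`, `chained_rows`, `collected_rows`) are made up for illustration and are not part of Neo4j.

```python
# Hypothetical neighbours of a single contact, keyed by relationship type.
NEIGHBOURS = {
    "PHONE": ["p1", "p2", "p3"],
    "ADDRESS": ["a1", "a2"],
    "CREATED": ["act1"],
    "HISTORY": ["h1", "h2", "h3", "h4"],
}


def optional_match(rows, values):
    """Extend every row with each value; keep the row with None if there are no values."""
    options = values or [None]
    return [row + (value,) for row in rows for value in options]


def chained_rows(neighbours):
    """Stack the matches without aggregating, as in the original query."""
    rows = [()]
    for values in neighbours.values():
        rows = optional_match(rows, values)
    return rows


def collected_rows(neighbours):
    """Aggregate after every match so each step starts again from one row."""
    rows = [()]
    for values in neighbours.values():
        grouped = {}
        for row in optional_match(rows, values):
            # Group by all earlier columns and gather the newest column.
            grouped.setdefault(row[:-1], []).append(row[-1])
        rows = [
            prefix + (tuple(v for v in gathered if v is not None),)
            for prefix, gathered in grouped.items()
        ]
    return rows


if __name__ == "__main__":
    print(len(chained_rows(NEIGHBOURS)))    # 3 * 2 * 1 * 4 = 24 rows
    print(len(collected_rows(NEIGHBOURS)))  # 1 row holding four collections
```

With these made-up counts the stacked version produces 24 rows for one contact, while the collecting version never holds more than four rows at a time and ends with one. Multiplied over 1000 contacts, that difference is what the answer means by getting back to a cardinality of 1.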
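Thanks Michael, I went ahead and broke up the query, making sure each WITH statement returns only one row. However, the query is now slower, and when I test it against a larger data set it never actually finishes. I've updated my question with my full query; maybe I misunderstood your comment?

You didn't tell us that you were doing variable-length matches. You also didn't pull the phones and addresses up to the beginning. If you have these var-length matches, you have to separate them into their own MATCH fragment, otherwise it will blow up. Try building the query step by step and see when it explodes.

Did you see my graph gist?

I think your updated query is 10 times more complex than the first one, because you go deeper into the graph and also resolve a lot of additional information. If you can share your database I'm happy to take a look, but I think you are touching millions of paths.

The reason it wasn't returning was a typo: the direction arrows on lines 6 and 7 were missing, so the links became undirected. Making them directional allows the query to finish, but it still takes more than 2 seconds. I've stepped through each part of the query using the web UI, and each step returns only one row. Using the PROFILE command shows no sign of a path explosion. UserActivity nodes only have 1 relationship in the direction I'm searching, so (:activity) [...]

I've updated the question and removed all optional matches from the query, but I still can't get 1000 results in under 350 ms. I've also redone the query from my earlier code so that each WITH statement returns 1 result through the web UI.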
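One way to follow the advice to build the query step by step is to run every prefix of the query with a row count appended and watch where the count jumps. The sketch below only generates the query strings; `step_queries` is a hypothetical helper, and the strings still have to be sent through whatever client you use (the web UI or the REST API). It assumes each list entry is a complete clause, so that every prefix is itself a valid query.

```python
def step_queries(clauses):
    """Return one query per prefix of `clauses`, each ending in a row count."""
    queries = []
    for end in range(1, len(clauses) + 1):
        prefix = "\n".join(clauses[:end])
        queries.append(prefix + "\nRETURN count(*) AS rows")
    return queries


# Clauses taken from the first query in the question.
CLAUSES = [
    'MATCH (u:user { id: "u1" })',
    "MATCH (u)-[:CONTACT]->(c:contact)",
    "OPTIONAL MATCH (c)-[:PHONE]->(xp:phone)",
    "OPTIONAL MATCH (c)-[:ADDRESS]->(xa:address)",
]

if __name__ == "__main__":
    for number, query in enumerate(step_queries(CLAUSES), start=1):
        print(f"-- step {number}")
        print(query)
```

Checking the final result alone hides the problem, because the closing COLLECT always folds everything into one row. Counting rows before the aggregation shows how many intermediate rows each added clause produces.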
Updated full query from the edited question:
MATCH (u:user { id: "123" })
WITH u
MATCH (u)-[:CONTACT]->(c:contact)
WITH c
OPTIONAL MATCH
(c)-[:CREATED]->(xca:activity)<-[:USERACTIVITY*1..4]-(xcc:contact),
(c)-[:HISTORY]->(xcu:activity)<-[:USERACTIVITY*1..4]-(xuc:contact)
WITH c AS x,
xca.createdat AS createdat, xcu.createdat AS updatedat,
{id: xcc.id, object: xcc.object} AS createdby,
{id: xuc.id, object: xuc.object} AS updatedby
OPTIONAL MATCH
(x)-[:PHONE]->(xp:phone)
WITH x, createdat, updatedat, createdby, updatedby,
COLLECT(xp) as xps
OPTIONAL MATCH
(x)-[:ADDRESS]->(xa:address)
WITH x, createdat, updatedat, createdby, updatedby, xps,
COLLECT(xa) as xas
OPTIONAL MATCH (xu:user)-[:CONTACT]->(x)
OPTIONAL MATCH (xu)-[:PHONE]->(xup:phone)
WITH x, createdat, updatedat, createdby, updatedby, xps, xas,
xu, COLLECT(xup) as xups
OPTIONAL MATCH (xu)-[:ADDRESS]->(xua:address)
WITH x, createdat, updatedat, createdby, updatedby, xps, xas,
xu, xups, COLLECT(xua) as xuas
RETURN COLLECT({
id: x.id,
object: x.object,
status: x.status,
teamid: x.teamid,
name: COALESCE(xu.name, x.name),
displayname: COALESCE(xu.displayname, x.displayname),
email: COALESCE(xu.email, x.email),
imageurl: COALESCE(xu.imageurl, x.imageurl),
workhours: x.workhours,
notes: x.notes,
company: x.company,
createdat: createdat,
createdby: createdby,
updatedat: updatedat,
updatedby: updatedby,
isuser: (NOT xu IS NULL),
phones: (CASE WHEN size(xps)= 0
THEN NULL
ELSE [xp IN xps | { id: xp.id, object: xp.object,
number: xp.number, description: xp.description }]
END),
userphones: (CASE WHEN size(xups)= 0
THEN NULL
ELSE [xup IN xups | { id: xup.id, object: xup.object,
number: xup.number, description: xup.description }]
END),
addresses: (CASE WHEN size(xas)= 0
THEN NULL
ELSE [xa IN xas | { id: xa.id, object: xa.object,
street: xa.street, locality: xa.locality, region: xa.region,
postcode: xa.postcode, country: xa.country, description: xa.description, neighborhood: xa.neighborhood }]
END),
useraddresses: (CASE WHEN size(xuas)= 0
THEN NULL
ELSE [xua IN xuas | { id: xua.id, object: xua.object,
street: xua.street, locality: xua.locality, region: xua.region,
postcode: xua.postcode, country: xua.country, description: xua.description, neighborhood: xua.neighborhood }]
END)
}) AS r
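The query above contains two variable-length patterns (`*1..4`) in a single OPTIONAL MATCH. As a rough illustration of why that can touch millions of paths, the Python sketch below computes an upper bound on the number of paths for a given branching factor. The branching factor is a made-up parameter, not a measurement of this database, and the bound ignores Cypher's rule that a relationship is not traversed twice within one pattern.

```python
def max_paths(branching, min_hops=1, max_hops=4):
    """Upper bound on paths of min_hops..max_hops hops when every node
    offers `branching` candidate relationships to follow."""
    return sum(branching ** hops for hops in range(min_hops, max_hops + 1))


def combined_rows(branching):
    """Two independent variable-length patterns anchored on the same node
    multiply their path counts."""
    return max_paths(branching) ** 2


if __name__ == "__main__":
    for branching in (1, 3, 10):
        print(branching, max_paths(branching), combined_rows(branching))
```

With a single candidate relationship per hop the bound stays at 4 paths per pattern. With 10 candidates per hop, which is closer to what an undirected pattern can reach, it is 11110 paths per pattern and about 123 million combinations for one contact.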
Query with all optional matches removed:
MATCH (t:team {id:"123"})
WITH t
MATCH (c:contact)-[:CONTACT]->(t)
WITH c AS x
RETURN COLLECT({
id: x.id,
object: x.object,
status: x.status,
teamid: x.teamid,
name: x.name,
displayname: x.displayname,
email: x.email,
imageurl: x.imageurl,
workhours: x.workhours,
notes: x.notes,
company: x.company
}) AS r
Rewritten query from the answer, collecting the user's phones and addresses first:
MATCH (u:user { id: "u1" })
OPTIONAL MATCH (u)-[:PHONE]->(xup:phone)
OPTIONAL MATCH (u)-[:ADDRESS]->(xua:address)
// cardinality 1
WITH u, collect(distinct xup) as phones, collect(distinct xua) as addresses
MATCH (u)-[:CONTACT]->(c:contact)
WITH u, c, phones, addresses
OPTIONAL MATCH (c)-[:CREATED]->(xca:activity)<-[:USERACTIVITY]-(xcc:contact)
WITH u,c, phones,addresses, collect(distinct xcc) as contact_activities
...
The same approach with the user's data folded into a single map:
MATCH (u:user { id: "u1" })
OPTIONAL MATCH (u)-[:PHONE]->(xup:phone)
OPTIONAL MATCH (u)-[:ADDRESS]->(xua:address)
// cardinality 1
WITH u, {user:u, phones:collect(distinct xup), addresses: collect(distinct xua)} as user_info
MATCH (u)-[:CONTACT]->(c:contact)
WITH c, user_info
OPTIONAL MATCH (c)-[:CREATED]->(xca:activity)<-[:USERACTIVITY]-(xcc:contact)
WITH c, user_info, {activities: collect(distinct xcc)} as contact_info
...