Indexing: range queries on an indexed property
When querying a range on an indexed property, is there a maximum limit on the size of the range? To clarify, I have a timestamp property in milliseconds, which is indexed, and I am trying to get all the events that happened within one month. I have a query like this:
Match (e:Event)-[R:type{'has metadata'}]-> (S:EventMetaData) where e.type=~".*ELec.*" AND e.timestamp IN RANGE (1480550400000,1483228740000) return S.Location, sum(e.value) as sumV order by sumV DESC
But I get the following error:
Exception in thread "main" java.lang.OutOfMemoryError: Cannot index an collection of size 2678340001
at org.neo4j.cypher.internal.compiler.v3_2.commands.expressions.IndexedInclusiveLongRange.length(IndexedInclusiveLongRange.scala:51)
at scala.collection.SeqLike$class.size(SeqLike.scala:106)
at org.neo4j.cypher.internal.compiler.v3_2.commands.expressions.IndexedInclusiveLongRange.size(IndexedInclusiveLongRange.scala:30)
at scala.collection.mutable.Builder$class.sizeHint(Builder.scala:69)
at scala.collection.mutable.SetBuilder.sizeHint(SetBuilder.scala:20)
at scala.collection.TraversableLike$class.to(TraversableLike.scala:589)
at org.neo4j.cypher.internal.compiler.v3_2.commands.expressions.IndexedInclusiveLongRange.to(IndexedInclusiveLongRange.scala:30)
at scala.collection.TraversableOnce$class.toSet(TraversableOnce.scala:304)
at org.neo4j.cypher.internal.compiler.v3_2.commands.expressions.IndexedInclusiveLongRange.toSet(IndexedInclusiveLongRange.scala:30)
at org.neo4j.cypher.internal.compiler.v3_2.commands.indexQuery$.apply(indexQuery.scala:46)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.NodeIndexSeekPipe.internalCreateResults(NodeIndexSeekPipe.scala:48)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.Pipe$class.createResults(Pipe.scala:51)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.NodeIndexSeekPipe.createResults(NodeIndexSeekPipe.scala:29)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.PipeWithSource.createResults(Pipe.scala:79)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.PipeWithSource.createResults(Pipe.scala:79)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.PipeWithSource.createResults(Pipe.scala:79)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.PipeWithSource.createResults(Pipe.scala:79)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.PipeWithSource.createResults(Pipe.scala:79)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.PipeWithSource.createResults(Pipe.scala:79)
at org.neo4j.cypher.internal.compiler.v3_2.pipes.PipeWithSource.createResults(Pipe.scala:79)
at org.neo4j.cypher.internal.compiler.v3_2.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.createResults(DefaultExecutionResultBuilderFactory.scala:95)
at org.neo4j.cypher.internal.compiler.v3_2.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.build(DefaultExecutionResultBuilderFactory.scala:73)
at org.neo4j.cypher.internal.compiler.v3_2.BuildInterpretedExecutionPlan$$anonfun$getExecutionPlanFunction$1.apply(BuildInterpretedExecutionPlan.scala:99)
at org.neo4j.cypher.internal.compiler.v3_2.BuildInterpretedExecutionPlan$$anonfun$getExecutionPlanFunction$1.apply(BuildInterpretedExecutionPlan.scala:83)
at org.neo4j.cypher.internal.compiler.v3_2.BuildInterpretedExecutionPlan$$anon$1.run(BuildInterpretedExecutionPlan.scala:54)
at org.neo4j.cypher.internal.compatibility.v3_2.Compatibility$ExecutionPlanWrapper$$anonfun$run$1.apply(Compatibility.scala:96)
at org.neo4j.cypher.internal.compatibility.v3_2.Compatibility$ExecutionPlanWrapper$$anonfun$run$1.apply(Compatibility.scala:94)
at org.neo4j.cypher.internal.compatibility.v3_2.exceptionHandler$runSafely$.apply(exceptionHandler.scala:84)
at org.neo4j.cypher.internal.compatibility.v3_2.Compatibility$ExecutionPlanWrapper.run(Compatibility.scala:94)
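It is a bit odd that neo4j tries to allocate a collection of size endRange - startRange, as the error states. I know I can work around this by storing the timestamps in hours or days, but I would still like to know why range queries on an indexed property perform slowly in neo4j, and whether there is a maximum allowed range size.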
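P.S. I increased the neo4j heap and page cache sizes, but range queries on the indexed property are still slow.

You are trying to use a very inefficient technique (even if it worked) to test for a range. The RANGE function is defined to generate a collection of N+1 values (where N is the difference between the upper and lower bounds of the range), and the IN operation then has to compare against every item in that collection (in the worst case).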
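As a rough illustration in Python (not Neo4j code), the arithmetic behind the error can be checked directly: an inclusive range over the two bounds from the question has exactly the 2678340001 elements named in the exception. The comparison with the 32-bit signed integer maximum is an assumption about why the size is rejected; the stack trace only shows that `IndexedInclusiveLongRange.length` refuses it. The helper names `range_length` and `in_bounds` are made up for this sketch.

```python
# Assumption: the size limit is the JVM's 32-bit signed int maximum.
INT_MAX = 2**31 - 1

# Bounds taken from the query in the question (epoch milliseconds).
lower = 1480550400000
upper = 1483228740000


def range_length(lo, hi):
    """Number of values an inclusive integer range lo..hi would contain."""
    return hi - lo + 1


def in_bounds(ts, lo, hi):
    """Range test with two comparisons per value and no collection at all."""
    return lo <= ts <= hi


n = range_length(lower, upper)
print(n)            # 2678340001, the size reported in the exception
print(n > INT_MAX)  # True
```

The second helper shows the alternative: the membership test needs only the two bounds, so its cost does not depend on how wide the range is.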
You should change your query slightly so that only two numeric comparisons are made per row:
MATCH (e:Event)-[R:type{'has metadata'}]-> (S:EventMetaData)
WHERE e.type=~".*ELec.*" AND 1480550400000 <= e.timestamp <= 1483228740000
RETURN S.Location, sum(e.value) AS sumV
ORDER BY sumV DESC;
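If the bounds are generated in client code rather than typed by hand, a small helper can compute them. The following Python sketch assumes the timestamps are UTC epoch milliseconds (the question does not say which time zone is used), and `month_bounds_ms` is a hypothetical helper name. It returns a half-open interval, so the matching predicate would be `start_ms <= e.timestamp < end_ms`.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt):
    """Whole milliseconds between the Unix epoch and an aware datetime."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def month_bounds_ms(year, month):
    """Return (start, end) in epoch ms for a calendar month in UTC.

    start is inclusive; end is the first instant of the next month (exclusive).
    """
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    # month // 12 is 1 only for December, which rolls over to the next year.
    next_start = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=timezone.utc)
    return to_epoch_ms(start), to_epoch_ms(next_start)


start_ms, end_ms = month_bounds_ms(2016, 12)
print(start_ms, end_ms)  # 1480550400000 1483228800000
```

Under the UTC assumption, the lower bound matches the literal in the query, while the query's upper bound of 1483228740000 is 60000 ms (one minute) earlier than the end of the month.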
Thanks! Actually, that was the first version of my query: I was using ">" and "<" in the query before I switched to Range, and yes, it performs better. But the thing is, I am trying to study the effect of indexing on query performance, and using ">" and "<" …