Neo4j Spatial: hints about the batch inserter
This is my scenario: we are building a routing system using Neo4j and the Spatial plugin. We start from an OSM file, read it, and import nodes and relationships into our graph (a custom graph model).

Without Neo4j's batch inserter, importing a compressed OSM file (about 140 MB compressed, about 2 GB uncompressed) takes around 3 days on a dedicated server with these characteristics: CentOS 6.5 64-bit, quad core, 8 GB RAM. Note that most of the time is spent creating Neo4j nodes and relationships. In fact, if we read the same file without touching Neo4j, it is read in about 7 minutes (I am sure of this because our process first reads the file to store the correct OSM node IDs and then reads it again to create the Neo4j graph).

Obviously we need to improve the import process, so we are trying to use the BatchInserter. So far, so good (I still need to measure how it performs with the BatchInserter, but I expect it to be faster). So the first thing I did was to try the batch inserter in a simple test case (very similar to our code, but without modifying our code directly).

These are my software versions:
- Neo4j: 2.0.2
- Neo4j Spatial: 0.13-neo4j-2.0.1
- Neo4j Graph Collections: 0.7.1-neo4j-2.0.1
- Osmosis: 0.43.1
public class BatchInserterSinkTest implements Sink
{
    public static final Map<String, String> NEO4J_CFG = new HashMap<String, String>();
    private static File basePath = new File("/home/angelo/Scrivania/neo4j");
    private static File dbPath = new File(basePath, "db");
    private GraphDatabaseService graphDb;
    private BatchInserter batchInserter;
    // private BatchInserterIndexProvider batchIndexService;
    private SpatialDatabaseService spatialDb;
    private SimplePointLayer spl;

    static
    {
        NEO4J_CFG.put( "neostore.nodestore.db.mapped_memory", "100M" );
        NEO4J_CFG.put( "neostore.relationshipstore.db.mapped_memory", "300M" );
        NEO4J_CFG.put( "neostore.propertystore.db.mapped_memory", "400M" );
        NEO4J_CFG.put( "neostore.propertystore.db.strings.mapped_memory", "800M" );
        NEO4J_CFG.put( "neostore.propertystore.db.arrays.mapped_memory", "10M" );
        NEO4J_CFG.put( "dump_configuration", "true" );
    }

    @Override
    public void initialize(Map<String, Object> arg0)
    {
        batchInserter = BatchInserters.inserter(dbPath.getAbsolutePath(), NEO4J_CFG);
        graphDb = new SpatialBatchGraphDatabaseService(batchInserter);
        spatialDb = new SpatialDatabaseService(graphDb);
        spl = spatialDb.createSimplePointLayer("testBatch", "latitudine", "longitudine");
        //batchIndexService = new LuceneBatchInserterIndexProvider(batchInserter);
    }

    @Override
    public void complete()
    {
        // TODO Auto-generated method stub
    }

    @Override
    public void release()
    {
        // TODO Auto-generated method stub
    }

    @Override
    public void process(EntityContainer ec)
    {
        Entity entity = ec.getEntity();
        if (entity instanceof Node) {
            Node osmNodo = (Node)entity;
            org.neo4j.graphdb.Node graphNode = graphDb.createNode();
            graphNode.setProperty("osmId", osmNodo.getId());
            graphNode.setProperty("latitudine", osmNodo.getLatitude());
            graphNode.setProperty("longitudine", osmNodo.getLongitude());
            spl.add(graphNode);
        } else if (entity instanceof Way) {
            //do something with the way
        } else if (entity instanceof Relation) {
            //do something with the relation
        }
    }
}
When I execute this code, I get the following exception:
Exception in thread "Thread-1" java.lang.ClassCastException: org.neo4j.unsafe.batchinsert.SpatialBatchGraphDatabaseService cannot be cast to org.neo4j.kernel.GraphDatabaseAPI
at org.neo4j.cypher.ExecutionEngine.<init>(ExecutionEngine.scala:113)
at org.neo4j.cypher.javacompat.ExecutionEngine.<init>(ExecutionEngine.java:53)
at org.neo4j.cypher.javacompat.ExecutionEngine.<init>(ExecutionEngine.java:43)
at org.neo4j.collections.graphdb.ReferenceNodes.getReferenceNode(ReferenceNodes.java:60)
at org.neo4j.gis.spatial.SpatialDatabaseService.getSpatialRoot(SpatialDatabaseService.java:76)
at org.neo4j.gis.spatial.SpatialDatabaseService.getLayer(SpatialDatabaseService.java:108)
at org.neo4j.gis.spatial.SpatialDatabaseService.containsLayer(SpatialDatabaseService.java:253)
at org.neo4j.gis.spatial.SpatialDatabaseService.createLayer(SpatialDatabaseService.java:282)
at org.neo4j.gis.spatial.SpatialDatabaseService.createSimplePointLayer(SpatialDatabaseService.java:266)
at it.eng.pinf.graph.batch.test.BatchInserterSinkTest.initialize(BatchInserterSinkTest.java:46)
at org.openstreetmap.osmosis.xml.v0_6.XmlReader.run(XmlReader.java:95)
at java.lang.Thread.run(Thread.java:744)
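So now I am wondering: how can I use the BatchInserter in my case? I have to add the created nodes to the SimplePointLayer, so how can I create it through the batch inserter graph database service?
Is there a simple little sample?
Any tip is much appreciated.
Cheers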
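Angelo

The OSMImporter class in the Neo4j Spatial code has an example of importing OSM data with the batch inserter. The main point is that Neo4j Spatial does not really support the batch inserter, so you need to do some things manually. If you look at the class OSMImporter.OSMBatchWriter, you will see how it works. It does not use SimplePointLayer at all, because that class does not support the batch inserter. Instead it creates the graph structure it needs directly. The simple point layer is very simple, certainly much simpler than the OSM model created by the code I am referring to, so I think you should be able to write a batch-inserter-compatible version yourself without too much trouble.

I suggest you use the batch inserter to create the layer and the nodes with the right graph structure, then switch to the normal embedded API and use it to iterate over the nodes and add them to the spatial index.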
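The two-phase flow suggested above can be sketched roughly as follows. This is only an illustration of the control flow, not Neo4j code: BulkWriter, PointIndex and MemoryStore are hypothetical in-memory stand-ins so that the sketch runs without Neo4j on the classpath. In a real import, phase 1 would presumably go through the BatchInserter, and phase 2 through spl.add(node) on a database opened with the normal embedded API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: bulk-write the nodes first, build the spatial index afterwards. */
class TwoPhaseImportSketch {

    /** Hypothetical stand-in for the write-only batch inserter (phase 1). */
    interface BulkWriter {
        long createNode(Map<String, Object> properties);
        void shutdown();
    }

    /** Hypothetical stand-in for the point layer used via the embedded API (phase 2). */
    interface PointIndex {
        void add(long nodeId, double lat, double lon);
    }

    /** In-memory store standing in for the database files on disk. */
    static class MemoryStore implements BulkWriter {
        final Map<Long, Map<String, Object>> nodes = new LinkedHashMap<>();
        boolean open = true;
        private long nextId = 0;

        public long createNode(Map<String, Object> properties) {
            if (!open) throw new IllegalStateException("writer already shut down");
            long id = nextId++;
            nodes.put(id, new LinkedHashMap<>(properties));
            return id;
        }

        public void shutdown() { open = false; }
    }

    /** Phase 1: write raw nodes with their coordinates; no spatial index is touched. */
    static void bulkLoad(BulkWriter writer, double[][] latLon) {
        for (int i = 0; i < latLon.length; i++) {
            Map<String, Object> props = new LinkedHashMap<>();
            props.put("osmId", (long) i);
            props.put("latitudine", latLon[i][0]);
            props.put("longitudine", latLon[i][1]);
            writer.createNode(props);
        }
        writer.shutdown(); // the bulk writer must be closed before the store is reopened
    }

    /** Phase 2: iterate the stored nodes and add each one to the index; returns the count. */
    static int indexAll(MemoryStore store, PointIndex index) {
        if (store.open) throw new IllegalStateException("finish the bulk load first");
        int count = 0;
        for (Map.Entry<Long, Map<String, Object>> e : store.nodes.entrySet()) {
            Map<String, Object> p = e.getValue();
            index.add(e.getKey(), (Double) p.get("latitudine"), (Double) p.get("longitudine"));
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        MemoryStore store = new MemoryStore();
        bulkLoad(store, new double[][] {{45.07, 7.69}, {41.90, 12.50}});
        List<Long> indexed = new ArrayList<>();
        int n = indexAll(store, (id, lat, lon) -> indexed.add(id));
        System.out.println("indexed " + n + " nodes: " + indexed);
    }
}
```

The point of the split is that the expensive node creation never touches the spatial layer, so the ClassCastException path through SpatialDatabaseService is avoided during the bulk load; indexing happens only after the bulk writer has been shut down.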