Titan: loading data using BatchGraph
I want to load 2.5 million vertices into Titan from a client application. I have a formatted txt file. The first line of this file is in the format id:12345,companyname:Abcd,country:Abcd,... (propertyname:propertyvalue,...).

I tried loading a 100-line sample into Titan from my client application using Rexster, and it succeeded.

For 2.5 million lines, I think BatchGraph is the best approach. For testing, I took just the first line and saved it as test.txt.

The following code compiled and ran successfully:
BaseConfiguration config = new BaseConfiguration();
config.setProperty("storage.backend", "inmemory");
config.setProperty("storage.hostname", "192.168.200.141");
config.setProperty("storage.port", "8182");
config.setProperty("storage.batch-loading", "true");
TitanGraph graph = null;
graph = TitanFactory.open(config);
BatchGraph bg = new BatchGraph(graph, VertexIDType.NUMBER, 1000);
Vertex currentNode = null;
String path = "c:\\test.txt";
Charset encoding = Charset.forName("ISO-8859-1");
List<String> lines = null;
try {
    lines = Files.readAllLines(Paths.get(path), encoding);
} catch (IOException e) {
    e.printStackTrace();
}
for (String line : lines) {
    currentNode = bg.addVertex(1);
    String[] values = line.split(",");
    for (String value : values) {
        String[] property = value.split(":");
        currentNode.setProperty(property[0].toString(), property[1].toString());
    }
    bg.commit();
}
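The loop above splits each line on commas and each pair on colons, which breaks as soon as a value contains a colon or a line ends with a trailing comma. As a minimal sketch of a more tolerant parser for the propertyname:propertyvalue format described in the question, the hypothetical helper `PropertyLineParser` below (not part of Titan or Blueprints) turns one line into an ordered map that could then be fed to `setProperty`. It assumes that values never contain commas.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Titan or Blueprints.
final class PropertyLineParser {

    // Parses "name:value,name:value,..." into a map that keeps insertion order.
    static Map<String, String> parse(String line) {
        Map<String, String> properties = new LinkedHashMap<>();
        for (String pair : line.split(",")) {
            if (pair.trim().isEmpty()) {
                continue; // tolerate empty segments, e.g. from a blank line
            }
            // Limit 2: only the first ':' separates the name from the value.
            String[] parts = pair.split(":", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed pair: " + pair);
            }
            properties.put(parts[0].trim(), parts[1].trim());
        }
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(parse("id:12345,companyname:Abcd,country:Abcd"));
    }
}
```

All values come back as strings, so a key declared with a different data type (such as the Integer `id` key) would still need an explicit conversion before it is set on a vertex.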
I have already set up the property keys and composite indexes through Gremlin:
mgmt = g.getManagementSystem()
id = mgmt.makePropertyKey('id').dataType(Integer.class).make()
companyname = mgmt.makePropertyKey('companyname').dataType(String.class).make()
country = mgmt.makePropertyKey('country').dataType(String.class).make()
mgmt.buildIndex('ni_id',Vertex.class).addKey(id).buildCompositeIndex()
mgmt.buildIndex('ni_companynamecountry',Vertex.class).addKey(companyname).addKey(country).buildCompositeIndex()
mgmt.buildIndex('ni_companyname',Vertex.class).addKey(companyname).buildCompositeIndex()
mgmt.buildIndex('ni_country',Vertex.class).addKey(country).buildCompositeIndex()
mgmt.commit()
g.getIndexedKeys(Vertex.class)
==>id
==>companyname
==>country
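With the Cassandra backend, loading from the txt file through Gremlin worked. But I still need to do it from my application. I changed
config.setProperty("storage.backend", "inmemory");
to
config.setProperty("storage.backend", "cassandra");
but when opening the connection (graph = TitanFactory.open(config);) I get this error: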
java.lang.IllegalArgumentException: Property Key with given name does not exist: id
at com.thinkaurelius.titan.graphdb.types.typemaker.DisableDefaultSchemaMaker.makePropertyKey(DisableDefaultSchemaMaker.java:27)
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.getOrCreatePropertyKey(StandardTitanTx.java:902)
at com.thinkaurelius.titan.graphdb.vertices.AbstractVertex.setProperty(AbstractVertex.java:239)
at com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph$BatchVertex.setProperty(BatchGraph.java:492)
at tr.com.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:213)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
18:26:15.503 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - About to instantiate class public com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy(int,int) with 2 arguments
18:26:15.509 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Instantiated RetryBackoffStrategy object com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy@52e6fdee from config string "com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy,1000,5000"
18:26:15.511 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - About to instantiate class public com.netflix.astyanax.retry.BoundedExponentialBackoff(long,long,int) with 3 arguments
18:26:15.512 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Instantiated RetryPolicy object com.netflix.astyanax.retry.BoundedExponentialBackoff@7ec7ffd3[maxSleepTimeMs=25000,MAX_SHIFT=30,random=java.util.Random@dd8ba08,baseSleepTimeMs=100,maxAttempts=8,attempts=0] from config string "com.netflix.astyanax.retry.BoundedExponentialBackoff,100,25000,8"
18:26:15.530 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Custom RetryBackoffStrategy com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy@52e6fdee
18:26:15.810 [main] INFO c.n.a.c.i.ConnectionPoolMBeanManager - Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=ClusterTitanConnectionPool,ServiceType=connectionpool
18:26:15.823 [main] INFO c.n.a.c.i.CountingConnectionPoolMonitor - AddHost: 192.168.200.141
18:26:16.851 [pool-4-thread-1] DEBUG c.n.astyanax.thrift.ThriftConverter - java.net.ConnectException: Connection refused: connect
18:26:25.832 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Failed to describe keyspace titan
18:26:25.832 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Creating keyspace titan...
18:26:26.853 [pool-4-thread-1] DEBUG c.n.astyanax.thrift.ThriftConverter - java.net.ConnectException: Connection refused: connect
18:26:35.848 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Failed to create keyspace titan
java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
at com.thinkaurelius.titan.diskstorage.Backend.getImplementationClass(Backend.java:421)
at com.thinkaurelius.titan.diskstorage.Backend.getStorageManager(Backend.java:361)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1275)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:93)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:73)
at tr.com.kale.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:196)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at tr.com.kale.titanbulk.TitanBulk.open(TitanBulk.java:68)
at tr.com.kale.titanbulk.TitanBulk.main(TitanBulk.java:52)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
... 13 more
Caused by: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:563)
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.<init>(AstyanaxStoreManager.java:283)
... 18 more
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=192.168.200.141(192.168.200.141):9160, latency=10002(10002), attempts=1]Timed out waiting for connection
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:338)
at com.netflix.astyanax.thrift.ThriftClusterImpl.executeSchemaChangeOperation(ThriftClusterImpl.java:146)
at com.netflix.astyanax.thrift.ThriftClusterImpl.internalCreateKeyspace(ThriftClusterImpl.java:321)
at com.netflix.astyanax.thrift.ThriftClusterImpl.addKeyspace(ThriftClusterImpl.java:294)
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:558)
... 19 more
java.lang.IllegalArgumentException: Graph may not be null
at com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph.<init>(BatchGraph.java:81)
at tr.com.kale.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:206)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at tr.com.kale.titanbulk.TitanBulk.open(TitanBulk.java:68)
at tr.com.kale.titanbulk.TitanBulk.main(TitanBulk.java:52)
18:35:18.296 [main] DEBUG c.t.t.d.c.t.t.CTConnectionFactory - Creating TSocket(192.168.200.141, 9160, null, null, 10000)
java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
at com.thinkaurelius.titan.diskstorage.Backend.getImplementationClass(Backend.java:421)
at com.thinkaurelius.titan.diskstorage.Backend.getStorageManager(Backend.java:361)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1275)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:93)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:73)
at tr.com.kale.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:196)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at tr.com.kale.titanbulk.TitanBulk.open(TitanBulk.java:68)
at tr.com.kale.titanbulk.TitanBulk.main(TitanBulk.java:52)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
... 13 more
Caused by: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.getCassandraPartitioner(CassandraThriftStoreManager.java:218)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.<init>(CassandraThriftStoreManager.java:196)
... 18 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeRawConnection(CTConnectionFactory.java:88)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:52)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:21)
at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1220)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.getCassandraPartitioner(CassandraThriftStoreManager.java:215)
... 19 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 25 more
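Both traces above end in a refused or timed-out connection to 192.168.200.141 on port 9160, so before looking at Titan itself it may help to confirm that the port is reachable from the client machine at all. The sketch below is a plain TCP check using only the JDK; the class name `PortCheck` and the 2000 ms timeout are assumptions made for illustration, and a successful connect only shows that something is listening, not that it is a correctly configured Cassandra Thrift endpoint.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper; uses only the JDK, not Titan or Astyanax.
final class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused" as well as connect timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the host and port that appear in the traces.
        String host = args.length > 0 ? args[0] : "192.168.200.141";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;
        System.out.println(host + ":" + port + " reachable: "
                + isReachable(host, port, 2000));
    }
}
```

If the check fails, the usual suspects are a Cassandra instance that is not running, one bound only to localhost, or a firewall between the two machines.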
@Test
public void bulkLoad() {
    // Open an in-memory graph with batch loading enabled.
    BaseConfiguration config = new BaseConfiguration();
    config.setProperty("storage.backend", "inmemory");
    config.setProperty("storage.batch-loading", "true");
    TitanGraph graph = TitanFactory.open(config);

    // Define the property key and its index before loading any data.
    TitanManagement mng = graph.getManagementSystem();
    if (mng.getPropertyKey("prop") == null) {
        PropertyKey pk = mng.makePropertyKey("prop").dataType(String.class).make();
        mng.buildIndex("prop_index", Vertex.class).addKey(pk).buildCompositeIndex();
    }
    mng.commit();

    // Wrap the graph in a BatchGraph with string vertex ids and a buffer size of 1000.
    BatchGraph bg = new BatchGraph(graph, VertexIDType.STRING, 1000);
    System.out.println("Start bulk loading");
    // Add vertices id1 .. id999, each with a distinct id.
    IntStream.range(1, 1000).forEach(i -> {
        Vertex v = bg.addVertex("id" + i);
        v.setProperty("prop", "prop" + i);
    });
    // Commit once after the loop rather than once per vertex.
    bg.commit();

    assertNotNull(bg.getVertex("id10"));
    assertEquals("prop10", bg.getVertex("id10").getProperty("prop"));
}