
Titan Hadoop Gremlin configuration problem

Tags: hadoop, apache-spark, titan, gremlin

I am trying to get Titan working with the TinkerPop 3.0.1 Hadoop-Gremlin integration by following the Titan documentation. I basically downloaded titan-1.0.0-hadoop1 from the download page.

I followed the documentation almost exactly; the only difference is that I am using an HBase-backed graph instead of the Cassandra-backed graph used in the documentation. I am confident my "titan-hbase-cluster.properties" file is correct, since I can read from and write to HBase without any problems.
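For reference, a minimal HBase-backed Titan properties file typically contains little more than the following; the hostname and cache values shown here are placeholders, not the poster's actual settings:

```properties
# Hypothetical sketch of conf/titan-hbase-cluster.properties;
# values are illustrative and must match the actual HBase cluster.
storage.backend=hbase
storage.hostname=127.0.0.1
cache.db-cache=true
cache.db-cache-time=180000
```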

So, following the Titan documentation, I issued the following commands in the Gremlin console:

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: aurelius.titan
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/src/titanTest/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/src/titanTest/lib/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
23:17:17 INFO  org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph  - HADOOP_GREMLIN_LIBS is set to: /usr/src/titanTest/lib
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.tinkergraph
gremlin> :load data/grateful-dead-titan-schema.groovy
==>true
==>true
gremlin> graph = TitanFactory.open('conf/titan-hbase-cluster.properties')
==>standardtitangraph[cassandrathrift:[127.0.0.1]]
gremlin> defineGratefulDeadSchema(graph)
==>null
gremlin> graph.close()
==>null
gremlin> hdfs.copyFromLocal('data/grateful-dead.kryo','data/grateful-dead.kryo')
23:22:46 WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
==>null
gremlin> graph = GraphFactory.open('conf/hadoop-graph/hadoop-load.properties')
==>hadoopgraph[gryoinputformat->nulloutputformat]
gremlin> blvp = BulkLoaderVertexProgram.build().writeGraph('conf/titan-cassandra.properties').create(graph)
==>BulkLoaderVertexProgram[bulkLoader=IncrementalBulkLoader,vertexIdProperty=bulkLoader.vertex.id,userSuppliedIds=false,keepOriginalIds=true,batchSize=0]
gremlin> graph.compute(SparkGraphComputer).program(blvp).submit().get()
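For context, the conf/hadoop-graph/hadoop-load.properties file shipped with Titan 1.0 configures HadoopGraph roughly as follows. This is a sketch from memory of the stock file, not the poster's actual configuration, and the Spark settings in particular may differ:

```properties
# Approximate contents of conf/hadoop-graph/hadoop-load.properties (Titan 1.0).
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.kryo
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true
spark.master=local[*]
spark.serializer=org.apache.spark.serializer.KryoSerializer
```

The `hadoopgraph[gryoinputformat->nulloutputformat]` banner in the console output above corresponds to the input/output format pair in this file, and the exception that follows is thrown from GryoInputFormat's record reader, which is why the inputLocation value is the setting most directly involved here.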
After running these commands, I get the following string of warnings and errors:

23:23:51 WARN  org.apache.tinkerpop.gremlin.hadoop.process.computer.spark.SparkGraphComputer  - class org.apache.hadoop.mapreduce.lib.output.NullOutputFormat does not implement PersistResultGraphAware and thus, persistence options are unknown -- assuming all options are possible
23:23:58 WARN  org.apache.hadoop.io.compress.snappy.LoadSnappy  - Snappy native library not loaded
23:23:58 ERROR org.apache.spark.executor.Executor  - Exception in task 0.0 in stage 0.0 (TID 0)
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.seekToHeader(GryoRecordReader.java:82)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.initialize(GryoRecordReader.java:74)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat.createRecordReader(GryoInputFormat.java:39)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:133)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:107)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
23:23:58 WARN  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.0 in stage 0.0 (TID 0, localhost): java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.seekToHeader(GryoRecordReader.java:82)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.initialize(GryoRecordReader.java:74)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat.createRecordReader(GryoInputFormat.java:39)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:133)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:107)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

23:23:58 ERROR org.apache.spark.scheduler.TaskSetManager  - Task 0 in stage 0.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:267)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.seekToHeader(GryoRecordReader.java:82)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.initialize(GryoRecordReader.java:74)
        at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat.createRecordReader(GryoInputFormat.java:39)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:133)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:107)
        at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)