
Apache Spark: Unable to test S3-backed HBase from Spark

Tags: apache-spark, amazon-s3, hbase, amazon-emr

I wrote a simple program that reads data from HBase. It works on a Cloudera cluster backed by HDFS,

but I get an exception when testing against EMR, where HBase is backed by S3.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Spark conf
SparkConf sparkConf = new SparkConf().setMaster("local[4]").setAppName("My App");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);

// HBase conf
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "localhost");
conf.set("hbase.zookeeper.property.client.port", "2181");

// Submit scan into HBase conf
// conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan));

conf.set(TableInputFormat.INPUT_TABLE, "mytable");
conf.set(TableInputFormat.SCAN_ROW_START, "startrow");
conf.set(TableInputFormat.SCAN_ROW_STOP, "endrow");

// Get an RDD of (row key, result) pairs from the table
JavaPairRDD<ImmutableBytesWritable, Result> source = jsc
        .newAPIHadoopRDD(conf, TableInputFormat.class,
                ImmutableBytesWritable.class, Result.class);

// Process RDD
System.out.println("&&&&&&&&&&&&&&&&&&&&&&& " + source.count());
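As an aside on the commented-out line in the snippet above: instead of the SCAN_ROW_START / SCAN_ROW_STOP string properties, a full Scan object can be serialized into the configuration. The minimal sketch below is not part of the original post; it reuses the conf variable from the snippet above and additionally needs org.apache.hadoop.hbase.client.Scan, org.apache.hadoop.hbase.util.Bytes and org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.

// Build the Scan programmatically and serialize it into the property that
// TableInputFormat reads back. convertScanToString throws IOException, so the
// enclosing method must declare or handle it.
Scan scan = new Scan();
scan.setStartRow(Bytes.toBytes("startrow"));
scan.setStopRow(Bytes.toBytes("endrow"));
conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan));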
Exception in thread "main" java.io.IOException: Cannot create a record reader because of a previous error. Please look at the previous logs lines from the task's full log for more details.
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:270)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:256)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:125)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1158)
    at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
    at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
    at hbascan.main(hbascan.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalStateException: The input format instance has not been properly initialized. Ensure you call initializeTable either in your constructor or initialize method
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getTable(TableInputFormatBase.java:652)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:265)
    ... 20 more

With all Apache HBase libraries: ...e.hadoop.hbase.metrics.impl.MetricRegistriesImpl

18/05/04 04:05:54 ERROR TableInputFormat: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.initialize(TableInputFormat.java:202)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:259)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:256)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:125)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2094)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1158)
    at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
    at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
    at hbascan.main(hbascan.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 24 more
Caused by: java.lang.RuntimeException: Could not create interface org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSource Is the hadoop compatibility jar on the classpath?
    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:75)
    at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeper.<init>(MetricsZooKeeper.java:38)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:130)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:143)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:181)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:155)
    at org.apache.hadoop.hbase.client.ZooKeeperKeepAliveConnection.<init>(ZooKeeperKeepAliveConnection.java:43)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveZo
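The IllegalStateException in the first trace is HBase reporting that TableInputFormatBase.getTable() was reached before the input format's HBase connection had been set up, and its message suggests calling initializeTable from a constructor or from initialize. Purely as a hedged sketch of that suggestion (the class name MyTableInputFormat, the hard-coded "mytable", and the empty Scan are illustrative assumptions, not code from this post):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormatBase;
import org.apache.hadoop.mapreduce.JobContext;

// Hypothetical input format that sets up the HBase table itself, following the
// IllegalStateException's advice to call initializeTable from the constructor
// or the initialize method. Names here are illustrative, not from the post.
public class MyTableInputFormat extends TableInputFormatBase {

    @Override
    protected void initialize(JobContext context) throws IOException {
        Configuration conf = HBaseConfiguration.create(context.getConfiguration());
        // "mytable" mirrors the table name used in the question's snippet.
        initializeTable(ConnectionFactory.createConnection(conf),
                TableName.valueOf("mytable"));
        // Full-table scan; the start/stop rows could be set here instead.
        setScan(new Scan());
    }
}

Such a class would be passed to newAPIHadoopRDD in place of TableInputFormat.class. The second trace, however, shows why the stock TableInputFormat.initialize() failed in the first place: the RuntimeException asks whether the Hadoop compatibility jar is on the classpath, which points at the hbase-hadoop-compat (and, on Hadoop 2, hbase-hadoop2-compat) artifacts missing from the driver's classpath; that would need to be fixed regardless of which input format is used.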