How to fetch data from HBase via DataFrame without using JavaRDD
How can I fetch data from HBase using a DataFrame (Spark SQL) instead of a JavaRDD? Code:-
SparkConf sconf = new SparkConf().setMaster("local").setAppName("test");
Configuration conf = HBaseConfiguration.create();
JavaSparkContext jsc = new JavaSparkContext(sconf);
try {
    HBaseAdmin.checkHBaseAvailable(conf);
    System.out.println("HBase is running");
} catch (Exception e) {
    System.out.println("HBase is not running");
    e.printStackTrace();
}
SQLContext sqlContext = new SQLContext(jsc);
String sqlMapping = "row String :row, city String r:city";
HashMap<String, String> map = new HashMap<String, String>();
map.put("hbase.columns.mapping", sqlMapping);
map.put("hbase.table", "emp1");
DataFrame dataFrame1 = sqlContext.read().format("org.apache.hadoop.hbase.spark").options(map).load();
Exception:-
Exception in thread "main" java.lang.IllegalArgumentException: Invalid value for hbase.columns.mapping 'row String :row, city String r:city'
    at org.apache.hadoop.hbase.spark.DefaultSource.generateSchemaMappingMap(DefaultSource.scala:119)
    at org.apache.hadoop.hbase.spark.DefaultSource.createRelation(DefaultSource.scala:79)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
    at dataframe.ParquetExample.main(ParquetExample.java:94)
Caused by: java.lang.IllegalArgumentException: Unsupported column type :String
I have solved the "Unsupported column type :String" exception, but now I'm getting another issue:
Exception in thread "main" java.lang.NullPointerException
at org.apache.hadoop.hbase.spark.HBaseRelation.<init>(DefaultSource.scala:175)
at org.apache.hadoop.hbase.spark.DefaultSource.createRelation(DefaultSource.scala:78)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1140)
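Two things are worth checking here; both are hedged guesses based on how the hbase-spark connector of this era behaved, not confirmed fixes. First, the connector parses the types in hbase.columns.mapping case-sensitively, so `String` may need to be written as uppercase `STRING`, with the row key addressed via the special `:key` qualifier. Second, the NullPointerException at HBaseRelation.&lt;init&gt; (DefaultSource.scala:175) typically occurs when no HBaseContext has been constructed before `load()` is called, since the DefaultSource looks up a cached context internally. A minimal sketch of the corrected option setup follows; the names `buildOptions` and `HBaseDataFrameOptions` are mine, and the HBaseContext step is shown only as a comment because it needs a live Spark/HBase classpath:

```java
import java.util.HashMap;
import java.util.Map;

public class HBaseDataFrameOptions {

    // Builds the options for sqlContext.read().format("org.apache.hadoop.hbase.spark").
    // Assumption: column types must be uppercase (STRING) and the row key
    // is mapped with the ":key" qualifier rather than ":row".
    public static Map<String, String> buildOptions() {
        String sqlMapping = "row STRING :key, city STRING r:city";
        HashMap<String, String> options = new HashMap<String, String>();
        options.put("hbase.columns.mapping", sqlMapping);
        options.put("hbase.table", "emp1");
        return options;
    }

    public static void main(String[] args) {
        // Assumed fix for the NPE at DefaultSource.scala:175 -- construct the
        // context once before load() so the connector's cached context is set
        // (requires hbase-spark on the classpath, so shown here as a comment):
        //   new JavaHBaseContext(jsc, conf);
        //   DataFrame df = sqlContext.read()
        //       .format("org.apache.hadoop.hbase.spark")
        //       .options(buildOptions())
        //       .load();
        System.out.println(buildOptions().get("hbase.columns.mapping"));
        System.out.println(buildOptions().get("hbase.table"));
    }
}
```

If the mapping string is still rejected, the exact grammar the connector expects is defined in `DefaultSource.generateSchemaMappingMap`, which is the frame named in the first stack trace above.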