Apache spark 为什么Spark报告;java.net.URISyntaxException:绝对URI中的相对路径;在使用数据帧时?
我正在Windows计算机上本地运行Spark。我成功地启动了spark shell,并将文本文件作为RDD读取。我还能够跟随关于这个主题的各种在线教程,并且能够在RDD上执行各种操作 然而,当我试图将RDD转换为数据帧时,我得到了一个错误。这就是我正在做的:Apache spark 为什么Spark报告;java.net.URISyntaxException:绝对URI中的相对路径;在使用数据帧时?,apache-spark,dataframe,apache-spark-sql,Apache Spark,Dataframe,Apache Spark Sql,我正在Windows计算机上本地运行Spark。我成功地启动了spark shell,并将文本文件作为RDD读取。我还能够跟随关于这个主题的各种在线教程,并且能够在RDD上执行各种操作 然而,当我试图将RDD转换为数据帧时,我得到了一个错误。这就是我正在做的: val sqlContext = new org.apache.spark.sql.SQLContext(sc) import sqlContext.implicits._ //convert rdd to df val df = rd
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
//convert rdd to df
val df = rddFile.toDF()
此代码生成一长串错误消息,这些消息似乎与以下消息有关:
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:C:/Users/spark/spark-warehouse
at org.apache.hadoop.fs.Path.initialize(Path.java:205)
at org.apache.hadoop.fs.Path.<init>(Path.java:171)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:600)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 85 more
Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:C:/Users/spark/spark-warehouse
at java.net.URI.checkPath(URI.java:1823)
at java.net.URI.<init>(URI.java:745)
at org.apache.hadoop.fs.Path.initialize(Path.java:202)
... 96 more
原因:java.lang.IllegalArgumentException:java.net.URISyntaxException:绝对URI中的相对路径:文件:C:/Users/spark/spark仓库
位于org.apache.hadoop.fs.Path.initialize(Path.java:205)
位于org.apache.hadoop.fs.Path(Path.java:171)
位于org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
位于org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
位于org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:600)
位于org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
位于org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
位于org.apache.hadoop.hive.metastore.RetryingHMSHandler(RetryingHMSHandler.java:66)
位于org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
位于org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
位于org.apache.hadoop.hive.metastore.HiveMetaStoreClient。(HiveMetaStoreClient.java:199)
位于org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient。(SessionHiveMetaStoreClient.java:74)
... 85多
原因:java.net.URISyntaxException:绝对URI中的相对路径:文件:C:/Users/spark/spark warehouse
位于java.net.URI.checkPath(URI.java:1823)
位于java.net.URI。(URI.java:745)
位于org.apache.hadoop.fs.Path.initialize(Path.java:202)
... 96多
整个堆栈跟踪如下
16/08/16 12:36:20 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/08/16 12:36:20 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:171)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:258)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:382)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:143)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:401)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:342)
at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:24)
at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:29)
at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:31)
at $line14.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
at $line14.$read$$iw$$iw$$iw$$iw.<init>(<console>:35)
at $line14.$read$$iw$$iw$$iw.<init>(<console>:37)
at $line14.$read$$iw$$iw.<init>(<console>:39)
at $line14.$read$$iw.<init>(<console>:41)
at $line14.$read.<init>(<console>:43)
at $line14.$read$.<init>(<console>:47)
at $line14.$read$.<clinit>(<console>)
at $line14.$eval$.$print$lzycompute(<console>:7)
at $line14.$eval$.$print(<console>:6)
at $line14.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:68)
at org.apache.spark.repl.Main$.main(Main.scala:51)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
... 74 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
... 80 more
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:C:/Users/spark/spark-warehouse
at org.apache.hadoop.fs.Path.initialize(Path.java:205)
at org.apache.hadoop.fs.Path.<init>(Path.java:171)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:600)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 85 more
Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:C:/Users/spark/spark-warehouse
at java.net.URI.checkPath(URI.java:1823)
at java.net.URI.<init>(URI.java:745)
at org.apache.hadoop.fs.Path.initialize(Path.java:202)
... 96 more
16/08/16 12:36:20警告对象存储:无法获取数据库默认值,返回NoSuchObjectException
16/08/16 12:36:20警告配置单元:无法访问元存储。不应在运行时访问此类。
org.apache.hadoop.hive.ql.metadata.HiveException:java.lang.RuntimeException:无法实例化org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
位于org.apache.hadoop.hive.ql.metadata.hive.getAllDatabases(hive.java:1236)
位于org.apache.hadoop.hive.ql.metadata.hive.reloadFunctions(hive.java:174)
位于org.apache.hadoop.hive.ql.metadata.hive(hive.java:166)
位于org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
位于org.apache.spark.sql.hive.client.HiveClientImpl.(HiveClientImpl.scala:171)
位于sun.reflect.NativeConstructorAccessorImpl.newInstance0(本机方法)
位于sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
在sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
位于java.lang.reflect.Constructor.newInstance(Constructor.java:423)
位于org.apache.spark.sql.hive.client.IsolatedClient.createClient(IsolatedClient.scala:258)
位于org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
位于org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
位于org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
位于org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
位于org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
位于org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
位于org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
位于org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
位于org.apache.spark.sql.hive.HiveSessionState$$anon$1。(HiveSessionState.scala:63)
位于org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
位于org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
位于org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
位于org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
位于org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:382)
位于org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:143)
位于org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:401)
位于org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:342)
在$line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw。(:24)
在$line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw。(:29)
在$line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw。(:31)
在$line14.$read$$iw$$iw$$iw$$iw$$iw$$iw。(:33)
在$line14.$read$$iw$$iw$$iw$$iw。(:35)
在$line14。$read$$iw$$iw$$iw。(:37)
在$line14。$read$$iw$$iw。(:39)
在$line14。$read$$iw。(:41)
$line14.$read.(:43)
在$line14.$read$(:47)
第14行$read$()
在$line14.$eval$.$print$lzycompute(:7)
在$line14.$eval$.$print处(:6)
在$line14.$eval.$print()处
在sun.reflect.NativeMethodAccessorImpl.invoke0(本机方法)处
位于sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
在sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)中
位于java.lang.reflect.Method.invoke(Method.java:498)
位于scala.tools.nsc.explorer.IMain$ReadEvalPrint.ca
spark-shell --conf spark.sql.warehouse.dir=file:///c:/tmp/spark-warehouse
import org.apache.spark.sql.SparkSession
SparkSession spark = SparkSession
.builder()
.config("spark.sql.warehouse.dir", "file:///c:/tmp/spark-warehouse")
.getOrCreate()
spark.sql.warehouse.dir file:///c:/tmp/spark-warehouse
System.setProperty(
"spark.sql.warehouse.dir",
s"file:///${System.getProperty("user.dir")}/spark-warehouse"
.replaceAll("\\\\", "/")
)
System.setProperty("spark.sql.warehouse.dir", "file:///C:/spark-warehouse");
System.setProperty("hadoop.home.dir", "c:/winutil/");
System.setProperty("spark.sql.warehouse.dir", "file:///C:/spark-warehouse");
val conf = new SparkConf().setAppName("test").setMaster("local[*]")
val sc = new SparkContext(conf)
val lines = sc.textFile("C:/user.txt")