Apache Spark: read from HDFS and write to Oracle 12
Hi, I'm trying to read data from HDFS with pyspark and write it to Oracle, but I'm getting an error. I've attached the code I'm using and the error I get: … The error shown is:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/spark/python/pyspark/sql/readwriter.py", line 530, in jdbc
self._jwrite.mode(mode).jdbc(url, table, jprop)
File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
File "/usr/lib/spark/python/pyspark/sql/utils.py", line 45, in deco
return f(*a, **kw)
File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o66.jdbc.
: java.sql.SQLException: Invalid Oracle URL specified
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:453)
at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:61)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$2.apply(JdbcUtils.scala:52)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:278)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:748)
PS: I'm using Spark 1.6.0.

The URL should be specified in "service name" format, i.e.:

jdbc:oracle:thin:@//myhost:1521/orcl
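As a minimal sketch of the fix (hostname, port, service name, table, and credentials below are all placeholders, not values from the question), a small helper builds the service-format URL, followed by roughly how the write would look with Spark 1.6's `DataFrameWriter.jdbc`:

```python
def oracle_service_url(host, port, service):
    # "Service name" form expected by the Oracle thin driver: @//host:port/service.
    # The older SID form (@host:port:sid) uses colons and no slashes; mixing the
    # two up is a common cause of "Invalid Oracle URL specified".
    return "jdbc:oracle:thin:@//{}:{}/{}".format(host, port, service)

# Placeholder connection details -- substitute your own.
url = oracle_service_url("myhost", 1521, "orcl")
print(url)  # jdbc:oracle:thin:@//myhost:1521/orcl

# With a DataFrame `df` already read from HDFS, the write would then be
# something like (driver class and table name are illustrative):
#
# df.write.jdbc(url, "MY_TABLE", mode="append",
#               properties={"user": "scott",
#                           "password": "tiger",
#                           "driver": "oracle.jdbc.driver.OracleDriver"})
```

The Oracle JDBC jar (e.g. ojdbc7.jar for 12c) still has to be on the Spark driver and executor classpaths, e.g. via `--jars` or `--driver-class-path` on `spark-submit`.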