Apache Spark: Connecting to MSSQL from PySpark

I am trying to connect to an MSSQL database from PySpark using spark.read.jdbc.

import os
from pyspark.sql import *
from pyspark.sql.functions import *
from pyspark import SparkContext
from pyspark.sql.session import SparkSession
sc = SparkContext('xx')
spark = SparkSession(sc)

spark.read.jdbc('DESKTOP-XXXX\SQLEXPRESS',
                """(select COL1, COL2 from tbl1 WHERE COL1 = 2) """,
                properties={'user': sa, 'password': 12345, 'driver': xxxx})

I don't know which arguments I should pass for sc = SparkContext('xx') and for 'driver': xxxx.

Replace serveraddress with your database address:

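from pyspark import SparkContext
from pyspark.sql.session import SparkSession

sc = SparkContext()
spark = SparkSession(sc)
# Configure a JDBC reader; serveraddress is a placeholder for your host
spark.read \
     .format('jdbc') \
     .option('url', 'jdbc:sqlserver://serveraddress:1433') \
     .option('user', 'sa') \
     .option('password', '12345') \
     .option('dbtable', '(select COL1, COL2 from tbl1 WHERE COL1 = 2)')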
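
As a rough sketch of how the two unknowns in the question might be filled in: the first positional argument of SparkContext is the master URL (for example 'local[*]' for a local run), and the 'driver' value is the class name of the JDBC driver, which for Microsoft's driver is com.microsoft.sqlserver.jdbc.SQLServerDriver. The helper names below (build_sqlserver_url, build_jdbc_options, read_query), the database name "mydb" and the alias "q" are assumptions made for illustration and are not part of the original answer. The alias is added because SQL Server expects a derived table to be named.

```python
def build_sqlserver_url(host, port=1433, database=None, instance=None):
    """Build a jdbc:sqlserver:// URL; all arguments are placeholders."""
    # A named instance is written as host\instance; otherwise use host:port.
    server = f"{host}\\{instance}" if instance else f"{host}:{port}"
    url = f"jdbc:sqlserver://{server}"
    if database:
        url += f";databaseName={database}"
    return url


def build_jdbc_options(url, query, user, password):
    """Collect the reader options; every value is handed to Spark as a string."""
    return {
        "url": url,
        # SQL Server expects a derived table to carry an alias ("AS q" here).
        "dbtable": f"({query}) AS q",
        "user": user,
        "password": str(password),
        # Class name of Microsoft's JDBC driver; its jar must be on Spark's classpath.
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def read_query(options, master="local[*]"):
    """Start (or reuse) a session and load the query result as a DataFrame."""
    from pyspark.sql import SparkSession  # imported lazily; requires pyspark

    spark = SparkSession.builder.master(master).appName("mssql-read").getOrCreate()
    return spark.read.format("jdbc").options(**options).load()


url = build_sqlserver_url("serveraddress", database="mydb")  # "mydb" is hypothetical
opts = build_jdbc_options(url, "select COL1, COL2 from tbl1 WHERE COL1 = 2", "sa", 12345)
print(opts["url"])  # jdbc:sqlserver://serveraddress:1433;databaseName=mydb
# df = read_query(opts)  # needs a reachable server and the driver jar
```

For a named instance such as the one in the question, build_sqlserver_url("DESKTOP-XXXX", instance="SQLEXPRESS") would produce a URL of the form jdbc:sqlserver://DESKTOP-XXXX\SQLEXPRESS.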

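The code works, but then when I load the df I get:

  File ".\spark\python\pyspark\sql\readwriter.py", line 172, in load
    return self._df(self._jreader.load())
  File ".\spark\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1256, in __call__
  File ".\spark\python\pyspark\sql\utils.py", line 63, in deco
    return f(*a, **kw)
  File ".\spark\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o42.load.
: java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(Unknown Source)

This is because the SQL Server JDBC driver is not in your spark/jars folder. Please see my related answer.

I saw your answer. When I download the driver, the rar archive contains many files. Should I put only the 3 .jar files into the spark/jars folder?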
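When you download the driver, the archive contains multiple jar files. The name of each jar file indicates the Java version it supports.
Choose the jar file that matches the Java version on your system and put only that one into the spark/jars folder.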
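
The selection rule above could be sketched roughly as follows, assuming the jars follow a naming pattern ending in .jre<N>.jar, where N is the Java level the jar targets. The function name pick_driver_jar and the file names in the list are hypothetical and serve only as an illustration. Check the actual names in your download before relying on the pattern.

```python
import re


def pick_driver_jar(jar_names, java_major):
    """Return the jar whose jre<N> suffix is the highest N not above java_major."""
    best_name, best_level = None, -1
    for name in jar_names:
        match = re.search(r"\.jre(\d+)\.jar$", name)
        if not match:
            continue  # not named with a jre<N> suffix; skip it
        level = int(match.group(1))
        # Keep the closest level that the installed Java can still run.
        if best_level < level <= java_major:
            best_name, best_level = name, level
    return best_name


# Hypothetical archive contents, used only for illustration.
jars = [
    "mssql-jdbc-7.4.1.jre8.jar",
    "mssql-jdbc-7.4.1.jre11.jar",
    "mssql-jdbc-7.4.1.jre12.jar",
]
print(pick_driver_jar(jars, java_major=8))  # mssql-jdbc-7.4.1.jre8.jar
```

The chosen file is the one to copy into spark/jars. As an alternative to copying, Spark also accepts the jar path through the spark.jars configuration setting or the --jars option of spark-submit.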