
Problem connecting PySpark to an MS SQL database


I am trying to connect to an MS SQL database using PySpark:

jdbcDF = spark.read.format("jdbc") \
    .option("url", "jdbc:sqlserver://localhost:1433;databaseName=xxx") \
    .option("dbtable", "xxx") \
    .option("user", "xxx") \
    .option("password", "xxx") \
    .load()

But I get the following error:

Error occurring post execution is: Py4JJavaError: An error occurred while calling o148.load.
: java.sql.SQLException: No suitable driver

How can I fix this?

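You need to copy the driver jar into the folder where your Python script lives, and you have to point Spark at the driver with the .set option, like this:

.set("spark.driver.extraClassPath", "mssql-jdbc-7.4.1.jre8.jar")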

The complete code for creating the Spark context, connecting to MS SQL, and selecting one table from MS SQL looks like this:

from pyspark import SparkContext, SparkConf, SQLContext

appName = "PySpark SQL Server Example - via JDBC"
master = "local"

# Put the SQL Server JDBC driver jar (kept next to this script) on the driver classpath
conf = SparkConf() \
    .setAppName(appName) \
    .setMaster(master) \
    .set("spark.driver.extraClassPath", "mssql-jdbc-7.4.1.jre8.jar")
sc = SparkContext.getOrCreate(conf=conf)
sqlContext = SQLContext(sc)
spark = sqlContext.sparkSession

# Load one table from SQL Server over JDBC
jdbcDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://188.188.188.188:10004;databaseName=dbname") \
    .option("dbtable", "dbo.tablename") \
    .option("user", "username") \
    .option("password", "password") \
    .load()

You can follow the tutorial to set up the connection to MS SQL correctly.
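
A relative jar name in spark.driver.extraClassPath is normally resolved against the working directory the driver JVM starts in, which is why the answer asks for the jar to sit next to the script. The following is a minimal sketch of a check that turns the name into an absolute path and fails early if the file is missing. `resolve_driver_jar` is a hypothetical helper written for illustration, not part of PySpark; the jar name is the one used in the answer.

```python
from pathlib import Path


def resolve_driver_jar(jar_name, base_dir="."):
    """Return the absolute path of the JDBC driver jar, or raise if it is missing.

    Hypothetical helper for illustration; not part of PySpark.
    """
    jar_path = (Path(base_dir) / jar_name).resolve()
    if not jar_path.is_file():
        raise FileNotFoundError(f"JDBC driver jar not found: {jar_path}")
    return str(jar_path)


# Usage (assumes the jar was copied next to the script):
# conf.set("spark.driver.extraClassPath",
#          resolve_driver_jar("mssql-jdbc-7.4.1.jre8.jar"))
```

Failing with a clear FileNotFoundError here is usually easier to diagnose than the later "No suitable driver" error raised from the JVM.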

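Hi, could you try adding the driver option to the query? For example:
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
Also, do you have the driver on your path? If not, download it from here and add it as a jar.

Thanks @mkaran, one small question: should this jar file be placed on the server I am accessing? :)

The jar should be accessible from all nodes. I think if you put it on the master node and use
--jars path/to/jar/actual.jar
then the jar will be copied to all nodes, so it should work. You're welcome, and let me know if everything works :)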