A schema mismatch detected when writing to the Delta table from Scala - Azure Databricks


I am trying to load "small_radio_json.json" into a Delta Lake table. After this code runs, I will create a table from it.

I try to create the Delta table but get the error "A schema mismatch detected when writing to the Delta table." It may be related to the partitioning in
events.write.format("delta").mode("overwrite").partitionBy("artist").save("/delta/events/")

How can I fix or modify the code?

//https://docs.microsoft.com/en-us/azure/azure-databricks/databricks-extract-load-sql-data-warehouse
//https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/delta/quickstart-scala.html
// Session configuration
val appID = "123558b9-3525-4c62-8c48-D3D7E2C16A"
val secret = "123[xEPjpOIBJtBS-W9B9Zsv7h9IF:qw"
val tenantID = "12344839-0afa-4fae-a34a-326c42112bca"
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type",
  "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", appID)
spark.conf.set("fs.azure.account.oauth2.client.secret", secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint",
  "https://login.microsoftonline.com/" + tenantID + "/oauth2/token")
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")
// Account information
val storageAccountName = "mydatalake"
val fileSystemName = "fileshare1"
spark.conf.set("fs.azure.account.auth.type." + storageAccountName + ".dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type." + storageAccountName +
  ".dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id." + storageAccountName +
  ".dfs.core.windows.net", appID)
spark.conf.set("fs.azure.account.oauth2.client.secret." + storageAccountName +
  ".dfs.core.windows.net", secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint." + storageAccountName +
  ".dfs.core.windows.net", "https://login.microsoftonline.com/" + tenantID + "/oauth2/token")
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")
dbutils.fs.ls("abfss://" + fileSystemName + "@" + storageAccountName + ".dfs.core.windows.net/")
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "false")
dbutils.fs.cp("file:///tmp/small_radio_json.json", "abfss://" + fileSystemName + "@" +
  storageAccountName + ".dfs.core.windows.net/")
val df = spark.read.json("abfss://" + fileSystemName + "@" + storageAccountName +
  ".dfs.core.windows.net/small_radio_json.json")
// df.show()
import org.apache.spark.sql._
import org.apache.spark.sql.functions._
val events = df
display(events)
import org.apache.spark.sql.SaveMode
events.write.format("delta").mode("overwrite").partitionBy("artist").save("/delta/events/")
val events_delta = spark.read.format("delta").load("/delta/events/")
display(events_delta)
Exception:

    org.apache.spark.sql.AnalysisException: A schema mismatch detected when writing to the Delta table.
    To enable schema migration, please set:
    '.option("mergeSchema", "true")'.

    Table schema:
    root
    -- action: string (nullable = true)
    -- date: string (nullable = true)


    Data schema:
    root
    -- artist: string (nullable = true)
    -- auth: string (nullable = true)
    -- firstName: string (nullable = true)
    -- gender: string (nullable = true)

Most likely /delta/events/ does not contain any data, so loading data from that same directory can raise this kind of exception.
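One way to verify what is actually at that path is to list it and, if a Delta table exists there, print its current schema so it can be compared against the DataFrame you are about to write. This is a minimal sketch, assuming the notebook session from the question (with spark and dbutils in scope) and the /delta/events/ path:

```scala
// List whatever already exists at the target path.
dbutils.fs.ls("/delta/events/").foreach(f => println(f.path))

// If a Delta table is present, print its stored schema so it can be
// compared column-by-column with the DataFrame being written.
val existing = spark.read.format("delta").load("/delta/events/")
existing.printSchema()
```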

You get the schema mismatch error because the columns in the table differ from the columns in the DataFrame.

According to the error snapshot pasted in the question, the table schema has only two columns, while the DataFrame schema has four:

Table schema:
root
-- action: string (nullable = true)
-- date: string (nullable = true)


Data schema:
root
-- artist: string (nullable = true)
-- auth: string (nullable = true)
-- firstName: string (nullable = true)
-- gender: string (nullable = true)
Now you have two options:

  • If you want to keep only the schema present in the DataFrame, you can
    set the overwriteSchema option to true.
  • If you want to keep all the columns, you can set the mergeSchema option
    to true. In that case the schemas are merged, and the table will end up
    with six columns: the two existing table columns plus the four new
    columns from the DataFrame.
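A minimal sketch of the two options, assuming the events DataFrame and the /delta/events/ path from the question:

```scala
// Option 1: replace the table's schema with the DataFrame's schema.
// overwriteSchema only takes effect together with mode("overwrite").
events.write
  .format("delta")
  .mode("overwrite")
  .option("overwriteSchema", "true")
  .partitionBy("artist")
  .save("/delta/events/")

// Option 2: keep the existing columns and add the new ones from the
// DataFrame. mergeSchema adds any columns the table does not yet have.
// partitionBy is omitted here: on an existing table the partitioning
// must match what the table was created with.
events.write
  .format("delta")
  .mode("append")
  .option("mergeSchema", "true")
  .save("/delta/events/")
```

Whether append or overwrite is right for option 2 depends on whether the rows already in the table should be kept alongside the new ones.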