Writing a Spark DataFrame to Redshift: saving StructField(user_agent,ArrayType(StringType,true),true)


I have a DataFrame whose schema includes an Array[String] field:

 StructField("user_agent", ArrayType apply (StringType, true))

   ...
   myDataframe.printSchema
(an excerpt)
 |-- user_agent: array (nullable = true)
 |    |-- element: string (containsNull = true)
I am using the com.databricks.spark.redshift package to write to Redshift, and I get this error:

java.lang.IllegalArgumentException: Don't know how to save StructField(user_agent,ArrayType(StringType,true),true) to JDBC
        at com.databricks.spark.redshift.JDBCWrapper$$anonfun$schemaString$1.apply(RedshiftJDBCWrapper.scala:253)
        at com.databricks.spark.redshift.JDBCWrapper$$anonfun$schemaString$1.apply(RedshiftJDBCWrapper.scala:233)

Is it possible to write this data type to Redshift with this package?

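spark-redshift supports only the following data types. Anything else, including ArrayType, falls through to the last case and throws the IllegalArgumentException shown in the question: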

field.dataType match {
  case IntegerType => "INTEGER"
  case LongType => "BIGINT"
  case DoubleType => "DOUBLE PRECISION"
  case FloatType => "REAL"
  case ShortType => "INTEGER"
  case ByteType => "SMALLINT" // Redshift does not support the BYTE type.
  case BooleanType => "BOOLEAN"
  case StringType =>
    if (field.metadata.contains("maxlength")) {
      s"VARCHAR(${field.metadata.getLong("maxlength")})"
    } else {
      "TEXT"
    }
  case TimestampType => "TIMESTAMP"
  case DateType => "DATE"
  case t: DecimalType => s"DECIMAL(${t.precision},${t.scale})"
  case _ => throw new IllegalArgumentException(s"Don't know how to save $field to JDBC")
}
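
To see which columns would hit that last case before attempting the write, the same list of types can be checked against a schema up front. This is only a sketch: `RedshiftTypeCheck` and `unsupportedFields` are hypothetical names and not part of spark-redshift, and the list of accepted types is copied from the match above, so it may not match other versions of the package.

```scala
import org.apache.spark.sql.types._

object RedshiftTypeCheck {
  // Hypothetical helper (not a spark-redshift API): returns the fields whose
  // type is not covered by the mapping quoted above.
  def unsupportedFields(schema: StructType): Seq[StructField] =
    schema.fields.toSeq.filter { field =>
      field.dataType match {
        case IntegerType | LongType | DoubleType | FloatType | ShortType |
             ByteType | BooleanType | StringType | TimestampType | DateType => false
        case _: DecimalType => false
        case _ => true // ArrayType, MapType, StructType, BinaryType, ...
      }
    }
}
```

Calling `RedshiftTypeCheck.unsupportedFields(myDataframe.schema)` on the DataFrame from the question should report the `user_agent` field.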

I ran into the same problem and ended up converting the array to a string.
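
One possible way to do that conversion is Spark's built-in `concat_ws`, which joins the elements of an array<string> column into a single delimited string. The sketch below is an assumption about how this could be done, not the code the answerer used: `ArrayColumns` and `flattenStringArrays` are hypothetical names, and the comma separator is an arbitrary choice.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, concat_ws}
import org.apache.spark.sql.types.{ArrayType, StringType}

object ArrayColumns {
  // Hypothetical helper: replaces every array<string> column with a
  // delimited string column of the same name; other columns are untouched.
  def flattenStringArrays(df: DataFrame, sep: String = ","): DataFrame =
    df.schema.fields.foldLeft(df) { (acc, field) =>
      field.dataType match {
        case ArrayType(StringType, _) =>
          // withColumn with an existing name overwrites that column in place.
          acc.withColumn(field.name, concat_ws(sep, col(field.name)))
        case _ => acc
      }
    }
}
```

With this, `ArrayColumns.flattenStringArrays(myDataframe)` yields a DataFrame whose `user_agent` column is a StringType, which the mapping above turns into TEXT (or VARCHAR when `maxlength` metadata is set). The join is lossy if an element can itself contain the separator, so pick a delimiter that does not occur in the data.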