
Scala: How to handle different types in a column using a DataFrame


I have a table like the one below:

+----------+-----+
|       tmp|index|
+----------+-----+
| [user1,0]|    0|
| [user1,3]|    1|
|[user1,15]|    2|

I want to split the tmp column into two columns. tmp is of String type and index is an Int.

I wrote a UDF as follows:

val getUser_id = udf( ( s : (String, Int)) => {
  s._1
})
newSession.withColumn( "user_id", getUser_id($"tmp"))
The result is:

Failed to execute user defined function (anonfun$4: (struct) => string)


Could you help?

The UDF argument should be a Row, not a tuple, because Spark passes a struct column to a Scala UDF as a Row:

import org.apache.spark.sql.Row

val getUser_id = udf( ( s: Row) => {
  s.getString(0)
})
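
The same pattern should extend to the second field by reading it with getInt(1). The following is only a minimal sketch, assuming tmp is a struct of (String, Int) as the error message suggests. The names RowUdfSketch, getValue, sample, withFields and the output column value are made up for illustration and do not come from the question.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object RowUdfSketch {
  // A struct column reaches a Scala UDF as a Row, so fields are read by position.
  val getUserId = udf((s: Row) => s.getString(0))
  val getValue  = udf((s: Row) => s.getInt(1)) // hypothetical helper for the Int field

  // Hypothetical stand-in for the table in the question:
  // a column of Scala tuples is stored as a struct with fields _1 and _2.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq((("user1", 0), 0), (("user1", 3), 1), (("user1", 15), 2)).toDF("tmp", "index")
  }

  // Adds one column per struct field, both computed through Row-based UDFs.
  def withFields(df: DataFrame): DataFrame =
    df.withColumn("user_id", getUserId(col("tmp")))
      .withColumn("value", getValue(col("tmp")))
}
```

Reading fields by position ties the UDF to the struct layout, so a change in field order would silently break it.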

But in this case you don't need a UDF at all; you should simply select the struct field directly:

newSession.withColumn( "user_id", $"tmp._1")
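
To split tmp into two columns without any UDF, both struct fields can be selected by name. This is a sketch under the assumption that tmp was built from Scala tuples, so its fields are called _1 and _2. SplitStructColumn, sample, splitTmp and the column name value are hypothetical names chosen for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SplitStructColumn {
  // Hypothetical stand-in for the table in the question.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq((("user1", 0), 0), (("user1", 3), 1), (("user1", 15), 2)).toDF("tmp", "index")
  }

  // Selects each struct field by its dotted path and gives it a readable name.
  def splitTmp(df: DataFrame): DataFrame =
    df.select(
      col("index"),
      col("tmp._1").as("user_id"),
      col("tmp._2").as("value")
    )
}
```

Selecting the fields directly keeps the work inside Spark's built-in expressions, which the optimizer can see through, whereas a UDF is opaque to it.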