
sparkR: How to create a dummy column from a character vector?


Consider the following simple example:

df <- data.frame(id=c(1:4), climate=c("cold_rainy","coldSunny","rainywarm","sunny_warm"))
head(df)

       id    climate
       1     cold_rainy
       2     coldSunny
       3     rainywarm
       4     sunny_warm

How can this (creating a dummy column from the character vector) be done on a SparkDataFrame in sparkR?

You can first convert the string values to lowercase and then use rlike() to look for "sunny" in $climate. The Boolean output is then converted to type integer with cast():

ddf <- createDataFrame(sqlContext, df)  # Data (Spark 1.x signature; in Spark 2.x+ use createDataFrame(df))
ddf$climate <- lower(ddf$climate) # Convert to lowercase
ddf$sunny <- cast(rlike(ddf$climate, "sunny"), "integer") # Create integer column

> head(ddf)
  id    climate sunny
1  1 cold_rainy     0
2  2  coldsunny     1
3  3  rainywarm     0
4  4 sunny_warm     1
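
For comparison, the same lowercase-then-match logic can be checked on the local data.frame before moving to Spark. This is only a minimal sketch in base R: it assumes the local operation being ported is a grepl()-based match (suggested by the question's grepl tag, not shown in the question itself), and it sets stringsAsFactors = FALSE so that climate is a plain character vector.

```r
# Local (non-Spark) data frame from the question.
# stringsAsFactors = FALSE is an assumption, made so climate stays character.
df <- data.frame(id = 1:4,
                 climate = c("cold_rainy", "coldSunny", "rainywarm", "sunny_warm"),
                 stringsAsFactors = FALSE)

# Lowercase first so that "coldSunny" also matches the pattern
df$climate <- tolower(df$climate)

# grepl() returns TRUE/FALSE per row; as.integer() turns that into 1/0
df$sunny <- as.integer(grepl("sunny", df$climate))

print(df)
```

Here tolower(), grepl() and as.integer() play the roles that lower(), rlike() and cast(..., "integer") play on the SparkDataFrame, so the sunny column should agree with the Spark result above.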

A perfect solution for me, very nice!