Scala Spark: replacing column values when the regex pattern contains a backslash. How to handle it? (scala, apache-spark, apache-spark-sql)


DataFrame:

+-------------------+-------------------+
|               Desc|   replaced_columns|
+-------------------+-------------------+
|India is my Country|India is my Country|
| Delhi is my Nation| Delhi is my Nation|
| I Love India\Delhi| I Love India\Delhi|
|         I Love USA|         I Love USA|
|I am stay in USA\SA|I am stay in USA\SA|
+-------------------+-------------------+
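"Desc" is the original column name in the DataFrame. replaced_columns is the column produced after we apply a transformation. In the Desc column, I need to replace the value "India\Delhi" with "-". I tried the code below: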

dataDF.withColumn("replaced_columns", regexp_replace(dataDF("Desc"), "India\\Delhi", "-")).show() 
val approach3 = dataDF
   .withColumn("replaced_columns",when(col("Desc").like("%Delhi")
     , regexp_replace(col("Desc"), "\\\\", "-")).otherwise(col("Desc")))
    .show()

It does not replace the value with the "-" string. How can I do this?

I found three approaches to solve the problem above:

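// regexp_replace takes a regex: four backslashes in Scala source become two in the regex, which match one literal backslash
val approach1 = dataDF.withColumn("replaced_columns", regexp_replace(col("Desc"), "\\\\","-")).show()

// translate maps single characters and is not a regex, so "\\" here is one literal backslash; the $"..." syntax needs import spark.implicits._
val approach2 = dataDF.select($"Desc",translate($"Desc","\\","-").as("replaced_columns")).show()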
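The third one, approach3 (it appears above, alongside the question's code), is for the specific record you asked about: in the Desc column, replace the "India\Delhi" value with "-".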
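
To see why the pattern in the question never matches, it may help to look at the escaping levels outside Spark. The sketch below uses plain String.replaceAll, on the assumption that Spark's regexp_replace follows the same Java regular-expression syntax. The object and value names are illustrative only and do not come from the original post.

```scala
import java.util.regex.Pattern

object BackslashEscaping {
  // Runtime value is: I Love India\Delhi (one backslash)
  val desc: String = "I Love India\\Delhi"

  // Pattern from the question: the regex engine sees India\Delhi and reads \D
  // as "any non-digit character", so the literal text is never matched.
  val fromQuestion: String = desc.replaceAll("India\\Delhi", "-")

  // Four backslashes in source -> two in the regex -> one literal backslash.
  val escapedByHand: String = desc.replaceAll("India\\\\Delhi", "-")

  // Pattern.quote wraps the string so the whole thing is treated literally.
  val quoted: String = desc.replaceAll(Pattern.quote("India\\Delhi"), "-")

  def main(args: Array[String]): Unit = {
    println(fromQuestion)  // unchanged input
    println(escapedByHand) // I Love -
    println(quoted)        // I Love -
  }
}
```

Pattern.quote is usually the less error-prone choice when the text to replace comes from data rather than from a hand-written pattern.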


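Replacing "\\\\" with "-" would not be the correct solution. It would also affect other text strings that contain the "\" character.
Try approach 2. It works in my IDE, so I copied it as is. Not sure why it does not work for you.
For example, "having 2\3 inc." is one of the column values, and it gets changed to "having 2-3 inc.". So we can't apply this transformation.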
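
One way to address the concern about other values such as "having 2\3 inc." is to match only the literal text India\Delhi rather than every backslash. The following is a minimal sketch, not the answerer's method: the local SparkSession, the object name, and the three sample rows are assumptions made for illustration, and it relies on regexp_replace accepting a Java regex built with Pattern.quote.

```scala
import java.util.regex.Pattern
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace}

object ReplaceLiteralOnly {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the post).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("replace-literal-only")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows, including the value raised in the comments.
    val dataDF = Seq(
      "I Love India\\Delhi",
      "having 2\\3 inc.",
      "I Love USA"
    ).toDF("Desc")

    // Quote the target so the backslash is matched literally, not as \D.
    val target = Pattern.quote("India\\Delhi")

    // Only rows containing the literal India\Delhi are changed;
    // other rows with a backslash keep their original text.
    val result = dataDF.withColumn(
      "replaced_columns",
      regexp_replace(col("Desc"), target, "-")
    )

    result.show(false)
    spark.stop()
  }
}
```

Because the pattern targets one specific string, a value like "having 2\3 inc." should pass through unchanged, which is the behavior the comments ask for.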