Regex: How to remove quotes from the front and end of a string in Scala
Tags: regex, string, scala, apache-spark, trim
I have a DataFrame in which some of the strings have double quotes ("") at the front and at the end.
Expected output:
+-------------------+
|data               |
+-------------------+
|john belushi       |
|john mnunjnj       |
|nmnj tyhng         |
|John b-e_lushi     |
|john belushi's book|
+-------------------+
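I am trying to remove the double quotes from these strings. Can someone tell me how to do this in Scala?
Python provides ltrim and rtrim. Is there something similar in Scala?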
myString.substring(1,myString.length()-1)
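This removes the double quotes by dropping the first and last characters, so it assumes every string really is wrapped in quotes.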
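Because `substring(1, length - 1)` drops the first and last characters unconditionally, it mangles unquoted values and throws on strings shorter than two characters. A minimal sketch of a more defensive variant, using Scala's `stripPrefix` and `stripSuffix` (the closest built-in counterpart to one-sided trimming), might look like this. The names `QuoteTrim` and `stripOuterQuotes` are hypothetical and only serve the illustration.

```scala
object QuoteTrim {
  // Removes one leading and one trailing double quote, if present.
  // Strings without surrounding quotes are returned unchanged, and
  // empty or one-character strings do not throw.
  def stripOuterQuotes(s: String): String =
    s.stripPrefix("\"").stripSuffix("\"")
}
```

Quotes inside the string are left alone: only a quote at the very start or very end is removed.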
import spark.implicits._
// Build a sample DataFrame whose single column ("value") holds quoted strings
val list = List("\"hi\"", "\"I am learning scala\"", "\"pls\"", "\"help\"").toDF()
list.show(false)
// Drop the first and last character of every value
val finaldf = list.map { row =>
  val quoted = row.getAs[String]("value")
  quoted.substring(1, quoted.length() - 1)
}
finaldf.show(false)
Result:
+--------------------+
| value|
+--------------------+
| "hi"|
|"I am learning sc...|
| "pls"|
| "help"|
+--------------------+
+-------------------+
| value|
+-------------------+
| hi|
|I am learning scala|
| pls|
| help|
+-------------------+
Use the expr, substring, and length functions, taking the substring that starts at position 2 and has length length(data) - 2:
val df_d = List("\"john belushi\"", "\"John b-e_lushi\"", "\"john belushi's book\"")
.toDF("data")
Input:
+---------------------+
|data |
+---------------------+
|"john belushi" |
|"John b-e_lushi" |
|"john belushi's book"|
+---------------------+
import org.apache.spark.sql.functions.expr
// SQL substring is 1-based: start at position 2 and keep length(data) - 2 characters
df_d.withColumn("data", expr("substring(data, 2, length(data) - 2)"))
  .show(false)
Output:
+-------------------+
|data |
+-------------------+
|john belushi |
|John b-e_lushi |
|john belushi's book|
+-------------------+
Try:
"\"hello\" world".replaceAll("\"", "")
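Or what is wrong with "\"hello\" world".filterNot(_ == '"')? And what does this have to do with trim?
It seems you are removing quotes everywhere, including in the middle of the string.
From your edit it is still unclear why the solution in the first comment does not work. Do you want to keep the quotes in the middle of the string?
@ Yes, I only want to remove the quotes at the front and end of the string.
Does this answer your question? Just swap the quote character in the regex for a double quote: "\"a\"b\"c\"".replaceAll("(^\")|(\"$)", "")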
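The difference between the two suggestions in the comments can be sketched in plain Scala: an unanchored pattern removes every double quote, while a pattern anchored with `^` and `$` removes only a quote at the very start or very end. The names `QuoteRegex`, `removeOuterQuotes`, and `removeAllQuotes` are hypothetical. The same anchored pattern could presumably be passed to Spark's `regexp_replace` column function, but that is an assumption and is not shown here.

```scala
object QuoteRegex {
  // Anchored pattern: a quote at the very start (^") or at the very end ("$).
  private val OuterQuotes = "^\"|\"$"

  // Removes only the leading and trailing double quote, if present.
  def removeOuterQuotes(s: String): String =
    s.replaceAll(OuterQuotes, "")

  // Removes every double quote, including those in the middle.
  def removeAllQuotes(s: String): String =
    s.replaceAll("\"", "")
}
```

For the input "a"b"c" (with its quotes), `removeOuterQuotes` keeps the two inner quotes, whereas `removeAllQuotes` strips all four.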