R: convert commas to points and to numeric, but only for a certain number of variables


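So I have a df that looks like this. The numeric values use commas instead of points as the decimal separator, and they are classed as character.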

var0 <- c("There, are commas", "in the text, string", "as,well", "how, can", "i", "fix, this", "thank you")
var1 <- c("50,0", "72,0", "960,0", "1.920,0", "50,0", "50,0", "960,0")
var2 <- c("40,0", "742,0", "9460,0", "1.920,0", "50,0", "50,0", "960,0")
var3<- c("40,0", "72,0", "90,0", "1,30", "50,0", "50,0", "960,0")
...
var96 <- c("40,0", "742,0", "9460,0", "1.920,0", "50,0", "50,0", "960,0")

df <- data.frame(cbind(var0, var1, var2, var3))

The tidyverse package is well suited to this kind of thing.

library(tidyverse)
df <- df %>% 
      # First, remove the points in your numbers because otherwise you'll end up
      # with, e.g., "1.920.0". Note that this runs on every column, so points
      # in the text column are removed as well.
      mutate_all(.funs = function(x) gsub("\\.", "", x)) %>% 
      # Next, replace all the commas with points and convert to numeric. Only do
      # this for the columns that don't contain text, though.
      mutate_at(.vars = vars(matches("var[1-3]")), 
                .funs = function(x) as.numeric(gsub(",", ".", x)))

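Note that in the mutate_at call I am assuming that only column "var0" contains text you want to keep, and I am converting everything that matches the regular expression "var[1-3]" to numeric data with points instead of commas. You will need to adapt the regular expression to your own situation.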
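
As one way to adapt the pattern, the sketch below uses base R to select every column named "var" followed by a number other than 0, which would cover var1 through var96. The small data frame, the helper name to_numeric, and the regular expression are assumptions made for illustration; they are not part of the original answer.

```r
# Toy data in the same shape as the question (assumed for illustration)
df <- data.frame(
  var0  = c("There, are commas", "thank you"),
  var1  = c("50,0", "1.920,0"),
  var96 = c("1,30", "960,0"),
  stringsAsFactors = FALSE
)

# Anchored pattern: "var" followed by a number that does not start with 0,
# so var0 is excluded while var1 ... var96 are included
num_cols <- grep("^var[1-9][0-9]*$", names(df), value = TRUE)

# Hypothetical helper: drop thousands points, then turn the decimal comma
# into a point and convert to numeric
to_numeric <- function(x) {
  as.numeric(gsub(",", ".", gsub(".", "", x, fixed = TRUE), fixed = TRUE))
}

df[num_cols] <- lapply(df[num_cols], to_numeric)
str(df)
```

Because only the selected columns are touched here, the text in var0 is left exactly as it was, including any points it contains.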

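Here is a function that replaces the commas with decimal points and removes all other points, but only if all characters in the column are digits 0-9, points, or commas.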

commas2dots <- function(x){
  if(any(grepl("[^\\.,[:digit:]]", x))){
    x
  } else {
    y <- gsub("\\.", "", x)
    tc <- textConnection(y)
    on.exit(close(tc))
    scan(tc, dec = ",", quiet = TRUE)
  }
}

lapply(df, commas2dots)
#$var0
#[1] "There, are commas"   "in the text, string"
#[3] "as,well"             "how, can"           
#[5] "i"                   "fix, this"          
#[7] "thank you"          
#
#$var1
#[1]   50   72  960 1920   50   50  960
#
#$var2
#[1]   40  742 9460 1920   50   50  960
#
#$var3
#[1]  40.0  72.0  90.0   1.3  50.0  50.0 960.0
#
#$var96
#[1]   40  742 9460 1920   50   50  960
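
Since lapply returns a plain list, a small follow-up sketch may help if the result should stay a data frame. It assigns the converted columns back with df[] <- ..., which keeps the data.frame structure and column names. The toy data is an assumption for illustration, and commas2dots is repeated only so the snippet runs on its own.

```r
# Same function as above, repeated so this snippet is self-contained
commas2dots <- function(x){
  if(any(grepl("[^\\.,[:digit:]]", x))){
    x
  } else {
    y <- gsub("\\.", "", x)
    tc <- textConnection(y)
    on.exit(close(tc))
    scan(tc, dec = ",", quiet = TRUE)
  }
}

# Toy data (assumed for illustration)
df <- data.frame(
  var0 = c("There, are commas", "thank you"),
  var1 = c("50,0", "1.920,0"),
  var3 = c("40,0", "1,30"),
  stringsAsFactors = FALSE
)

# df[] <- keeps df a data.frame while replacing each column in place
df[] <- lapply(df, commas2dots)
str(df)
```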

You don't need cbind when creating the data.frame.
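
A short sketch of why this matters: cbind first builds a matrix, and a matrix can hold only one type, so mixed columns are coerced to character before data.frame ever sees them. The vectors x and y below are made up for illustration.

```r
x <- c("a", "b")
y <- c(1.5, 2.5)

# cbind() builds a character matrix first, so y loses its numeric type
via_cbind <- data.frame(cbind(x, y))

# Passing the vectors directly lets each column keep its own type
direct <- data.frame(x, y)

sapply(via_cbind, class)
sapply(direct, class)
```

In the question all columns are character to begin with, so the result happens to be the same there, but passing the vectors directly is the safer habit.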