R code to convert 'missing' character values to 0

Tags: r, type-conversion, dataset

I only know the basics of R and am trying to learn it in more depth. Please help me.

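This is what my data looks like, together with the expected output. Before this, I used the code below to insert missing values at random. Now I need to run 1-NN on this data, but I can't because of the non-numeric arguments present.

data5=data5 %>% select(1:5)
df.new5 <- data10[-sample(NROW(data10), NROW(data10)*(1 - 0.05),),]<-'missing'
df.new5 <- data10[sample(NROW(data10), NROW(data10)*(1 - 0.05),),]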
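
For reference, one possible way to blank out a random 5% of rows without turning numeric columns into character is to assign NA rather than the string 'missing'. This is a minimal sketch, not the asker's code: the data10 built below is a made-up stand-in, and the column names are borrowed from the sample data shown later in the thread.

```r
# Hypothetical stand-in for the asker's data10 (100 rows, 5 columns).
set.seed(42)
data10 <- data.frame(
  Beach.Name        = sample(c("CalumetBeach", "MontroseBeach"), 100, replace = TRUE),
  Water.Temperature = round(runif(100, 14, 18), 1),
  Turbidity         = round(runif(100, 1, 7), 2),
  Wave.Height       = round(runif(100, 0.1, 0.4), 3),
  Wave.Period       = sample(3:5, 100, replace = TRUE),
  stringsAsFactors  = FALSE
)

# Choose 5% of the row indices at random.
n_missing    <- round(nrow(data10) * 0.05)
missing_rows <- sample(nrow(data10), n_missing)

# Work on a copy and set the chosen rows to NA.
# NA is type-neutral, so numeric columns stay numeric.
df.new5 <- data10
df.new5[missing_rows, ] <- NA

sapply(df.new5, class)       # column classes are unchanged
colSums(is.na(df.new5))      # number of NAs per column
```

Because the missing entries are real NA values, functions such as mean(x, na.rm = TRUE) or complete.cases() can handle them directly.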

If your columns are factors, first change them to character, then replace all "missing" values with 0 and use type.convert to convert the column classes:

df[] <- lapply(df, as.character)
df[-1][df[-1] == "missing"] <- 0
df <- type.convert(df)
df

#      Beach.Name Water.Temperature Turbidity Wave.Height Wave.Period
#7   CalumetBeach              16.3      1.28       0.162           4
#72       missing               0.0      0.00       0.000           0
#18  CalumetBeach              17.0      1.82       0.194           4
#78       missing               0.0      0.00       0.000           0
#24 MontroseBeach              14.5      6.88       0.345           4
#41  CalumetBeach              15.9      1.74       0.148           4
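
However, it is usually better to keep the missing values as NA rather than 0. For example, taking the column means using NAs (run this on the original factor data, not on the zero-filled result):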

df[-1] <- lapply(df[-1], function(x) as.numeric(as.character(x)))
sapply(df[-1], mean, na.rm = TRUE)

#Water.Temperature         Turbidity       Wave.Height       Wave.Period 
#            15.93              2.93              0.21              4.00  
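
As a complement to the as.numeric(as.character(x)) approach, type.convert also accepts an na.strings argument, so the "missing" label can be mapped to NA during the conversion itself. The sketch below rebuilds the thread's sample values as factor columns so that it runs on its own; it assumes a version of R in which type.convert has a data frame method.

```r
# The sample values from this thread, rebuilt as factor columns.
df <- data.frame(
  Beach.Name        = c("CalumetBeach", "missing", "CalumetBeach",
                        "missing", "MontroseBeach", "CalumetBeach"),
  Water.Temperature = c("16.3", "missing", "17", "missing", "14.5", "15.9"),
  Turbidity         = c("1.28", "missing", "1.82", "missing", "6.88", "1.74"),
  Wave.Height       = c("0.162", "missing", "0.194", "missing", "0.345", "0.148"),
  Wave.Period       = c("4", "missing", "4", "missing", "4", "4"),
  stringsAsFactors  = TRUE
)

# Factor -> character first, so the labels (not the integer codes) are converted.
df[] <- lapply(df, as.character)

# Treat the literal string "missing" as NA while guessing column types.
# as.is = TRUE keeps Beach.Name as character instead of making it a factor.
# Note that "missing" in Beach.Name becomes NA as well.
df_na <- type.convert(df, na.strings = c("NA", "missing"), as.is = TRUE)

str(df_na)
colMeans(df_na[-1], na.rm = TRUE)
```

The numeric columns come out as numeric or integer with NA in the two affected rows, so the means are computed from the four observed values only.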

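Are you sure you want 0 rather than NA? NA will let you do all your calculations without introducing spurious data. If that works for you, just change the class of each column to numeric and "missing" will be coerced to NA.

@SmruthiDabbiru Can you elaborate on "can't"? What isn't working? Did you run the df[] step?

Yes, I did. Maybe I did something wrong; I'll try again.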
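
For comparison, the column means after replacing "missing" with 0 (the zeros pull every mean down):

sapply(df[-1], mean, na.rm = TRUE)

#Water.Temperature         Turbidity       Wave.Height       Wave.Period 
#            10.62              1.95              0.14              2.67 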

Sample data:

df <- structure(list(Beach.Name = structure(c(1L, 2L, 1L, 2L, 3L, 1L
), .Label = c("CalumetBeach", "missing", "MontroseBeach"), class = "factor"), 
Water.Temperature = structure(c(3L, 5L, 4L, 5L, 1L, 2L), .Label = c("14.5", 
"15.9", "16.3", "17", "missing"), class = "factor"), Turbidity = structure(c(1L, 
5L, 3L, 5L, 4L, 2L), .Label = c("1.28", "1.74", "1.82", "6.88", 
"missing"), class = "factor"), Wave.Height = structure(c(2L, 
5L, 3L, 5L, 4L, 1L), .Label = c("0.148", "0.162", "0.194", 
"0.345", "missing"), class = "factor"), Wave.Period = structure(c(1L, 
2L, 1L, 2L, 1L, 1L), .Label = c("4", "missing"), 
class = "factor")), class = "data.frame", 
row.names = c("7", "72", "18", "78", "24", "41"))
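
Since the stated goal is 1-NN, here is a rough base-R sketch of a 1-nearest-neighbour lookup on the complete rows. Several things are assumptions for illustration only: that the task is to predict Beach.Name from the numeric columns, the helper name knn1, the Euclidean distance, and the made-up query point. Rows containing NA are dropped first, which is one option among several (imputation being another).

```r
# Hypothetical helper: 1-nearest-neighbour prediction with Euclidean distance.
# train_x and test_x hold the same numeric columns; train_y holds the labels.
knn1 <- function(train_x, train_y, test_x) {
  train_x <- as.matrix(train_x)
  test_x  <- as.matrix(test_x)
  vapply(seq_len(nrow(test_x)), function(i) {
    # Squared distance from test row i to every training row.
    d <- rowSums(sweep(train_x, 2, test_x[i, ])^2)
    as.character(train_y[which.min(d)])
  }, character(1))
}

# The sample data with "missing" already converted to NA.
dat <- data.frame(
  Beach.Name        = c("CalumetBeach", NA, "CalumetBeach", NA,
                        "MontroseBeach", "CalumetBeach"),
  Water.Temperature = c(16.3, NA, 17.0, NA, 14.5, 15.9),
  Turbidity         = c(1.28, NA, 1.82, NA, 6.88, 1.74),
  Wave.Height       = c(0.162, NA, 0.194, NA, 0.345, 0.148),
  Wave.Period       = c(4, NA, 4, NA, 4, 4)
)

# Keep only rows without NA, so every distance is well defined.
train <- dat[complete.cases(dat), ]

# A made-up query point, purely for illustration.
query <- data.frame(Water.Temperature = 14.8, Turbidity = 6.0,
                    Wave.Height = 0.30, Wave.Period = 4)

knn1(train[-1], train$Beach.Name, query)
# [1] "MontroseBeach"
```

The columns are left unscaled here, so Turbidity and Water.Temperature dominate the distance; standardising them first (for example with scale()) may be preferable on real data.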