R: Skipping empty elements when vectorizing a function
Hi, I am trying to learn vectorization in R. I have the following code:
library(truncnorm)  # provides rtruncnorm()
set.seed(23)
obs_num=100
Observation=seq(1,obs_num)
Location_Type1=sample(1:2, obs_num, replace=T)
Location_Type2=sample(1:2, obs_num, replace=T)
# The above does not lead to any errors
#Location_Type2=sample(1, obs_num, replace=T)
##Error occurs when I use this formula instead.
low_bound = runif(obs_num,0,1)
mean = runif(obs_num,10,15)
df1= data.frame(Observation,Location_Type1,Location_Type2,mean,low_bound)
Vectorized_function=function(data){
#Create groups
i1= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 1
i2= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 1
i3= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 2
i4= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 2
#Draw values
data[i1, "draw_value"] <- rtruncnorm(sum(i1),a=data[i1,'low_bound'],mean = data[i1, "mean"])
data[i2, "draw_value"] <- rtruncnorm(sum(i2),a=data[i2,'low_bound'],mean = data[i2, "mean"])
data[i3, "draw_value"] <- rtruncnorm(sum(i3),a=data[i3,'low_bound'],mean = data[i3, "mean"])
data[i4, "draw_value"] <- rtruncnorm(sum(i4),a=data[i4,'low_bound'],mean = data[i4, "mean"])
data
}
getvalue = Vectorized_function(data=df1)
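In this case I get the following error:
Error in rtruncnorm(sum(i3), a = data[i3, "low_bound"], mean = data[i3,  :
  length(a) > 0 is not TRUE
I can see what is happening. No observations satisfy conditions i3 and i4 (that is, sum(i3) and sum(i4) are both 0). In that case the lower-bound argument (a in the code) causes the problem.
Can someone suggest how I can handle these cases in the code? I would like the vectorized function to work when any of the conditions is empty. Following @akrun's comment, I adjusted the function as follows: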
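The diagnosis above can be checked directly in base R. The snippet below is a minimal sketch that rebuilds only the columns needed for the check, using the alternative Location_Type2 from the question; the name df_check is introduced here for illustration and is not part of the original code.

```r
# Reproduce the empty group using the question's alternative Location_Type2.
set.seed(23)
obs_num <- 100
Location_Type1 <- sample(1:2, obs_num, replace = TRUE)
Location_Type2 <- sample(1, obs_num, replace = TRUE)  # every value is 1
low_bound <- runif(obs_num, 0, 1)
df_check <- data.frame(Location_Type1, Location_Type2, low_bound)  # illustrative name

# Group 3 requires Location_Type2 == 2, which never occurs here.
i3 <- df_check[["Location_Type1"]] == 1 & df_check[["Location_Type2"]] == 2
sum(i3)                            # 0: no row falls in this group
length(df_check[i3, "low_bound"])  # 0: so `a` would be a zero-length vector
```

A zero-length a is what trips the length(a) > 0 check reported in the error message.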
Vectorized_function=function(data){
#Create groups
i1= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 1
i2= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 1
i3= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 2
i4= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 2
#Draw values
data[i1, "draw_value"] <- try(rtruncnorm(sum(i1),a=data[i1,'low_bound'],mean = data[i1, "mean"]),silent = T)
data[i2, "draw_value"] <- try(rtruncnorm(sum(i2),a=data[i2,'low_bound'],mean = data[i2, "mean"]),silent = T)
data[i3, "draw_value"] <- try(rtruncnorm(sum(i3),a=data[i3,'low_bound'],mean = data[i3, "mean"]),silent = T)
data[i4, "draw_value"] <- try(rtruncnorm(sum(i4),a=data[i4,'low_bound'],mean = data[i4, "mean"]),silent = T)
data
}
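Can you wrap it in tryCatch?
Can you use set.seed to generate example data that returns the error? I tried generating the data several times, but the function ran without errors for me.
@RonakShah Did you change Location_Type2 to Location_Type2=sample(1, obs_num, replace=T)? Without this change the code runs without error. I have added a seed and more explanation to the code for clarification.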
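As an alternative to wrapping each call in try() or tryCatch(), one option is to skip a group explicitly when it has no rows. The following is only a sketch under the question's setup: draw_for_group is a hypothetical helper introduced here (it is not part of truncnorm), and the draw_value column is created up front as NA so it exists even if every group is empty.

```r
library(truncnorm)  # provides rtruncnorm()

# Hypothetical helper: draw values for the rows flagged by `idx`,
# and leave the data untouched when no row matches.
draw_for_group <- function(data, idx) {
  if (any(idx)) {
    data[idx, "draw_value"] <- rtruncnorm(
      sum(idx),
      a    = data[idx, "low_bound"],
      mean = data[idx, "mean"]
    )
  }
  data
}

Vectorized_function <- function(data) {
  data[["draw_value"]] <- NA_real_  # create the column before any group is filled
  t1 <- data[["Location_Type1"]]
  t2 <- data[["Location_Type2"]]
  # Same four groups as in the question
  groups <- list(t1 == 1 & t2 == 1, t1 == 2 & t2 == 1,
                 t1 == 1 & t2 == 2, t1 == 2 & t2 == 2)
  for (idx in groups) data <- draw_for_group(data, idx)
  data
}

# Usage with the data that triggered the error (Location_Type2 is always 1)
set.seed(23)
obs_num <- 100
df2 <- data.frame(
  Observation    = seq_len(obs_num),
  Location_Type1 = sample(1:2, obs_num, replace = TRUE),
  Location_Type2 = sample(1, obs_num, replace = TRUE),
  mean           = runif(obs_num, 10, 15),
  low_bound      = runif(obs_num, 0, 1)
)
getvalue <- Vectorized_function(df2)
```

The any(idx) guard means rtruncnorm() is never called with a zero-length a, so no error has to be caught. If all four groups really use the same call, as they do in the question, the grouping could probably be dropped altogether, since the question's own code already passes a and mean as vectors.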