R: Skipping empty elements when vectorizing a function

Hi, I'm trying to learn vectorization in R.

I have the following code:

library(truncnorm)  # provides rtruncnorm()
set.seed(23)
obs_num=100
Observation=seq(1,obs_num)
Location_Type1=sample(1:2, obs_num, replace=TRUE)
Location_Type2=sample(1:2, obs_num, replace=TRUE)
# The line above does not lead to any errors

#Location_Type2=sample(1, obs_num, replace=TRUE)
## The error occurs when I use this line instead.

low_bound = runif(obs_num,0,1)
mean = runif(obs_num,10,15)
df1= data.frame(Observation,Location_Type1,Location_Type2,mean,low_bound)

Vectorized_function=function(data){
  #Create groups
  i1= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 1
  i2= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 1
  i3= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 2
  i4= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 2
  #Draw values
  data[i1, "draw_value"] <- rtruncnorm(sum(i1),a=data[i1,'low_bound'],mean = data[i1, "mean"])
  data[i2, "draw_value"] <- rtruncnorm(sum(i2),a=data[i2,'low_bound'],mean = data[i2, "mean"])
  data[i3, "draw_value"] <- rtruncnorm(sum(i3),a=data[i3,'low_bound'],mean = data[i3, "mean"])
  data[i4, "draw_value"] <- rtruncnorm(sum(i4),a=data[i4,'low_bound'],mean = data[i4, "mean"])
  data
}

getvalue = Vectorized_function(data=df1)

In that case I get the following error:

Error in rtruncnorm(sum(i3), a = data[i3, "low_bound"], mean = data[i3,  : length(a) > 0 is not TRUE

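I can see what is happening. With Location_Type2 fixed at 1, no observations satisfy conditions i3 and i4 (i.e. sum(i3) and sum(i4) are both 0). In that case the lower-bound argument (`a` in the code) is an empty vector, and that is what causes the problem.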
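To make the cause concrete, here is a small base-R illustration of what an all-FALSE logical index does to the lower-bound vector. The names `low_bound_demo` and `i_empty` are made up for this sketch and are not part of the code above.

```r
# Hypothetical stand-ins for a low_bound column and an empty group index
low_bound_demo <- c(0.2, 0.5, 0.9)
i_empty <- c(FALSE, FALSE, FALSE)

# Subsetting with an index that selects nothing yields a zero-length vector
a <- low_bound_demo[i_empty]

length(a)      # 0, so a check such as length(a) > 0 fails
sum(i_empty)   # 0 draws are requested
any(i_empty)   # FALSE, which can serve as a cheap "is this group empty?" test
```

The error message suggests that rtruncnorm() rejects a zero-length `a`, so any call made for an empty group fails regardless of how many draws are requested.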


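Can someone suggest how I can handle these cases in the code? I would like the vectorized function to cope with any of the conditions being empty.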

Following @akrun's comment, I adjusted the function as follows:

Vectorized_function=function(data){
  #Create groups
  i1= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 1
  i2= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 1
  i3= data[["Location_Type1"]] == 1 & data[["Location_Type2"]] == 2
  i4= data[["Location_Type1"]] == 2 & data[["Location_Type2"]] == 2
  #Draw values
  data[i1, "draw_value"] <- try(rtruncnorm(sum(i1),a=data[i1,'low_bound'],mean = data[i1, "mean"]),silent = T)
  data[i2, "draw_value"] <- try(rtruncnorm(sum(i2),a=data[i2,'low_bound'],mean = data[i2, "mean"]),silent = T)
  data[i3, "draw_value"] <- try(rtruncnorm(sum(i3),a=data[i3,'low_bound'],mean = data[i3, "mean"]),silent = T)
  data[i4, "draw_value"] <- try(rtruncnorm(sum(i4),a=data[i4,'low_bound'],mean = data[i4, "mean"]),silent = T)
  data
}

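Can you wrap it with tryCatch?

Can you use set.seed to generate example data that reproduces the error? I tried generating the data several times, but the function ran without error for me.

@RonakShah Did you change Location_Type2 to Location_Type2=sample(1,obs_num,replace=T)? Without that change the code runs without error. I have added a seed and more explanation to the code for clarification.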
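
An alternative to wrapping each call in try() is to skip a group explicitly when it selects no rows. The following is only a sketch of that idea, not a definitive answer: it assumes the truncnorm package is installed, and the names `draw_for_group` and `Vectorized_function_guarded` are hypothetical helpers introduced here for illustration.

```r
library(truncnorm)  # provides rtruncnorm(); assumed to be installed

# Hypothetical helper: draw truncated-normal values for the rows selected by
# the logical vector `idx`, and leave `data` untouched when no row is selected.
draw_for_group <- function(data, idx) {
  if (any(idx)) {
    data$draw_value[idx] <- rtruncnorm(sum(idx),
                                       a = data$low_bound[idx],
                                       mean = data$mean[idx])
  }
  data
}

# Hypothetical variant of Vectorized_function that guards every group
Vectorized_function_guarded <- function(data) {
  data$draw_value <- NA_real_  # create the column up front
  lt1 <- data[["Location_Type1"]]
  lt2 <- data[["Location_Type2"]]
  groups <- list(lt1 == 1 & lt2 == 1,
                 lt1 == 2 & lt2 == 1,
                 lt1 == 1 & lt2 == 2,
                 lt1 == 2 & lt2 == 2)
  for (idx in groups) {
    data <- draw_for_group(data, idx)
  }
  data
}

# Same kind of data as in the question, using the Location_Type2 line
# that triggers the error in the original function
set.seed(23)
obs_num <- 100
df1 <- data.frame(
  Observation = seq_len(obs_num),
  Location_Type1 = sample(1:2, obs_num, replace = TRUE),
  Location_Type2 = sample(1, obs_num, replace = TRUE),
  mean = runif(obs_num, 10, 15),
  low_bound = runif(obs_num, 0, 1)
)

getvalue <- Vectorized_function_guarded(df1)
```

Rows that belong to no group keep NA in draw_value, which makes skipped observations easy to spot. If every group really uses the same call, as in the posted code, the grouping could probably be dropped altogether in favour of a single rtruncnorm(nrow(data), a = data$low_bound, mean = data$mean) call, since rtruncnorm accepts vectors for its bounds and mean.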