Using R, how can I scan interval values and print the matching subset of another data set?
Dataset 1:
dput(kk)
structure(list(V1 = c(1.05, NA, NA, NA, NA, NA, NA, NA, NA, NA,
1.06, NA, NA, NA, NA, NA, NA, NA), V2 = c(NA, NA, 105.11, 105.12,
105.13, 105.14, 105.15, NA, 105.94, 105.99, NA, NA, 106.11, 106.12,
106.13, 106.14, 106.19, 106.2)), .Names = c("V1", "V2"),
class = "data.frame", row.names = c(NA, -18L))
show(kk)
     V1     V2
1  1.05     NA
2    NA     NA
3    NA 105.11
4    NA 105.12
5    NA 105.13
6    NA 105.14
7    NA 105.15
8    NA     NA
9    NA 105.94
10   NA 105.99
11 1.06     NA
12   NA     NA
13   NA 106.11
14   NA 106.12
15   NA 106.13
16   NA 106.14
17   NA 106.19
18   NA 106.20
Dataset 2:
structure(list(V1 = structure(1:4, .Label = c("1.05 ~ 1.06", "1.07",
"1.08", "1.09 ~ 1.10"), class = "factor")), .Names = "V1",
class = "data.frame", row.names = c(NA, -4L))
           V1
1 1.05 ~ 1.06
2        1.07
3        1.08
4 1.09 ~ 1.10
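How can I scan the interval values in V1 of dataset 2 and print the subset of dataset 1 that each interval covers, as described above?

If I understand correctly, this is what you need (df1 is dataset 1, df2 is dataset 2):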
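lapply(df2$V1, function(x) {
  # split "a ~ b" into its endpoints and convert them to numbers
  z <- as.numeric(unlist(strsplit(as.character(x), split = " ~ ")))
  # row numbers of df1 whose V1 equals one of the endpoints
  b <- which(df1$V1 %in% z)
  if (length(b) == 0) return(NULL)              # no match in df1
  if (length(b) == 1) return(df1[b, ])          # single value: one row
  if (length(b) == 2) return(df1[b[1]:b[2], ])  # interval: all rows between the endpoints
})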
#result
[[1]]
     V1     V2
1  1.05     NA
2    NA     NA
3    NA 105.11
4    NA 105.12
5    NA 105.13
6    NA 105.14
7    NA 105.15
8    NA     NA
9    NA 105.94
10   NA 105.99
11 1.06     NA

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL
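You go through the elements of df2$V1 one by one and apply a function to each of them.
The function first splits the string at " ~ ", then unlists the result and converts it to numeric, because strsplit returns a list rather than a vector: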
z <- as.numeric(unlist(strsplit(as.character(x), split = " ~ ")))
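To make the split step concrete, here is a minimal sketch that runs it on a single value. The names x, parts and z_single are only illustrative stand-ins and are not part of the original answer. The as.character call matters because V1 in dataset 2 is a factor, and strsplit only accepts character input.

```r
# One element of df2$V1, written here as a plain string (illustrative value).
x <- "1.05 ~ 1.06"

# strsplit() always returns a list, with one entry per input string.
parts <- strsplit(as.character(x), split = " ~ ")
is.list(parts)   # TRUE

# unlist() flattens the list to a character vector; as.numeric() converts it.
z <- as.numeric(unlist(parts))
z                # 1.05 1.06

# A value without "~" simply yields a single number.
z_single <- as.numeric(unlist(strsplit("1.07", split = " ~ ")))
```

So z has two elements for an interval and one element for a single value.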
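b <- which(df1$V1 %in% z)
b holds the row numbers of df1 whose V1 equals one of the values in z.
If b has 0 elements, the function returns NULL.
If b has 1 element, it returns just that one row of df1.
If b has 2 elements, it returns the range of rows from b[1] to b[2].
You should probably be more explicit about the expected output.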
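For anyone who wants to try this end to end, the following is a self-contained sketch under a few assumptions. It rebuilds both data sets from the dput output in the question under the names df1 and df2. The helper rows_for_interval is a hypothetical name, not something from the answer. It uses min(b):max(b) instead of the three separate length checks, which is a small variation rather than the answer's exact code. Naming the results and dropping the NULL entries is likewise only one possible reading of the expected output.

```r
# Dataset 1 from the question (printed there as kk, called df1 in the answer).
df1 <- data.frame(
  V1 = c(1.05, rep(NA, 9), 1.06, rep(NA, 7)),
  V2 = c(NA, NA, 105.11, 105.12, 105.13, 105.14, 105.15, NA, 105.94, 105.99,
         NA, NA, 106.11, 106.12, 106.13, 106.14, 106.19, 106.2)
)

# Dataset 2 from the question: a factor column of single values and intervals.
df2 <- data.frame(
  V1 = factor(c("1.05 ~ 1.06", "1.07", "1.08", "1.09 ~ 1.10"))
)

# Hypothetical helper: rows of `data` spanned by the value or interval in x.
rows_for_interval <- function(x, data) {
  z <- as.numeric(unlist(strsplit(as.character(x), split = " ~ ")))
  b <- which(data$V1 %in% z)          # rows whose V1 matches an endpoint
  if (length(b) == 0) return(NULL)    # nothing in data for this entry
  data[min(b):max(b), ]               # one row, or the span between matches
}

res <- lapply(df2$V1, rows_for_interval, data = df1)
names(res) <- as.character(df2$V1)    # label each result with its interval
res <- Filter(Negate(is.null), res)   # keep only intervals that matched
res
```

With the data above, only "1.05 ~ 1.06" has matching rows, so res contains a single data frame covering rows 1 to 11 of df1.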