R: Selecting subsets where the selection range varies by date
I have to make a set of selections on this dataset (dat) that change according to the date. The dataset consists of species (sp), day (day, as POSIXct) and area (ar). I need to subset the locations where species "A" occurs. However, the areas to be selected differ by date, according to this matrix (dat.ar). More specifically, for species "A" on 1-Jan-00 I only want areas 1 and 6; on 2-Jan-00, areas 1 and 12; and so on. For this example, the desired output of the selection is:
sp day ar
A 2-Jan-00 1
A 4-Jan-00 3
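I have not managed to get a for loop working, because I am still learning the semantics of R. In short, I have a rough idea of what has to be done, but I am still struggling with the language. Here is a sketch of what I think should be done: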
dat1 = with(dat,sapply(day[sp=="A" & dat.ar$day.s[i] ],
function(x) ar == (ar[sp=="A" & day == x]==dat.ar$ar.select[j])
final=dat[rowSums(dat1) > 0, ]
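I believe I have to fit in a for loop that goes through dat.ar and specifies which areas to select in dat. However, despite my attempts at the for loop, I have not come close. I am not even sure that combining sapply with a for loop is the right approach.
In case anyone wants to reproduce the problem: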
sp=c("A","B","C","A","D","E","F","A","G","B")
day=c("1-Jan-00", "1-Jan-00", "2-Jan-00", "2-Jan-00", "2-Jan-00",
"2-Jan-00", "3-Jan-00", "4-Jan-00", "4-Jan-00", "4-Jan-00")
day=as.POSIXct(day, format="%d-%b-%y")
ar=c(2,6,2,1,4,12,8,3,2,1)
dat= as.data.frame(cbind(sp, day, ar))
day.s=c("1-Jan-00", "2-Jan-00", "3-Jan-00", "4-jan-00")
day.s=as.POSIXct(day.s, format="%d-%b-%y")
a.s=c(1,1,4,3)
a.e=c(6,12,8,12)
ar.select=paste(a.s, a.e, sep=",")
dat.ar=cbind(day.s, ar.select)
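Thank you very much for your help.
You can merge the table of conditions into the original dataset and then filter conditionally. Think of a1 and a2 as standing in for your sp and day values, and obs as standing in for your ar values.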
library(data.table)
dataset <- data.table(
a1 = c("A","B","C","B","A","A","A","A"),
a2 = c("P","Q","Q","Q","R","R","P","Q"),
obs = c(3,2,3,4,2,4,8,0)
)
constraints <- data.table(
a1 = c("A","B","C","A","B","C","A","B","C"),
a2 = c("P","P","P","Q","Q","Q","R","R","R"),
lower = c(1,2,3,4,3,2,3,2,5),
upper = c(6,4,5,7,5,6,5,3,7)
)
checkingdataset <- merge(dataset,constraints, by = c("a1","a2"), all.x = TRUE)
checkingdataset[obs <= upper & obs >= lower, obs.keep := TRUE]
# a1 a2 obs lower upper obs.keep
#1: A P 3 1 6 TRUE
#2: A P 8 1 6 NA
#3: A Q 0 4 7 NA
#4: A R 2 3 5 NA
#5: A R 4 3 5 TRUE
#6: B Q 2 3 5 NA
#7: B Q 4 3 5 TRUE
#8: C Q 3 2 6 TRUE
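Applied to the data in the question, the same merge-and-filter idea might look like the sketch below. It uses base R's merge instead of data.table. Because the question asks for two specific areas per day rather than a range, the filter tests equality against each of the two columns instead of comparing against lower and upper bounds. The names allowed, checked and result are made up for this illustration, and the dates are written in ISO form with tz = "UTC" only to avoid locale-dependent parsing of month names.

```r
# Data from the question, built with data.frame() so column types are kept
dat <- data.frame(
  sp  = c("A","B","C","A","D","E","F","A","G","B"),
  day = as.POSIXct(c("2000-01-01","2000-01-01","2000-01-02","2000-01-02",
                     "2000-01-02","2000-01-02","2000-01-03","2000-01-04",
                     "2000-01-04","2000-01-04"), tz = "UTC"),
  ar  = c(2,6,2,1,4,12,8,3,2,1)
)

# One row per day with the two areas wanted for species "A" on that day
allowed <- data.frame(
  sp  = "A",
  day = as.POSIXct(c("2000-01-01","2000-01-02","2000-01-03","2000-01-04"),
                   tz = "UTC"),
  a.s = c(1,1,4,3),
  a.e = c(6,12,8,12)
)

# Inner join on sp and day: only rows for species "A" survive
checked <- merge(dat, allowed, by = c("sp","day"))

# Keep rows whose area equals either of the two areas wanted for that day
result <- checked[checked$ar == checked$a.s | checked$ar == checked$a.e,
                  c("sp","day","ar")]
result
```

With this data, result should contain the two rows shown as the desired output in the question.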
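First, I would not use as.data.frame(cbind(...)) to create your data.frames. Second, I would create dat.ar with essentially the same structure as the dat you created. Third, I would simply use merge to get the result you want: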
dat <- data.frame(sp=c("A","B","C","A","D","E","F","A","G","B"),
day=c("1-Jan-00", "1-Jan-00", "2-Jan-00", "2-Jan-00",
"2-Jan-00", "2-Jan-00", "3-Jan-00", "4-Jan-00",
"4-Jan-00", "4-Jan-00"),
ar=c(2,6,2,1,4,12,8,3,2,1))
dat$day <- as.POSIXct(dat$day, format="%d-%b-%y")
day.s <- c("1-Jan-00", "2-Jan-00", "3-Jan-00", "4-jan-00")
day.s <- as.POSIXct(day.s, format="%d-%b-%y")
a.s <- c(1,1,4,3)
a.e <- c(6,12,8,12)
ar.select <- paste(a.s, a.e, sep=",")
dat.ar <- data.frame(sp = "A", day = day.s, ar = ar.select)
dat.ar <- cbind(dat.ar[-3],
read.csv(text = as.character(dat.ar$ar), header = FALSE))
library(reshape2)
dat.ar <- melt(dat.ar, id.vars=1:2, value.name="ar")
dat.ar
# sp day variable ar
# 1 A 2000-01-01 V1 1
# 2 A 2000-01-02 V1 1
# 3 A 2000-01-03 V1 4
# 4 A 2000-01-04 V1 3
# 5 A 2000-01-01 V2 6
# 6 A 2000-01-02 V2 12
# 7 A 2000-01-03 V2 8
# 8 A 2000-01-04 V2 12
merge(dat, dat.ar)
# sp day ar variable
# 1 A 2000-01-02 1 V1
# 2 A 2000-01-04 3 V1
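The first point can be checked directly. The sketch below uses shortened vectors (three rows only, chosen for illustration) to show what as.data.frame(cbind(...)) does to column types: cbind() on plain vectors builds a matrix, a matrix holds a single type, and so mixing character values with numbers and dates leaves no numeric or POSIXct columns behind.

```r
sp  <- c("A", "B", "C")
day <- as.POSIXct(c("2000-01-01", "2000-01-01", "2000-01-02"), tz = "UTC")
ar  <- c(2, 6, 2)

# cbind() builds a single-type matrix first, so ar stops being numeric
# and day loses its POSIXct class
bad <- as.data.frame(cbind(sp, day, ar))

# data.frame() keeps each column's own type
good <- data.frame(sp = sp, day = day, ar = ar)

sapply(bad, class)          # no numeric or POSIXct columns
is.numeric(good$ar)         # TRUE
inherits(good$day, "POSIXct")  # TRUE
```

This is why comparisons such as ar == 1 or day == x can behave unexpectedly on a data frame built through cbind().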
Alternatively, dat.ar can be built in this form more directly:
dat.ar <- data.frame(sp = "A",
day = c("1-Jan-00", "2-Jan-00", "3-Jan-00", "4-jan-00"),
a.s = c(1,1,4,3), a.e = c(6,12,8,12))
dat.ar$day <- as.POSIXct(dat.ar$day, format="%d-%b-%y")
library(reshape2)
dat.ar <- melt(dat.ar, id.vars=1:2, value.name="ar")
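For readers who want something closer to the for loop the question asks about, the selection can also be sketched without merging. This is only one possible way to write it: the wanted areas are stored in a named list (called allowed here, a name and representation assumed for this illustration) with one numeric vector per day, and each row of dat is checked against the entry for its day. Dates are again written in ISO form with tz = "UTC" to avoid locale-dependent parsing.

```r
dat <- data.frame(
  sp  = c("A","B","C","A","D","E","F","A","G","B"),
  day = as.POSIXct(c("2000-01-01","2000-01-01","2000-01-02","2000-01-02",
                     "2000-01-02","2000-01-02","2000-01-03","2000-01-04",
                     "2000-01-04","2000-01-04"), tz = "UTC"),
  ar  = c(2,6,2,1,4,12,8,3,2,1)
)

# Areas wanted for species "A", keyed by day (hypothetical representation)
allowed <- list(
  "2000-01-01" = c(1, 6),
  "2000-01-02" = c(1, 12),
  "2000-01-03" = c(4, 8),
  "2000-01-04" = c(3, 12)
)

keep <- logical(nrow(dat))
for (i in seq_len(nrow(dat))) {
  # Look up the areas for this row's day; a missing day gives NULL,
  # and x %in% NULL is FALSE, so such rows are dropped
  key <- format(dat$day[i], "%Y-%m-%d")
  keep[i] <- dat$sp[i] == "A" && dat$ar[i] %in% allowed[[key]]
}

final <- dat[keep, ]
final
```

The merge-based version is shorter and scales better, but the loop makes the per-row rule explicit: the species must be "A" and the area must be one of those listed for that row's day.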