R "reshape": long-format data into sequences of two
I have a data frame that lists observed behaviours ("observation"), the subject that was observed ("code"), and the time of the observation ("day" and "time"). From these data I want to create a new data frame in which, within a single subject, each observation is paired with the observation that follows it. For the sample data, the resulting data frame should look like this:
code night obs.1 obs.2
A1 FALSE w f
A1 FALSE f v
B2 FALSE q s
B2 TRUE s a
B2 FALSE a g
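The new variable "night" indicates whether a night lies between the two observations, i.e. whether the second observation was made on the following day. (Note that every observation that is not the first or last one for a subject appears twice in the new data frame: once as the first observation of a pair and once as the second observation of another pair.)
I thought of writing a loop that walks through the original data frame row by row, looks at the next row, compares "code" and "day", creates a new row when the code is the same, and sets "night" to TRUE when "day" changes. Something like the code below the sample data.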
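The loop described above could be sketched as follows. The data frame here is hypothetical, reconstructed only so that it reproduces the expected output (it omits the "time" column and assumes the rows are already sorted by subject and time), and `loop.pairs` is just an illustrative name.

```r
# Hypothetical sample data, reconstructed to match the expected output above.
dat <- data.frame(
  code        = c("A1", "A1", "A1", "B2", "B2", "B2", "B2"),
  day         = c(1, 1, 1, 1, 1, 2, 2),
  observation = c("w", "f", "v", "q", "s", "a", "g"),
  stringsAsFactors = FALSE
)

pairs <- list()
for (i in seq_len(max(nrow(dat) - 1, 0))) {
  # Only pair a row with the next one if both belong to the same subject.
  if (dat$code[i] == dat$code[i + 1]) {
    pairs[[length(pairs) + 1]] <- data.frame(
      code  = dat$code[i],
      night = dat$day[i + 1] != dat$day[i],  # TRUE when the day changes
      obs.1 = dat$observation[i],
      obs.2 = dat$observation[i + 1],
      stringsAsFactors = FALSE
    )
  }
}

# Bind the one-row data frames into the final result.
loop.pairs <- do.call(rbind, pairs)
loop.pairs
```

Growing a list and binding once at the end avoids calling rbind() inside the loop, but this is still the row-by-row approach the question hopes to replace.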
Is there a better way than looping through the data? For example, can this be done with something like reshape()?
Sample data
dat
You can try using dplyr:
library(dplyr)
dat$day <- as.numeric(as.character(dat$day))  # turn day into numeric
dat$time <- as.POSIXct(dat$time)              # dplyr can't work with POSIXlt
dat %>%
  group_by(code) %>%
  rename(obs.1 = observation) %>%
  mutate(obs.2 = lead(obs.1), night = lead(day) > day) %>%
  filter(!is.na(obs.2)) %>%
  select(code, night, obs.1, obs.2)
Here is a very straightforward option using data.table:
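library(data.table)
# Assumes day is numeric and time is POSIXct (data.table does not support POSIXlt).
# Note that setDT() converts dat to a data.table by reference.
setDT(dat)[,
  .(
    night = diff(day) == 1,
    obs.1 = head(observation, -1),
    obs.2 = tail(observation, -1)
  ),
  by = code]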
# code night obs.1 obs.2
# 1: A1 FALSE w f
# 2: A1 FALSE f v
# 3: B2 FALSE q s
# 4: B2 TRUE s a
# 5: B2 FALSE a g
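The head()/tail() pairing used here is not specific to data.table: within one subject, head(x, -1) drops the last element and tail(x, -1) drops the first, so the two vectors line up as (current, next), and diff(day) has the same length. A minimal base R sketch of the same idea follows; `pair_up` is a hypothetical helper and the data frame is hypothetical sample data chosen to match the expected output.

```r
# Hypothetical sample data, reconstructed to match the expected output.
dat <- data.frame(
  code        = c("A1", "A1", "A1", "B2", "B2", "B2", "B2"),
  day         = c(1, 1, 1, 1, 1, 2, 2),
  observation = c("w", "f", "v", "q", "s", "a", "g"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pair each row of one subject with the row after it.
pair_up <- function(x) {
  data.frame(
    code  = head(x$code, -1),
    night = diff(x$day) == 1,          # TRUE when the next row is one day later
    obs.1 = head(x$observation, -1),   # all but the last observation
    obs.2 = tail(x$observation, -1),   # all but the first observation
    stringsAsFactors = FALSE
  )
}

# Apply the helper per subject and stack the results.
base.pairs <- do.call(rbind, lapply(split(dat, dat$code), pair_up))
rownames(base.pairs) <- NULL
base.pairs
```

A subject with a single row simply contributes zero rows, because head(x, -1) and tail(x, -1) are then both empty.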
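Update: I figured out how to do this with reshape, so I have replaced my previous partial solution.
To answer your question: yes, this can be done with reshape(). Note that I mean the stats::reshape() function.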
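# day must be numeric for the subtraction below. Assumes dat is sorted by code,
# every code has at least two rows, and column 2 is day.
dat$day <- as.numeric(as.character(dat$day))
dat$night <- unlist(by(dat,
                       dat$code,
                       FUN=function(x) c((x[2:nrow(x), 2] - x[1:(nrow(x)-1), 2])==1, FALSE)))
# id.1/visit.1 label rows 1-2, 3-4, ... of each subject as pairs.
dat$id.1 <- unlist(by(dat,
                      dat$code,
                      FUN=function(x) c(rep(1:nrow(x), each=2))[1:nrow(x)]))
# id.2/visit.2 label rows 2-3, 4-5, ... as pairs (0 marks the unpaired first row).
dat$id.2 <- unlist(by(dat,
                      dat$code,
                      FUN=function(x) c(0, rep(1:nrow(x), each=2))[1:nrow(x)]))
dat$visit.1 <- unlist(by(dat,
                         dat$code,
                         FUN=function(x) rep(c(1,2), nrow(x))[1:nrow(x)]))
dat$visit.2 <- unlist(by(dat,
                         dat$code,
                         FUN=function(x) c(0, rep(c(1,2), nrow(x)))[1:nrow(x)]))
dat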
rows1 <- na.omit(reshape(dat,
                         timevar = "visit.1",
                         idvar = c("code", "id.1"),
                         direction = "wide",
                         v.names = "observation",
                         drop = c("day", "visit.2")))
rows2 <- na.omit(reshape(dat[dat$visit.2 != 0,],
                         timevar = "visit.2",
                         idvar = c("code", "id.2"),
                         direction = "wide",
                         v.names = "observation",
                         drop = c("day", "visit.1")))
dat.pairs <- rbind(rows1, rows2)
dat.pairs[order(dat.pairs$code, dat.pairs$time), c("code", "night", "observation.1", "observation.2")]
code night observation.1 observation.2
1 A1 FALSE w f
3 A1 FALSE f v
5 B2 FALSE q s
6 B2 TRUE s a
2 B2 FALSE a g
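To see what each reshape() call contributes, it may help to look at the function on its own. The sketch below uses hypothetical toy data (the names `long`, `wide`, `id` and `visit` are illustrative, not from the question): every `id` has a first and a second visit, and direction = "wide" spreads the `observation` values into one column per visit.

```r
# Hypothetical toy data: two pairs, each with a first and a second visit.
long <- data.frame(
  id          = c(1, 1, 2, 2),
  visit       = c(1, 2, 1, 2),
  observation = c("w", "f", "q", "s"),
  stringsAsFactors = FALSE
)

# One output row per id; observation is split into observation.1 and
# observation.2 according to the value of visit.
wide <- reshape(long,
                idvar     = "id",
                timevar   = "visit",
                v.names   = "observation",
                direction = "wide")
wide
```

Because a row can belong to only one (id, visit) cell in a single call, overlapping pairs need two labelings of the same rows, one shifted by a row, which is why the answer builds both id.1/visit.1 and id.2/visit.2 and binds the two results together.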
Output of the dplyr code:
Source: local data frame [5 x 4]
Groups: code [2]
code night obs.1 obs.2
<fctr> <lgl> <chr> <chr>
1 A1 FALSE w f
2 A1 FALSE f v
3 B2 FALSE q s
4 B2 TRUE s a
5 B2 FALSE a g
A shorter base R version without reshape():
# Assumes dat is sorted by code, every code has at least two rows,
# and columns 2 and 4 are day and observation.
dat$day <- as.numeric(as.character(dat$day))
dat$night <- unlist(by(dat,
                       dat$code,
                       FUN=function(x) c((x[2:nrow(x), 2] - x[1:(nrow(x)-1), 2])==1, FALSE)))
dat$obs.1 <- dat$observation
dat$obs.2 <- unlist(by(dat,
                       dat$code,
                       FUN=function(x) c(x[2:nrow(x), 4], NA)))
dat.pairs <- dat[!is.na(dat$obs.2), c("code", "night", "obs.1", "obs.2")]
dat.pairs$code <- as.character(dat.pairs$code)
dat.pairs
code night obs.1 obs.2
1 A1 FALSE w f
3 A1 FALSE f v
5 B2 FALSE q s
6 B2 TRUE s a
2 B2 FALSE a g