Dataframe: shift rows in certain columns if the next observation is 1 year later
I have the following data:
Name Year [Columns which rows should not be moved] V2 C2 KeyC
A 2001 ... 4 7 NA
A 2002 ... 2 0.5 1
A 2003 ... 4 0.2 0
A 2005 ... 3 0.3 NA
B 2004 ... 0 0.4 NA
B 2006 ... 1 7 NA
B 2007 ... 2 0.6 1
C 2002 .... 4 4 NA
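What I want to do now is shift the observations in the V2 and C2 columns down by one row, but only when the next row's year is exactly one year after the current row's year.
In this example, the values from row 1 move to row 2 and overwrite the values there. Row 4 keeps its V2 and C2 values because there is no 2004. For B, row 7 gets the values from row 6, and the original values of row 7 disappear because a new letter starts in the Name column. This should be done for every letter: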
Name Year [Columns which rows should not be moved] V2 C2 KeyC
A 2001 ... 4 7 NA
A 2002 ... 4 7 1
A 2003 ... 2 0.5 0
A 2005 ... 3 0.3 NA
B 2004 ... 0 0.4 NA
B 2006 ... 1 7 NA
B 2007 ... 1 7 1
C 2002 .... 4 4 NA
Is there a way to do this? :)
Thanks :)
You can use the shift function from the data.table package:
dt <- read.table(text = "name year x V2 C2 KeyC
A 2001 ... 4 7 NA
A 2002 ... 2 0.5 1
A 2003 ... 4 0.2 0
A 2005 ... 3 0.3 NA
B 2004 ... 0 0.4 NA
B 2006 ... 1 7 NA
B 2007 ... 2 0.6 1
C 2002 .... 4 4 NA",
header = TRUE)
library(data.table)
dt <- data.table(dt)
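# previous row's year, V2 and C2 within each name (NA for the first row of a name)
dt[, `:=` (previous.year = shift(year),
           previous.V2 = shift(V2),
           previous.C2 = shift(C2)),
   by = name]
# TRUE where the previous row is exactly one year earlier
dt[, has.previous.year := year - 1 == previous.year]
# overwrite V2 and C2 only on those rows
dt[has.previous.year == TRUE,
   `:=` (V2 = previous.V2,
         C2 = previous.C2)]
# drop the helper columns
dt <- dt[, .(name, year, x, V2, C2, KeyC)]
dt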
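One thing worth checking with any shift-based approach is whether the shift restarts for each name. In the example data it happens not to matter, because no name ends one year before the next name begins. The sketch below uses made-up data (the `toy` table and its values are assumptions, not part of the question) where A ends in 2003 and B starts in 2004, and compares an ungrouped shift with a grouped one.

```r
library(data.table)

# Hypothetical data: the last year of A (2003) is directly followed by the first year of B (2004)
toy <- data.table(name = c("A", "A", "B", "B"),
                  year = c(2002L, 2003L, 2004L, 2005L),
                  V2   = c(1L, 2L, 3L, 4L))

# Ungrouped: row 3 (B 2004) is treated as the successor of row 2 (A 2003)
ungrouped <- toy[, ifelse(!is.na(shift(year)) & year - shift(year) == 1L,
                          shift(V2), V2)]

# Grouped: shift() restarts within each name, so the first row of B is left alone
grouped <- toy[, .(V2 = ifelse(!is.na(shift(year)) & year - shift(year) == 1L,
                               shift(V2), V2)),
               by = name]$V2

ungrouped  # 1 1 2 3
grouped    # 1 1 3 3
```

If names can follow each other with consecutive years in the real data, the grouped form is probably the one that matches the rule in the question.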
We can build a helper key to signal the shift:
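# library(data.table)
dt <- data.table(dt)
# KEY = gap in years to the previous row of the same name (0 for the first row)
dt[, KEY := c(0L, diff(year)), by = name]
# where the gap is exactly 1, take V2 and C2 from the previous row
dt[dt$KEY == 1, c('V2', 'C2')] = data.table(apply(dt[, c('V2', 'C2')], 2, shift)[dt$KEY == 1, ])
dt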
name year x V2 C2 KeyC KEY
1: A 2001 ... 4 7.0 NA 0
2: A 2002 ... 4 7.0 1 1
3: A 2003 ... 2 0.5 0 1
4: A 2005 ... 3 0.3 NA 2
5: B 2004 ... 0 0.4 NA 0
6: B 2006 ... 1 7.0 NA 2
7: B 2007 ... 1 7.0 1 1
8: C 2002 .... 4 4.0 NA 0
Assuming the KeyC column accurately encodes all the cases to be copied:
# make helper columns that are offset by 1 row
df$V2_help <- c(NA, head(df$V2, -1))
df$C2_help <- c(NA, head(df$C2, -1))
# use ifelse to replace data where KeyC is not NA
df$V2 <- ifelse(!is.na(df$KeyC), df$V2_help, df$V2)
df$C2 <- ifelse(!is.na(df$KeyC), df$C2_help, df$C2)
# remove helper columns
df <- df[, setdiff(colnames(df), c("V2_help", "C2_help"))]
Name Year V2 C2 KeyC
1 A 2001 4 7.0 NA
2 A 2002 4 7.0 1
3 A 2003 2 0.5 0
4 A 2005 3 0.3 NA
5 B 2004 0 0.4 NA
6 B 2006 1 7.0 NA
7 B 2007 1 7.0 1
8 C 2002 4 4.0 NA
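The KeyC approach depends on KeyC being non-NA exactly on the rows that follow the previous year of the same Name. A minimal base R sketch of how that assumption could be checked is shown here; the data frame is retyped from the question with the elided columns left out, and the names `gap`, `follows_previous_year` and `keyc_matches` are made up for illustration.

```r
# Example data from the question (the elided columns are left out)
df <- data.frame(
  Name = c("A", "A", "A", "A", "B", "B", "B", "C"),
  Year = c(2001, 2002, 2003, 2005, 2004, 2006, 2007, 2002),
  KeyC = c(NA, 1, 0, NA, NA, NA, 1, NA)
)

# Gap to the previous year within each Name; NA for the first row of a Name
gap <- ave(df$Year, df$Name, FUN = function(y) c(NA, diff(y)))

# TRUE where the previous row of the same Name is exactly one year earlier
follows_previous_year <- !is.na(gap) & gap == 1

# TRUE if KeyC is non-NA on exactly those rows
keyc_matches <- identical(follows_previous_year, !is.na(df$KeyC))
keyc_matches
```

For the example data the two conditions agree, so the KeyC shortcut and the year-gap rule select the same rows. On other data it may be safer to rely on the year gap directly.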
Using tidyverse:
library(tidyverse)
# within each name, take the previous row's value where the year gap is exactly 1
dt %>%
  group_by(name) %>%
  mutate(across(c(C2, V2), ~ ifelse(c(0, diff(year)) == 1, lag(.x), .x)))
# A tibble: 8 x 6
# Groups: name [3]
name year x V2 C2 KeyC
<fct> <int> <fct> <int> <dbl> <int>
1 A 2001 ... 4 7.00 NA
2 A 2002 ... 4 7.00 1
3 A 2003 ... 2 0.500 0
4 A 2005 ... 3 0.300 NA
5 B 2004 ... 0 0.400 NA
6 B 2006 ... 1 7.00 NA
7 B 2007 ... 1 7.00 1
8 C 2002 .... 4 4.00 NA
Or with data.table:
library(data.table)
# same rule, applied to C2 and V2 by reference within each name
setDT(dt)[, c("C2", "V2") := lapply(.SD, function(x) ifelse(c(0, diff(year)) == 1, shift(x), x)),
          by = name, .SDcols = c("C2", "V2")]
dt
name year x V2 C2 KeyC
1: A 2001 ... 4 7.0 NA
2: A 2002 ... 4 7.0 1
3: A 2003 ... 2 0.5 0
4: A 2005 ... 3 0.3 NA
5: B 2004 ... 0 0.4 NA
6: B 2006 ... 1 7.0 NA
7: B 2007 ... 1 7.0 1
8: C 2002 .... 4 4.0 NA
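For comparison, the same rule can also be written without any packages. The following is only a sketch in base R, not taken from any of the answers: the function name `shift_if_consecutive` and its arguments are hypothetical, and it assumes the rows are already sorted by name and year and that neither column contains NA.

```r
# Hypothetical helper: copy the previous row's values into `cols` wherever the
# previous row has the same group and a time value exactly one lower.
# Assumes rows are sorted by group and time, with no NA in either column.
shift_if_consecutive <- function(df, cols, group = "name", time = "year") {
  out <- df
  n <- nrow(df)
  if (n < 2) return(out)
  prev <- seq_len(n - 1)  # indices of rows 1..n-1
  curr <- prev + 1        # indices of rows 2..n
  take <- c(FALSE,
            df[[group]][curr] == df[[group]][prev] &
              df[[time]][curr] - df[[time]][prev] == 1)
  for (col in cols) {
    # read from the original df so shifted values do not cascade
    out[[col]][take] <- df[[col]][which(take) - 1]
  }
  out
}

# Example data from the question (the elided columns are left out)
example <- data.frame(
  name = c("A", "A", "A", "A", "B", "B", "B", "C"),
  year = c(2001, 2002, 2003, 2005, 2004, 2006, 2007, 2002),
  V2   = c(4, 2, 4, 3, 0, 1, 2, 4),
  C2   = c(7, 0.5, 0.2, 0.3, 0.4, 7, 0.6, 4)
)

result <- shift_if_consecutive(example, cols = c("V2", "C2"))
result
```

Reading from the original data frame rather than from `out` is what keeps row 3 of A at 2 and 0.5 (the original row 2 values) instead of inheriting the values already copied into row 2.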