Dataframe: shift rows of certain columns if the next observation is 1 year apart


I have the following data:

Name   Year  [Columns which rows should not be moved]  V2  C2   KeyC
A      2001       ...                                   4   7    NA
A      2002       ...                                   2   0.5   1
A      2003       ...                                   4   0.2   0
A      2005       ...                                   3   0.3   NA
B      2004       ...                                   0   0.4   NA
B      2006       ...                                   1   7     NA
B      2007       ...                                   2   0.6   1
C      2002       ....                                  4     4    NA
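What I want to do now: if a row's year is exactly one year after the year of the row above it, shift the observations in the V2 and C2 columns down by one row, and only in those columns.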

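In this example, the values from row 1 move to row 2 and overwrite the values there, and in the same way row 3 receives the original values of row 2. Row 4 keeps its V2 and C2 values, because there is no 2004 for A. For B, row 7 gets the values from row 6. The original values of row 7 are dropped, because a new letter begins in the Name column on the next row. This should be done for every letter.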

Name   Year  [Columns which rows should not be moved]  V2  C2    KeyC
A      2001       ...                                   4    7     NA
A      2002       ...                                   4    7      1
A      2003       ...                                   2   0.5     0
A      2005       ...                                   3   0.3    NA
B      2004       ...                                   0   0.4    NA
B      2006       ...                                   1     7    NA
B      2007       ...                                   1     7     1
C      2002       ....                                  4     4    NA
Is there a way to do this? :)

Thanks :)

You can use the shift function from the data.table package:

dt <- read.table(text = "name   year  x  V2  C2   KeyC
A      2001       ...                                   4   7    NA
A      2002       ...                                   2   0.5   1
A      2003       ...                                   4   0.2   0
A      2005       ...                                   3   0.3   NA
B      2004       ...                                   0   0.4   NA
B      2006       ...                                   1   7     NA
B      2007       ...                                   2   0.6   1
C      2002       ....                                  4     4    NA",
header = TRUE)

library(data.table)
dt <- data.table(dt)
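# Previous row's values within each name, so nothing leaks across groups
dt[, `:=` (previous.year = shift(year),
           previous.V2 = shift(V2),
           previous.C2 = shift(C2)),
   by = name]
# TRUE when the previous row of the same name is exactly one year earlier
dt[, has.previous.year := year - 1 == previous.year]
# NA (first row of each name) counts as FALSE when subsetting in i
dt[has.previous.year == TRUE, 
   `:=` (V2 = previous.V2, 
         C2 = previous.C2)]
# Drop the helper columns
dt <- dt[, .(name, year, x, V2, C2, KeyC)]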
dt
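
For comparison, the same rule can be sketched in base R without any packages. This is only a minimal sketch: the data frame below is typed in by hand to mirror the question's table (the elided columns are left out), and `follows_prev` is a helper name made up for this illustration.

```r
# Hand-typed copy of the question's table; the elided "..." columns are omitted.
df <- data.frame(
  Name = c("A", "A", "A", "A", "B", "B", "B", "C"),
  Year = c(2001, 2002, 2003, 2005, 2004, 2006, 2007, 2002),
  V2   = c(4, 2, 4, 3, 0, 1, 2, 4),
  C2   = c(7, 0.5, 0.2, 0.3, 0.4, 7, 0.6, 4)
)

n <- nrow(df)
# TRUE where the row above has the same Name and is exactly one year earlier.
# follows_prev is a hypothetical helper name, not part of any package.
follows_prev <- c(FALSE,
                  df$Name[-1] == df$Name[-n] & df$Year[-1] - df$Year[-n] == 1)

for (col in c("V2", "C2")) {
  # Lagged copy taken from the still-unmodified column,
  # so consecutive shifts do not cascade.
  prev <- c(NA, df[[col]][-n])
  df[[col]] <- ifelse(follows_prev, prev, df[[col]])
}
df
```

Comparing `Name` with the row above plays the same role as grouping: a group's first row is never overwritten, even if the year happens to follow on from the previous group's last year.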

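We can build a helper key that signals when to shift:

#library(data.table)
dt <- data.table(dt)
# KEY = year difference to the previous row within each name (0 for a group's first row)
dt[, KEY := c(0L, diff(year)), by = name]

# Where KEY == 1, overwrite V2 and C2 with the values from the row above
dt[dt$KEY == 1, c('V2', 'C2')] = data.table(apply(dt[, c('V2', 'C2')], 2, shift)[dt$KEY == 1, ])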
dt
name year    x V2  C2 KeyC KEY
1:    A 2001  ...  4 7.0   NA   0
2:    A 2002  ...  4 7.0    1   1
3:    A 2003  ...  2 0.5    0   1
4:    A 2005  ...  3 0.3   NA   2
5:    B 2004  ...  0 0.4   NA   0
6:    B 2006  ...  1 7.0   NA   2
7:    B 2007  ...  1 7.0    1   1
8:    C 2002 ....  4 4.0   NA   0 
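
Computing the key per name matters whenever one group's first year directly follows the previous group's last year. The sketch below uses made-up data (not the question's table) and base R's `ave()` to contrast an ungrouped key with a grouped one.

```r
# Hypothetical data: group B starts exactly one year after group A ends.
name <- c("A", "A", "B", "B")
year <- c(2002, 2003, 2004, 2005)

# Ungrouped difference: row 3 (the first B row) is wrongly flagged with 1.
ungrouped_key <- c(0, diff(year))

# Grouped difference: the first row of every name gets 0.
grouped_key <- ave(year, name, FUN = function(y) c(0, diff(y)))

data.frame(name, year, ungrouped_key, grouped_key)
```

With the ungrouped key, the first B row would inherit values from the last A row, which is what the question asks to avoid.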

Assuming the KeyC column accurately encodes every case that should be copied:

# make helper columns that are offset by 1 row
df$V2_help <- c(NA, df$V2[-nrow(df)])
df$C2_help <- c(NA, df$C2[-nrow(df)])

# use ifelse to replace the data where KeyC is not NA
df$V2 <- ifelse(!is.na(df$KeyC), df$V2_help, df$V2)
df$C2 <- ifelse(!is.na(df$KeyC), df$C2_help, df$C2)

# remove helper columns
df <- df[, setdiff(colnames(df), c("V2_help", "C2_help"))]

 Name Year V2  C2 KeyC
1    A 2001  4 7.0   NA
2    A 2002  4 7.0    1
3    A 2003  2 0.5    0
4    A 2005  3 0.3   NA
5    B 2004  0 0.4   NA
6    B 2006  1 7.0   NA
7    B 2007  1 7.0    1
8    C 2002  4 4.0   NA
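
Since this approach depends on KeyC marking exactly the rows to overwrite, it may be worth checking that assumption first. A minimal sketch, using a hand-typed copy of the question's Name, Year and KeyC columns; `year_gap`, `keyc_flag`, `rule_flag` and `mismatch` are names made up for this illustration.

```r
# Hand-typed copy of the relevant columns from the question.
df <- data.frame(
  Name = c("A", "A", "A", "A", "B", "B", "B", "C"),
  Year = c(2001, 2002, 2003, 2005, 2004, 2006, 2007, 2002),
  KeyC = c(NA, 1, 0, NA, NA, NA, 1, NA)
)

# Year difference to the previous row within each Name (0 for a group's first row).
year_gap <- ave(df$Year, df$Name, FUN = function(y) c(0, diff(y)))

keyc_flag <- !is.na(df$KeyC)   # rows the KeyC approach would overwrite
rule_flag <- year_gap == 1     # rows the one-year rule would overwrite

# Rows where the two criteria disagree; an empty result means KeyC is safe to use.
mismatch <- df[keyc_flag != rule_flag, ]
nrow(mismatch)
```

For the example data the two criteria agree on every row. On other data, any rows listed in `mismatch` would need the year-based rule instead.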

Using tidyverse:

library(tidyverse)
dt %>%
  group_by(name) %>%
  # across() is the current replacement for mutate_at() with the deprecated funs()
  mutate(across(c(C2, V2), ~ ifelse(c(0, diff(year)) == 1, lag(.x), .x)))
# A tibble: 8 x 6
# Groups:   name [3]
  name   year x        V2    C2  KeyC
  <fct> <int> <fct> <int> <dbl> <int>
1 A      2001 ...       4 7.00     NA
2 A      2002 ...       4 7.00      1
3 A      2003 ...       2 0.500     0
4 A      2005 ...       3 0.300    NA
5 B      2004 ...       0 0.400    NA
6 B      2006 ...       1 7.00     NA
7 B      2007 ...       1 7.00      1
8 C      2002 ....      4 4.00     NA

With data.table:

library(data.table)
setDT(dt)[, c("C2", "V2") := lapply(.SD, function(x) ifelse(c(0, diff(year)) == 1, shift(x), x)),
          by = name, .SDcols = c("C2", "V2")]
dt
   name year    x V2  C2 KeyC
1:    A 2001  ...  4 7.0   NA
2:    A 2002  ...  4 7.0    1
3:    A 2003  ...  2 0.5    0
4:    A 2005  ...  3 0.3   NA
5:    B 2004  ...  0 0.4   NA
6:    B 2006  ...  1 7.0   NA
7:    B 2007  ...  1 7.0    1
8:    C 2002 ....  4 4.0   NA