R: insert a blank cell into every "X"th row of the same column


In an Excel file, I have a table like the one below, with the following headers:

**Date** **Session** **Player** **Pre** **Post** **Distance(m)**
Jan 1         1        Player 1    3        6          1000
Jan 1         1        Player 2    3        7          1500
Jan 1         1        Player 3    4        10         4000
Jan 1         1        Player 4    1        3          600
Jan 2         2        Player 1    2        5          1000
Jan 2         2        Player 2    -        -          1750
Jan 2         2        Player 3    5        5          3000
Jan 2         2        Player 4    3        6          1000
Jan 3         3        Player 1    3        5          2500   
Jan 3         3        Player 2    3        8          1500
Jan 3         3        Player 3    7        7          2500
Jan 3         3        Player 4    -        -            -
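What I am trying to accomplish is to look at the Distance number and compare it with the Pre number of the next session. So for Player 1 in Session 1, their Distance (1000) and the Pre value from Jan 2 (2) should be in the same row.

To do this, after sorting the players by session number, I am trying to find a way to insert an empty cell into the Distance column for each player, as a placeholder for Session 0. This would essentially shift the distances down so that each one lines up with the Pre value of the following day.

So, after performing that operation on this data set, the result would look like this: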

**Player**  **Pre for the following Day**             **Distance**
Player 1      3 (S1)                                    -  (Session 0 - Does Not Exist) (This value is inserted)               
Player 1      2 (S2)                                   1000(Session 1)                              
Player 1      3 (S3)                                   1000(Session 2)                                
Player 1      - (S4 - Not included in this example)    2500(Session 3)                       
Player 2      3 (S1)                                   -   (S0)                                      
Player 2      - (S2)                                   1500(S1)                                      
Player 2      3 (S3)                                   1750(S2)                                       
Player 2      - (S4)                                   1500(S3)                                       
Player 3      4 (S1)                                   -   (S0)                                       
Player 3      5 (S2)                                   4000(S1)                                       
Player 3      7 (S3)                                   3000(S2)                                       
Player 3      - (S4)                                   2500(S3)                                       
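Player 4 is left out for time/redundancy.

In this example, Session 3 is the last session, so by default the S4 Pre for every player is also inserted as `-`.

So a `-` needs to be inserted every 4 rows so that each Distance matches the correct player, and after the last session a new row has to be created for each player, with `-` for Pre and Post and the correct Distance.

While trying to do this, I have the following code and data set (from `dput()`).

My code is: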

test1 <- data.frame("2020-01-01",1,"Player 1",3,6, "-")
test2 <- data.frame("2020-01-01",4,"Player 1","-","-","2500")
names(test1) <- c("Date", "Session", "Player", "Pre", "Post", "Distance")
names(test2) <- c("Date", "Session", "Player", "Pre", "Post", "Distance")
new <- rbind(test1, stackEX) #This puts the new row at the top where I want it
                             #Not sure why this removes dates for other rows though
new <- rbind(new, test2)#This is for Session 4 which does not exist in this example


This can be solved by joining with a complete set of `Player`/`Session` combinations and shifting `Distance`:

library(data.table)
setDT(DF)[CJ(Player, Session = 1:4, unique = TRUE), on = .(Player, Session)][
  , Distance := shift(Distance)][]

          Date Session   Player  Pre Post Distance
 1: 2020-01-01       1 Player 1    3    6     <NA>
 2: 2020-01-02       2 Player 1    2    5     1000
 3: 2020-01-03       3 Player 1    3    5     1000
 4:       <NA>       4 Player 1 <NA> <NA>     2500
 5: 2020-01-01       1 Player 2    3    7     <NA>
 6: 2020-01-02       2 Player 2    -    -     1500
 7: 2020-01-03       3 Player 2    3    8     1750
 8:       <NA>       4 Player 2 <NA> <NA>     1500
 9: 2020-01-01       1 Player 3    4   10     <NA>
10: 2020-01-02       2 Player 3    5    5     4000
11: 2020-01-03       3 Player 3    7    7     3000
12:       <NA>       4 Player 3 <NA> <NA>     2500
13: 2020-01-01       1 Player 4    1    3     <NA>
14: 2020-01-02       2 Player 4    3    6      600
15: 2020-01-03       3 Player 4    -    -     1000
16:       <NA>       4 Player 4 <NA> <NA>        -
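
The cross join `CJ(Player, Session = 1:4, unique = TRUE)` returns all `Player`/`Session` combinations (16 rows here: 4 players times 4 sessions).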
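
To illustrate what `unique = TRUE` does in the cross join, here is a minimal sketch. The `players` vector is hypothetical toy data, not the asker's `DF`; it only mimics a long-format column in which each name occurs several times.

```r
library(data.table)

# Hypothetical toy vector: player names repeat, as in a long-format column.
players <- c("Player 1", "Player 2", "Player 1", "Player 2")

# Without unique = TRUE every element is used, duplicates included: 4 x 2 = 8 rows.
all_rows <- CJ(Player = players, Session = 1:2)

# With unique = TRUE each input vector is de-duplicated first: 2 x 2 = 4 rows.
grid <- CJ(Player = players, Session = 1:2, unique = TRUE)

nrow(all_rows)
nrow(grid)
```

So `unique` applies to each input vector separately, before the combinations are built; it does not look at `Date` or at any other column.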

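The default parameters of `shift()` are sufficient here: `shift(Distance)` lags `Distance` by one and uses `NA` for filling, i.e., the values in the `Distance` column are moved down to the next row. Thus, row 4 of Player 1 (Session 4) gets the `Distance` value of the previous row (Session 3), as requested. The empty row at the top becomes `NA`. See also `help("shift", "data.table")`.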
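
As a small sketch of what these defaults mean, the calls below spell out `n`, `fill`, and `type` explicitly. The vector `x` is a hypothetical stand-in for one player's distances, not a column of `DF`.

```r
library(data.table)

# Hypothetical stand-in for one player's Distance values.
x <- c(1000, 1000, 2500)

# Default call: lag by one position, pad the top with NA.
lag1 <- shift(x)

# The same call with the defaults written out.
lag1_explicit <- shift(x, n = 1L, fill = NA, type = "lag")

# Lag by two positions instead of one.
lag2 <- shift(x, n = 2L)

# Move values up instead of down (the opposite direction).
lead1 <- shift(x, type = "lead")

lag1
lag2
lead1
```

The amount of shifting comes from the `n` argument (one position by default), not from the values stored in `Distance`.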

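Note that we do not need to group by player here because the whole column is lagged: the Session 4 row that the join adds for each player has `NA` in `Distance`, so the value shifted into the next player's Session 1 row is `NA` as well.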
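
The ungrouped shift relies on each player's block ending in a row with a missing `Distance`. As a hedged sketch with hypothetical toy data (not the asker's `DF`), the following shows what happens when no such row exists and how `by = Player` guards against it. The grouped variant is an assumption about what a more defensive version could look like; it is not part of the answer above.

```r
library(data.table)

# Hypothetical toy data: both players have a real value in their last row,
# so there is no NA "buffer" between the two blocks.
toy <- data.table(
  Player   = rep(c("A", "B"), each = 3),
  Session  = rep(1:3, times = 2),
  Distance = c(10, 20, 30, 40, 50, 60)
)

# Ungrouped lag: B's first row receives A's last value (30).
toy[, lag_all := shift(Distance)]

# Grouped lag: the shift restarts for every player, so B's first row is NA.
toy[, lag_by_player := shift(Distance), by = Player]

toy[]
```

With the Session 4 rows from the cross join in place, both variants give the same `Distance` column for the data in the question.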

Data (the `dput()` output that defines `DF` is listed at the end of this post)

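After updating my code with yours, I do have two questions for you. First, in the part `CJ(Player, Session = 1:84, unique = TRUE)`, what does `unique` refer to? Is it unique sessions? Nobody should have duplicate sessions, but in my data there are sometimes multiple sessions on the same day, and since `Date` is not used anywhere in the code I am confused about this. Second question: how does `Distance := shift(Distance)` know to shift everything down by 1? Distance does not equal 1, so could you explain?

Never mind, I realized you already explained what the cross join expression does, so it is the unique combinations of players and sessions... but I still do not understand the part that shifts Distance. Thanks.

@samrizz4, I have added more explanation about the `shift()` function.

That made everything much clearer to me. Thanks again for your help.
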
CJ(Player, Session = 1:4, unique = TRUE)
      Player Session
 1: Player 1       1
 2: Player 1       2
 3: Player 1       3
 4: Player 1       4
 5: Player 2       1
 6: Player 2       2
 7: Player 2       3
 8: Player 2       4
 9: Player 3       1
10: Player 3       2
11: Player 3       3
12: Player 3       4
13: Player 4       1
14: Player 4       2
15: Player 4       3
16: Player 4       4
DF <- structure(list(Date = structure(c(1577836800, 1577836800, 1577836800, 
1577836800, 1577923200, 1577923200, 1577923200, 1577923200, 1578009600, 
1578009600, 1578009600, 1578009600), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Session = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 
3, 3), Player = c("Player 1", "Player 2", "Player 3", "Player 4", 
"Player 1", "Player 2", "Player 3", "Player 4", "Player 1", "Player 2", 
"Player 3", "Player 4"), Pre = c("3", "3", "4", "1", "2", "-", 
"5", "3", "3", "3", "7", "-"), Post = c("6", "7", "10", "3", 
"5", "-", "5", "6", "5", "8", "7", "-"), Distance = c("1000", 
"1500", "4000", "600", "1000", "1750", "3000", "1000", "2500", 
"1500", "2500", "-")), row.names = c(NA, 12L), class = "data.frame")