R: Insert a blank cell into every "X"th row in the same column
In an Excel file, I have a table with the following headers:
**Date** **Session** **Player** **Pre** **Post** **Distance(m)**
Jan 1 1 Player 1 3 6 1000
Jan 1 1 Player 2 3 7 1500
Jan 1 1 Player 3 4 10 4000
Jan 1 1 Player 4 1 3 600
Jan 2 2 Player 1 2 5 1000
Jan 2 2 Player 2 - - 1750
Jan 2 2 Player 3 5 5 3000
Jan 2 2 Player 4 3 6 1000
Jan 3 3 Player 1 3 5 2500
Jan 3 3 Player 2 3 8 1500
Jan 3 3 Player 3 7 7 2500
Jan 3 3 Player 4 - - -
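What I am trying to do is take each Distance value and compare it with the Pre value of the next session. So for Player 1 in Session 1, their Distance (1000) and their Pre from Jan 2 (2) should be in the same row.
To do this, after sorting the players by session number, I am trying to find a way to insert an empty cell in the Distance column for each player as a placeholder for Session 0. This would shift the distances down so that they line up with the following day's Pre.
After performing that operation on this dataset, the result would look like this: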
**Player** **Pre for the following Day** **Distance**
Player 1 3 (S1) - (Session 0 - Does Not Exist) (This value is inserted)
Player 1 2 (S2) 1000(Session 1)
Player 1 3 (S3) 1000(Session 2)
Player 1 - (S4 - Not included in this example) 2500(Session 3)
Player 2 3 (S1) - (S0)
Player 2 - (S2) 1500(S1)
Player 2 3 (S3) 1750(S2)
Player 2 - (S4) 1500(S3)
Player 3 4 (S1) - (S0)
Player 3 5 (S2) 4000(S1)
Player 3 7 (S3) 3000(S2)
Player 3 - (S4) 2500(S3)
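Player 4 was left out for time/redundancy.
In this example, Session 3 is the last session, so by default the S4 Pre for every player would also be inserted as -.
So a - needs to be inserted every 4 rows to line up each distance with the correct player, and after the last session a new row is created for each player with - for Pre and Post and the correct Distance.
In trying to do this, I have the following code and a dataset from dput().
My code is: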
test1 <- data.frame("2020-01-01",1,"Player 1",3,6, "-")
test2 <- data.frame("2020-01-01",4,"Player 1","-","-","2500")
names(test1) <- c("Date", "Session", "Player", "Pre", "Post", "Distance")
names(test2) <- c("Date", "Session", "Player", "Pre", "Post", "Distance")
new <- rbind(test1, stackEX) #This puts the new row at the top where I want it
#Not sure why this removes dates for other rows though
new <- rbind(new, test2)#This is for Session 4 which does not exist in this example
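One possible reason for the lost dates, offered only as a guess, is a type mismatch: in the dput() output the Date column is POSIXct, while test1 and test2 store the date as a plain string. The sketch below uses a hypothetical two-row stand-in for stackEX (stackEX_small is an invented name) and builds the placeholder row with the same column types before calling rbind().

```r
# Hypothetical miniature of the asker's data; stackEX_small is an assumed name.
stackEX_small <- data.frame(
  Date     = as.POSIXct(c("2020-01-02", "2020-01-03"), tz = "UTC"),
  Session  = c(2, 3),
  Player   = "Player 1",
  Pre      = c("2", "3"),
  Post     = c("5", "5"),
  Distance = c("1000", "1000"),
  stringsAsFactors = FALSE
)

# Placeholder row for the top of the table, with Date stored as POSIXct
# (not as a character string) so that it matches the existing column type.
placeholder <- data.frame(
  Date     = as.POSIXct("2020-01-01", tz = "UTC"),
  Session  = 1,
  Player   = "Player 1",
  Pre      = "3",
  Post     = "6",
  Distance = "-",
  stringsAsFactors = FALSE
)

combined <- rbind(placeholder, stackEX_small)
str(combined)
```

With matching types, the Date column of the combined table stays a date-time column for every row.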
This can be solved by joining with a complete set of Player/Session combinations and shifting Distance:
library(data.table)
setDT(DF)[CJ(Player, Session = 1:4, unique = TRUE), on = .(Player, Session)][
, Distance := shift(Distance)][]
Date Session Player Pre Post Distance
1: 2020-01-01 1 Player 1 3 6 <NA>
2: 2020-01-02 2 Player 1 2 5 1000
3: 2020-01-03 3 Player 1 3 5 1000
4: <NA> 4 Player 1 <NA> <NA> 2500
5: 2020-01-01 1 Player 2 3 7 <NA>
6: 2020-01-02 2 Player 2 - - 1500
7: 2020-01-03 3 Player 2 3 8 1750
8: <NA> 4 Player 2 <NA> <NA> 1500
9: 2020-01-01 1 Player 3 4 10 <NA>
10: 2020-01-02 2 Player 3 5 5 4000
11: 2020-01-03 3 Player 3 7 7 3000
12: <NA> 4 Player 3 <NA> <NA> 2500
13: 2020-01-01 1 Player 4 1 3 <NA>
14: 2020-01-02 2 Player 4 3 6 600
15: 2020-01-03 3 Player 4 - - 1000
16: <NA> 4 Player 4 <NA> <NA> -
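CJ(Player, Session = 1:4, unique = TRUE) returns all Player/Session combinations (16 rows: 4 players by 4 sessions).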
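To illustrate what unique = TRUE does in the cross join, here is a minimal sketch on invented vectors (players and sessions are hypothetical names, not part of the original data). Each input vector is de-duplicated on its own before the combinations are formed, and no other column such as Date is involved.

```r
library(data.table)

# Hypothetical inputs with repeated values, as they occur in a long-format table.
players  <- c("Player 1", "Player 2", "Player 1", "Player 2")
sessions <- c(1, 1, 2, 2)

# unique = TRUE: each vector is reduced to its distinct values first,
# so the result has 2 players x 2 sessions = 4 rows.
grid <- CJ(Player = players, Session = sessions, unique = TRUE)
print(grid)

# Without unique = TRUE, every element is combined with every element (4 x 4 rows).
grid_all <- CJ(Player = players, Session = sessions)
print(nrow(grid_all))
```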
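The default arguments of shift() are sufficient here: shift(Distance) lags Distance by one and uses NA for filling, i.e., the values in the Distance column are moved down to the next row. Thus, row 4 of Player 1 (Session 4) gets the Distance value of the previous row (Session 3), as requested. The empty top row becomes NA. See also help("shift", "data.table").
Note that we do not need to group here because the whole column is lagged.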
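As a small illustration of the lag, the sketch below applies shift() to Player 1's Distance values followed by the NA that the join produces for Session 4. It also shows a grouped variant on an invented table (DT and Lagged are hypothetical names), which might be preferable if the blocks did not each end in an NA row.

```r
library(data.table)

# Player 1: Sessions 1 to 3, then the NA created by the join for Session 4.
distance <- c("1000", "1000", "2500", NA)

# Defaults: lag by one position and fill the vacated first slot with NA.
shifted <- shift(distance)
print(shifted)

# Grouped variant on invented data: the lag restarts for every player,
# so no value can leak from one player's block into the next.
DT <- data.table(
  Player   = rep(c("A", "B"), each = 3),
  Distance = c(10, 20, 30, 40, 50, 60)
)
DT[, Lagged := shift(Distance), by = Player]
print(DT)
```

In the joined table the ungrouped shift is safe because every player's last row (Session 4) has an NA Distance, and that NA is what moves into the next player's Session 1 row.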
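Data: DF (the dput() output is listed at the end)

After updating your code with mine, I do have two questions for you. First, in this part, CJ(Player, Session = 1:84, unique = TRUE), what does unique refer to? Is it unique sessions? Nobody should have duplicate sessions, but in my data there are sometimes multiple sessions on the same day, and since Date is not called anywhere in the code, I am confused by this. Second question: Distance := shift(Distance). How does this know to shift everything down by 1? Distance does not equal 1, so could you explain that? Never mind, I realized you already explained what the cross join expression does, so it is the unique combinations of players and sessions... but I still do not understand the shifting part for Distance. Thanks.

@samrizz4, I have added more explanation about the shift() function.

This makes everything much clearer to me. Thanks again for your help.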
CJ(Player, Session = 1:4, unique = TRUE)
Player Session
1: Player 1 1
2: Player 1 2
3: Player 1 3
4: Player 1 4
5: Player 2 1
6: Player 2 2
7: Player 2 3
8: Player 2 4
9: Player 3 1
10: Player 3 2
11: Player 3 3
12: Player 3 4
13: Player 4 1
14: Player 4 2
15: Player 4 3
16: Player 4 4
DF <- structure(list(Date = structure(c(1577836800, 1577836800, 1577836800,
1577836800, 1577923200, 1577923200, 1577923200, 1577923200, 1578009600,
1578009600, 1578009600, 1578009600), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Session = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3,
3, 3), Player = c("Player 1", "Player 2", "Player 3", "Player 4",
"Player 1", "Player 2", "Player 3", "Player 4", "Player 1", "Player 2",
"Player 3", "Player 4"), Pre = c("3", "3", "4", "1", "2", "-",
"5", "3", "3", "3", "7", "-"), Post = c("6", "7", "10", "3",
"5", "-", "5", "6", "5", "8", "7", "-"), Distance = c("1000",
"1500", "4000", "600", "1000", "1750", "3000", "1000", "2500",
"1500", "2500", "-")), row.names = c(NA, 12L), class = "data.frame")