R: How to count visits in a data.table
I need to create a new column in a data.table in RStudio that classifies my data by "visit". Here is an example data table:
library(data.table)
reprex_1 = data.table(
`Receiver Number`=c("Receiver A", "Receiver B", "Receiver B","Receiver B","Receiver B", "Receiver B", "Receiver B","Receiver C", "Receiver C", "Receiver C"),
Transmitter = c("Tag 1", "Tag 2" , "Tag 3" , "Tag 3", "Tag 3" , "Tag 3" , "Tag 3","Tag 4" ,"Tag 4", "Tag 4"),
`Station Name` = c("Station A","Station B","Station B","Station B","Station B", "Station B","Station B","Station C","Station C","Station C"),
TimeDiff = c( NA,NA,NA,221536,1114, 425,10728,110131,61,43)
)
Receiver Number  Transmitter  Station Name  TimeDiff
Receiver A       Tag 1        Station A     NA
Receiver B       Tag 2        Station B     NA
Receiver B       Tag 3        Station B     NA
Receiver B       Tag 3        Station B     221536
Receiver B       Tag 3        Station B     1114
Receiver B       Tag 3        Station B     425
Receiver B       Tag 3        Station B     10728
Receiver C       Tag 4        Station C     110131
Receiver C       Tag 4        Station C     61
Receiver C       Tag 4        Station C     43
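I need to create a new Visit column in which each visit is grouped by Receiver Number, Transmitter, and Station Name, and in which a TimeDiff > 1800 or an NA also starts a new visit. I would like the visits numbered consecutively (1, 2, 3, ...).
Here is what I want: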
Receiver Number  Transmitter  Station Name  TimeDiff  Visit
Receiver A       Tag 1        Station A     NA        1
Receiver B       Tag 2        Station B     NA        2
Receiver B       Tag 3        Station B     NA        3
Receiver B       Tag 3        Station B     221536    4
Receiver B       Tag 3        Station B     1114      4
Receiver B       Tag 3        Station B     425       4
Receiver B       Tag 3        Station B     10728     5
Receiver C       Tag 4        Station C     110131    6
Receiver C       Tag 4        Station C     61        6
Receiver C       Tag 4        Station C     43        6
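I have looked at other examples that classify rows based on grouped data, and I can get R to create visits from the unique combinations of the first three columns (Receiver Number, Transmitter, and Station Name), but I cannot figure out how to include the TimeDiff > 1800 condition so that it also starts a new visit.
This is as far as I got, but it does not start a new visit when TimeDiff > 1800: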
require(data.table)
setDT(reprex_1)[,AttemptVisit:=.GRP, by = c("Receiver Number","Station Name", "Transmitter")]
Receiver Number  Transmitter  Station Name  TimeDiff  AttemptVisit
Receiver A       Tag 1        Station A     NA        1
Receiver B       Tag 2        Station B     NA        2
Receiver B       Tag 3        Station B     NA        3
Receiver B       Tag 3        Station B     221536    3
Receiver B       Tag 3        Station B     1114      3
Receiver B       Tag 3        Station B     425       3
Receiver B       Tag 3        Station B     10728     3
Receiver C       Tag 4        Station C     110131    4
Receiver C       Tag 4        Station C     61        4
Receiver C       Tag 4        Station C     43        4
I would appreciate any help you can provide.

I think this should work. We use the cumsum of the NA or > 1800 values as part of the grouper:
reprex_1[, visit := .GRP,
by = .(`Receiver Number`, Transmitter, `Station Name`, cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]
# reprex_1
# Receiver Number Transmitter Station Name TimeDiff visit
# 1: Receiver A Tag 1 Station A NA 1
# 2: Receiver B Tag 2 Station B NA 2
# 3: Receiver B Tag 3 Station B NA 3
# 4: Receiver B Tag 3 Station B 221536 4
# 5: Receiver B Tag 3 Station B 1114 4
# 6: Receiver B Tag 3 Station B 425 4
# 7: Receiver B Tag 3 Station B 10728 5
# 8: Receiver C Tag 4 Station C 110131 6
# 9: Receiver C Tag 4 Station C 61 6
# 10: Receiver C Tag 4 Station C 43 6
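
To see why the cumsum term works as a grouper, it may help to look at the intermediate values on their own. The sketch below uses the TimeDiff values from the Receiver B / Tag 3 rows of the question. The names time_diff, new_visit, and visit_counter are illustrative only and do not appear in the original code.

```r
# TimeDiff values copied from the Receiver B / Tag 3 rows of the question
time_diff <- c(NA, 221536, 1114, 425, 10728)

# TRUE wherever a new visit starts: a gap above 1800 or a missing gap.
# NA > 1800 is NA, but NA | TRUE evaluates to TRUE, so NA rows are flagged.
new_visit <- time_diff > 1800 | is.na(time_diff)

# Running count of visit starts; rows that share a value are in the same visit
visit_counter <- cumsum(new_visit)

print(data.frame(time_diff, new_visit, visit_counter))
```

Every flagged row increases the counter by one, so rows with small gaps inherit the counter value of the row that opened their visit. Adding that counter to the by columns is what splits one receiver/transmitter/station combination into several visits.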
Another option is
reprex_1[, Visit := rleid(
`Receiver Number`, Transmitter,
`Station Name`, cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]
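
As a quick sanity check, the two approaches can be run side by side on the question's data. This is a minimal sketch: the table name dt and the column names visit_grp and visit_rle are made up for the comparison, and the rows are assumed to be sorted as in the question, so that detections for each receiver/transmitter/station combination are adjacent.

```r
library(data.table)

# Same data as reprex_1 in the question, rebuilt so the block is self-contained
dt <- data.table(
  `Receiver Number` = c("Receiver A", rep("Receiver B", 6), rep("Receiver C", 3)),
  Transmitter       = c("Tag 1", "Tag 2", rep("Tag 3", 5), rep("Tag 4", 3)),
  `Station Name`    = c("Station A", rep("Station B", 6), rep("Station C", 3)),
  TimeDiff          = c(NA, NA, NA, 221536, 1114, 425, 10728, 110131, 61, 43)
)

# Approach 1: group number over the three keys plus the running break counter
dt[, visit_grp := .GRP,
   by = .(`Receiver Number`, Transmitter, `Station Name`,
          cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]

# Approach 2: run-length id over the same four vectors
dt[, visit_rle := rleid(`Receiver Number`, Transmitter, `Station Name`,
                        cumsum(TimeDiff > 1800 | is.na(TimeDiff)))]

print(dt[, .(TimeDiff, visit_grp, visit_rle)])
```

On this data both columns come out as 1, 2, 3, 4, 4, 4, 5, 6, 6, 6. Note that rleid numbers consecutive runs while .GRP numbers distinct combinations, so the two could differ on unsorted data where the same combination recurs non-adjacently under the same counter value.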