R: How to create an ID number for a variable within each subject (r, dplyr, tidyverse)

My data looks like this:

Subject Time Duration 
1       0.1s   0.006
1       0.5s   0
1       1s     0.733
..
2       0.3s   0
2       0.5s   0.553
2       2s     0.344
..

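I want to rank the Duration variable by the Time variable within each subject, excluding 0 values.

I want to create an additional sequential ID variable that is ordered by Time, so that my data looks like this: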

Subject Time Duration ID
1       0.1s   0.006  1
1       0.5s   0      NA
1       1s     0.733  2
..
2       0.3s   0      NA
2       0.5s   0.553  1
2       2s     0.344  2
..

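I tried dplyr::mutate(dataframe1, ID = row_number()), but I don't know how to apply the mutate per subject or how to exclude 0 or NA values in the Duration variable. Can anyone help me with this? Thanks.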

We can use rowid from data.table
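A minimal sketch of that idea, assuming rowid() is applied to the Subject column; the exact call is an assumption, not something shown in the post. The df2 sample from the end of the post is rebuilt inline so the snippet runs on its own. Note that this numbers every row within a subject, including rows where Duration is 0, so it covers the original question rather than the updated one.

```r
library(data.table)

# Sample data, same values as df2 at the end of the post
df2 <- data.frame(
  Subject  = c(1L, 1L, 1L, 2L, 2L, 2L),
  Time     = c("0.1s", "0.5s", "1s", "0.3s", "0.5s", "2s"),
  Duration = c(0.006, 0, 0.733, 0, 0.553, 0.344)
)

# rowid() returns a running counter that restarts for each distinct Subject value
df2$ID <- rowid(df2$Subject)
df2
```

The counter follows the existing row order, so the rows need to be sorted by Time within each subject beforehand if they are not already.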

Or using base R
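One possible base R version, sketched here with ave(); the choice of ave() is an assumption, since the post does not show which base function was meant. It produces the same per-subject counter without any packages.

```r
# Sample data, same values as df2 at the end of the post
df2 <- data.frame(
  Subject  = c(1L, 1L, 1L, 2L, 2L, 2L),
  Time     = c("0.1s", "0.5s", "1s", "0.3s", "0.5s", "2s"),
  Duration = c(0.006, 0, 0.733, 0, 0.553, 0.344)
)

# ave() applies seq_along() separately to the row positions of each Subject
df2$ID <- ave(seq_along(df2$Subject), df2$Subject, FUN = seq_along)
df2
```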

Update: for the revised question

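library(dplyr)

# Within each subject, rank the rows with Duration > 0 by numeric Time; all other rows stay NA
df2 %>%
    group_by(Subject) %>%
    mutate(ID = replace(rep(NA_integer_, n()), Duration > 0, 
         row_number(readr::parse_number(Time[Duration > 0]))))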
# A tibble: 6 x 4
# Groups:   Subject [2]
#  Subject Time  Duration    ID
#    <int> <chr>    <dbl> <int>
#1       1 0.1s     0.006     1
#2       1 0.5s     0        NA
#3       1 1s       0.733     2
#4       2 0.3s     0        NA
#5       2 0.5s     0.553     1
#6       2 2s       0.344     2
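The same result can be cross-checked without dplyr or readr. The following is a sketch under two assumptions that the post does not state: every Time value ends in a literal "s", and ties in Time within a subject should be broken by row order. The helper names num_time and keep are illustrative only.

```r
# Sample data, same values as df2 at the end of the post
df2 <- data.frame(
  Subject  = c(1L, 1L, 1L, 2L, 2L, 2L),
  Time     = c("0.1s", "0.5s", "1s", "0.3s", "0.5s", "2s"),
  Duration = c(0.006, 0, 0.733, 0, 0.553, 0.344)
)

# Strip the trailing "s" so Time can be compared numerically
num_time <- as.numeric(sub("s$", "", df2$Time))

# Only rows with a positive Duration receive an ID
keep <- df2$Duration > 0

# Start with NA everywhere, then fill in the per-subject rank of Time
df2$ID <- NA_integer_
df2$ID[keep] <- as.integer(ave(
  num_time[keep], df2$Subject[keep],
  FUN = function(t) rank(t, ties.method = "first")
))
df2
```

Ranking on the numeric time rather than on the raw strings matters once values such as "10s" appear, because those sort before "2s" as text.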

For the tidyverse answer, don't forget to arrange by Time before assigning the ID.

Thanks, but neither of them works for me. I want to rank the Duration variable by the Time variable within each subject.

@TkShmk Based on the example you showed, I get the expected output, though.

@TkShmk Note that your Duration and Time columns are not numeric, i.e. they are not treated as numbers.

I have corrected my question. Can I specify that only Duration values greater than 0 should be ranked?

Data

df1 <- structure(list(Subject = c(1L, 1L, 1L, 2L, 2L, 2L), Time = c("0.1s", 
"0.5s", "1s", "0.3s", "0.5s", "2s"), Duration = c("0,006", "0,663", 
"0,733", "0,002", "0,553", "0,344")), class = "data.frame", row.names = c(NA, 
-6L))
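As one of the comments points out, df1 stores Duration as text with a decimal comma and Time as text with an "s" suffix, so neither column can be compared or ranked as a number. A minimal sketch of one way to convert them, assuming the comma is always a decimal mark and the suffix is always "s"; the Time_num column name is made up for illustration.

```r
# Same values as df1 above
df1 <- data.frame(
  Subject  = c(1L, 1L, 1L, 2L, 2L, 2L),
  Time     = c("0.1s", "0.5s", "1s", "0.3s", "0.5s", "2s"),
  Duration = c("0,006", "0,663", "0,733", "0,002", "0,553", "0,344"),
  stringsAsFactors = FALSE
)

# Replace the decimal comma with a dot, then convert to numeric
df1$Duration <- as.numeric(sub(",", ".", df1$Duration, fixed = TRUE))

# Drop the trailing "s" and keep the numeric time in a separate column
df1$Time_num <- as.numeric(sub("s$", "", df1$Time))

str(df1)
```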




# Data for the revised question: Duration is numeric and contains zeros
df2 <- structure(list(Subject = c(1L, 1L, 1L, 2L, 2L, 2L), Time = c("0.1s", 
"0.5s", "1s", "0.3s", "0.5s", "2s"), Duration = c(0.006, 0, 0.733, 
0, 0.553, 0.344)), class = "data.frame", row.names = c(NA, -6L
))