How to count the repetitions of the first occurring value with dplyr

I have a data frame containing groups that basically looks like this:

DF <- data.frame(state = c(rep("A", 3), rep("B",2), rep("A",2)))

DF
  state
1     A
2     A
3     A
4     B
5     B
6     A
7     A
The result in this example would be 5. Suggestions, preferably a dplyr solution, would be much appreciated.

You can try:

rle(as.character(DF$state))$lengths[1]
[1] 3
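
To see why `$lengths[1]` picks out the right number, it may help to look at the whole run-length encoding. The following is only a minimal base R sketch that reuses the DF from the question; the names `runs` and `first_run_length` are chosen here for illustration.

```r
DF <- data.frame(state = c(rep("A", 3), rep("B", 2), rep("A", 2)))

# rle() collapses consecutive equal values into runs
runs <- rle(as.character(DF$state))

runs$values   # "A" "B" "A"
runs$lengths  # 3 2 2

# the first run is the block the data starts with
first_run_length <- runs$lengths[1]
first_run_length  # 3
```

The later run of "A" is a separate entry, so it does not affect the first length.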
In your dplyr chain:

library(dplyr)
DF %>% summarize(count_first = rle(as.character(state))$lengths[1])

#   count_first
# 1           3
Or, overusing pipes, with dplyr and magrittr:

library(dplyr)
library(magrittr)
DF %>% summarize(count_first = state %>%
                   as.character %>%
                   rle %$%
                   lengths %>%
                   first)

#   count_first
# 1           3
It also works on grouped data:

DF <- data.frame(group = c(rep(1,4),rep(2,3)),state = c(rep("A", 3), rep("B",2), rep("A",2)))

#   group state
# 1     1     A
# 2     1     A
# 3     1     A
# 4     1     B
# 5     2     B
# 6     2     A
# 7     2     A

DF %>% group_by(group) %>% summarize(count_first = rle(as.character(state))$lengths[1])

# # A tibble: 2 x 2
#    group count_first
#    <dbl>       <int>
#  1     1           3
#  2     2           1
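
If you would rather avoid rle, one possible alternative (a swapped-in technique, not part of the answer above) is to count the rows before the first change of value. This is a sketch only; `count_leading` is a hypothetical helper name, and it assumes `state` has no missing values, since an NA would make the result NA.

```r
library(dplyr)

DF <- data.frame(group = c(rep(1, 4), rep(2, 3)),
                 state = c(rep("A", 3), rep("B", 2), rep("A", 2)))

# hypothetical helper: number of leading elements equal to the first one
count_leading <- function(x) {
  x <- as.character(x)
  # cumsum() stays at 0 until the first element that differs from x[1]
  sum(cumsum(x != x[1]) == 0)
}

res <- DF %>%
  group_by(group) %>%
  summarize(count_first = count_leading(state))
res
```

On this data it should give the same counts as the rle version, 3 for group 1 and 1 for group 2.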

You don't need dplyr here, but you can adapt this example to use it with dplyr. The key is the function rle.

state = c(rep("A", 3), rep("B",2), rep("A",2))

x = rle(state)
DF = data.frame(len = x$lengths, state = x$values)
DF

# get the longest run of consecutive "A"
max(DF[DF$state == "A",]$len)
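
One way the adaptation to dplyr might look, as a sketch under assumptions: the grouped data frame is borrowed from the other answer, and `longest_run` is a hypothetical helper that wraps the rle steps above so they can be called once per group.

```r
library(dplyr)

DF <- data.frame(group = c(rep(1, 4), rep(2, 3)),
                 state = c(rep("A", 3), rep("B", 2), rep("A", 2)))

# hypothetical helper: length of the longest run of `value` in `x`
longest_run <- function(x, value) {
  r <- rle(as.character(x))
  hits <- r$lengths[r$values == value]
  # return 0 when `value` never occurs, instead of max() on an empty vector
  if (length(hits) == 0) 0L else max(hits)
}

res <- DF %>%
  group_by(group) %>%
  summarize(longest_A = longest_run(state, "A"))
res
```

For this data the longest run of "A" is 3 in group 1 and 2 in group 2.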

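OK, thanks. This seems to work. I now need to work out how to apply it to subgroups. Suppose I have a grouping variable ID and want this count for each value of ID.

The $state you are calling is just a vector, so the groups are not handled correctly; just use state, so that dplyr can work its magic.

OK, thanks a lot!

I'm afraid I accidentally deleted the comment you are referring to here, but I think I suggested Df %>% group_by(ID) %>% summarize(r = rle(.$state)$lengths[1]).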
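
The difference between `.$state` and a bare `state` discussed in these comments can be checked with a small sketch. The ID column below is made up for illustration (the thread does not show the real data), and `whole` and `per_group` are names chosen here.

```r
library(dplyr)

DF <- data.frame(ID = c(rep(1, 4), rep(2, 3)),
                 state = c(rep("A", 3), rep("B", 2), rep("A", 2)))

# `.` is the whole data frame piped into summarize(), so `.$state` ignores the groups
whole <- DF %>%
  group_by(ID) %>%
  summarize(r = rle(as.character(.$state))$lengths[1])

# a bare `state` is evaluated separately within each group
per_group <- DF %>%
  group_by(ID) %>%
  summarize(r = rle(as.character(state))$lengths[1])

whole$r      # 3 3
per_group$r  # 3 1
```

With `.$state` every ID gets the first run length of the full column, which is why the grouping appears to have no effect.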