
R: Compare the first and last observations of each group

I have a dataset like this:

df <- data.frame(group = c(rep(1,3),rep(2,2), rep(3,3),rep(4,3),rep(5, 2)), score = c(30, 10, 22, 44, 50, 5, 20, 1,35, 2, 60, 14,5))

   group score
1      1    30
2      1    10
3      1    22
4      2    44
5      2    50
6      3     5
7      3    20
8      3     1
9      4    35
10     4     2
11     4    60
12     5    14
13     5     5

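I want to compare the first and last observation within each group and get the groups where the first score is greater than the last (here, groups 1, 3 and 5). Does anyone know how to achieve this?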

This answer assumes that every group has at least two observations:

newdf <- merge(rbind(df[diff(df$group) == 1 ,] , df[dim(df)[1], ]), 
           df[!duplicated(df$group), ],
           by="group")

newdf[which(newdf$score.x < newdf$score.y), 'group']
#[1] 1 3 5 
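
The `diff(df$group) == 1` step above also relies on the group ids increasing by exactly 1 from one block of rows to the next. As a rough sketch of the same first-versus-last comparison without that assumption, `duplicated()` can be run from both ends of the data. The names `first_rows`, `last_rows`, `both` and `groups_down` are only illustrative and not part of the original answer.

```r
df <- data.frame(group = c(rep(1, 3), rep(2, 2), rep(3, 3), rep(4, 3), rep(5, 2)),
                 score = c(30, 10, 22, 44, 50, 5, 20, 1, 35, 2, 60, 14, 5))

# First and last row of every group, found by scanning from each end
first_rows <- df[!duplicated(df$group), ]
last_rows  <- df[!duplicated(df$group, fromLast = TRUE), ]

# Pair them up by group; the suffixes label which score is which
both <- merge(first_rows, last_rows, by = "group",
              suffixes = c(".first", ".last"))

# Groups whose first score is greater than their last score
groups_down <- both$group[both$score.first > both$score.last]
groups_down
#[1] 1 3 5
```

A group with a single observation would be compared with itself here, so it is simply never returned.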

This should do it:

# First split the data frame by group
# This returns a list
df.split <- split(df, factor(df$group))

# Now use sapply on the list to check first and last of each group
# We return the group or NA using ifelse
res <- sapply(df.split, 
       function(x){ifelse(x$score[1] > x$score[nrow(x)], x$group[1], NA)})

# Finally, filter away the NAs
res <- res[!is.na(res)]
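
If the intermediate list is not needed, the same split-and-compare idea can probably be written with `tapply()`, which applies a function to `score` within each `group`. This is only a sketch; `is_down` and `res2` are made-up names.

```r
df <- data.frame(group = c(rep(1, 3), rep(2, 2), rep(3, 3), rep(4, 3), rep(5, 2)),
                 score = c(30, 10, 22, 44, 50, 5, 20, 1, 35, 2, 60, 14, 5))

# One logical per group: TRUE when the first score exceeds the last
is_down <- tapply(df$score, df$group, function(s) s[1] > s[length(s)])

# The group labels come back as names (character), so convert them to numbers
res2 <- as.numeric(names(is_down)[is_down])
res2
#[1] 1 3 5
```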

One more base R option:

with(df, unique(df$group[as.logical(ave(score, group, FUN = function(x) head(x,1) > tail(x, 1)))]))
#[1] 1 3 5

Or using dplyr:

library(dplyr)
group_by(df, group) %>% filter(first(score) > last(score)) %>% do(head(.,1)) %>% 
 select(group)

#  group
#1     1
#2     3
#3     5
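
For comparison, here is a sketch of the same dplyr idea that collapses each group to a single flag with `summarise()` and then extracts the ids with `pull()`, assuming a reasonably recent dplyr. The column name `down` and the variable `down_groups` are illustrative only.

```r
library(dplyr)

df <- data.frame(group = c(rep(1, 3), rep(2, 2), rep(3, 3), rep(4, 3), rep(5, 2)),
                 score = c(30, 10, 22, 44, 50, 5, 20, 1, 35, 2, 60, 14, 5))

down_groups <- df %>%
  group_by(group) %>%
  # One row per group: TRUE when the first score exceeds the last
  summarise(down = first(score) > last(score)) %>%
  filter(down) %>%
  # Return the group ids as a plain vector instead of a data frame
  pull(group)

down_groups
#[1] 1 3 5
```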

Here is a data.table approach:

library(data.table)
setDT(df)[, score[1] > score[.N], by = group][V1 == TRUE]

##    group   V1
## 1:     1 TRUE
## 2:     3 TRUE
## 3:     5 TRUE


Following @beginner's comment, if you don't like V1, you can do this:

df2 <- as.data.table(df)[, .BY[score[1] > score[.N]], by = group][, V1 := NULL]
df2

##    group
## 1:     1
## 2:     3
## 3:     5
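
Another way to avoid the auto-generated `V1` column, sketched here as an assumption rather than as part of the original answer, is to name the computed column inside `.()` and filter on it. The names `dt`, `flags` and `down` are hypothetical; `as.data.table()` is used so that `df` itself is left unchanged.

```r
library(data.table)

df <- data.frame(group = c(rep(1, 3), rep(2, 2), rep(3, 3), rep(4, 3), rep(5, 2)),
                 score = c(30, 10, 22, 44, 50, 5, 20, 1, 35, 2, 60, 14, 5))

dt <- as.data.table(df)

# One row per group with a named logical column instead of V1
flags <- dt[, .(down = score[1] > score[.N]), by = group]

# Keep the groups whose first score exceeds the last, as a plain vector
down_groups <- flags[down == TRUE, group]
down_groups
#[1] 1 3 5
```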

And a plyr option:

library(plyr)
df1<-ddply(df,.(group),summarise,shown=score[length(group)]<score[1])
subset(df1,shown)

  group shown
1     1  TRUE
3     3  TRUE
5     5  TRUE

A data.table variant that returns the group value itself:

setDT(df)[, group[score[1] > score[.N]], by = group]

##    group V1
## 1:     1  1
## 2:     3  3
## 3:     5  5