Merging pairs of data frames sequentially in R

I have a data frame containing tagged individuals from multiple sites across multiple sampling intervals. See the example below:

> df
   Tag   Site Interval Ind_ID
1  507 Golden        7      1
2  507 Golden        8      1
3  552 Golden        2      1
4  552 Golden        1      1
5  847 Golden        4      1
6  847 Golden        6      1
8  847 Golden        5      1
9  847 Golden        3      1
31 541 Golden        1      1
33 541 Golden        3      1
34 541 Golden        4      1
35 541 Golden        7      1
36 541 Golden        6      1
37 541 Golden        5      1
39 810 Golden        7      1
40 810 Golden        8      1
41 840 Golden        7      1
42 840 Golden        8      1
43 840 Golden        3      1
44 840 Golden        2      1
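
For anyone who wants to run the snippets in this thread, the printed data can be rebuilt roughly as follows. This is only a sketch: the column types (numeric Tag, Interval and Ind_ID, character Site) are assumptions, and the row names will be 1 to 20 rather than the original ones shown in the printout.

```r
# Rebuild the example data shown above (column types are assumed).
df <- data.frame(
  Tag      = c(507, 507, 552, 552, 847, 847, 847, 847, 541, 541,
               541, 541, 541, 541, 810, 810, 840, 840, 840, 840),
  Site     = "Golden",   # recycled for every row
  Interval = c(7, 8, 2, 1, 4, 6, 5, 3, 1, 3,
               4, 7, 6, 5, 7, 8, 7, 8, 3, 2),
  Ind_ID   = 1           # recycled for every row
)

str(df)
```
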
What I want to do is separate the tagged individuals by interval, which I have already done with this for loop:

for (i in 1:nlevels(factor(df$Interval))){
  I<-subset(df,Interval==levels(factor(df$Interval))[i])
  assign(paste("Interval_", i, sep = ""), I)}

I would probably do something like this:

dfs <- split(df,df$Interval)
n <- nlevels(factor(df$Interval))-1
results <- setNames(vector("list",length = n),paste0("IPl",2:(n+1)))
for (i in seq_len(n)){
    results[[i]] <- merge(dfs[[i]],dfs[[i+1]],by = c('Tag','Site','Ind_ID'))
}

> head(results)

$IPl2
  Tag   Site Ind_ID Interval.x Interval.y
1 552 Golden      1          1          2

$IPl3
  Tag   Site Ind_ID Interval.x Interval.y
1 840 Golden      1          2          3

$IPl4
  Tag   Site Ind_ID Interval.x Interval.y
1 541 Golden      1          3          4
2 847 Golden      1          3          4

$IPl5
  Tag   Site Ind_ID Interval.x Interval.y
1 541 Golden      1          4          5
2 847 Golden      1          4          5

$IPl6
  Tag   Site Ind_ID Interval.x Interval.y
1 541 Golden      1          5          6
2 847 Golden      1          5          6

$IPl7
  Tag   Site Ind_ID Interval.x Interval.y
1 541 Golden      1          6          7
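
The same pairwise merge can also be written without an explicit loop. The following is a minimal sketch using Map(); the names `keys` and `pairs` are made up for illustration, and the data frame is rebuilt here (with assumed column types) so the block runs on its own. Like the loop above, it pairs neighbouring elements of the split list, so it only compares truly consecutive intervals when no interval is missing from the data.

```r
# Example data (same values as in the question; types are assumed).
df <- data.frame(
  Tag      = c(507, 507, 552, 552, 847, 847, 847, 847, 541, 541,
               541, 541, 541, 541, 810, 810, 840, 840, 840, 840),
  Site     = "Golden",
  Interval = c(7, 8, 2, 1, 4, 6, 5, 3, 1, 3,
               4, 7, 6, 5, 7, 8, 7, 8, 3, 2),
  Ind_ID   = 1
)

# One data frame per interval, ordered by interval.
dfs  <- split(df, df$Interval)
keys <- c("Tag", "Site", "Ind_ID")

# Merge element i with element i + 1 by walking two offset copies of the list.
pairs <- Map(function(a, b) merge(a, b, by = keys),
             head(dfs, -1), tail(dfs, -1))

# Name each result after the later interval of the pair (IPl2, IPl3, ...).
names(pairs) <- paste0("IPl", names(dfs)[-1])

pairs$IPl8
```

Printing the last element directly is useful because head() shows only the first six list elements, which is why IPl8 does not appear in the output above.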

Here is a dplyr solution that joins the data frame to itself and puts the results in a single data frame:

library(dplyr)
## Join the 'df' to itself based on the intervals to compare; this is done by
## creating a key to indicate which intervals to join on.
resultsdf <-
    ## Create match_interval to next sequential value
    df %>% mutate(match_interval = paste0('IPl', as.numeric(Interval)+1)) %>% arrange(Interval, Site) %>%
    ## Join to self by match_interval and other columns.
    inner_join(df %>% mutate(match_interval = paste0('IPl', as.numeric(Interval))),
               by = c('Tag', 'Site', 'Ind_ID', 'match_interval')) %>%
    ## Order columns
    select(match_interval, Tag, Site, Ind_ID, Interval.x, Interval.y)


resultsdf

##    match_interval Tag   Site Ind_ID Interval.x Interval.y
## 1            IPl2 552 Golden      1          1          2
## 2            IPl3 840 Golden      1          2          3
## 3            IPl4 847 Golden      1          3          4
## 4            IPl4 541 Golden      1          3          4
## 5            IPl5 847 Golden      1          4          5
## 6            IPl5 541 Golden      1          4          5
## 7            IPl6 847 Golden      1          5          6
## 8            IPl6 541 Golden      1          5          6
## 9            IPl7 541 Golden      1          6          7
## 10           IPl8 507 Golden      1          7          8
## 11           IPl8 810 Golden      1          7          8
## 12           IPl8 840 Golden      1          7          8
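
If dplyr is not available, a similar self-join can be sketched in base R with merge(). This is not the answer's code, only an assumed equivalent: the helper column `next_interval` and the object names `left` and `out` are invented for illustration, and the data frame is rebuilt (with assumed column types) so the block is self-contained.

```r
# Example data (same values as in the question; types are assumed).
df <- data.frame(
  Tag      = c(507, 507, 552, 552, 847, 847, 847, 847, 541, 541,
               541, 541, 541, 541, 810, 810, 840, 840, 840, 840),
  Site     = "Golden",
  Interval = c(7, 8, 2, 1, 4, 6, 5, 3, 1, 3,
               4, 7, 6, 5, 7, 8, 7, 8, 3, 2),
  Ind_ID   = 1
)

# Left side: each row records the interval it should be matched against.
left <- df
left$next_interval <- left$Interval + 1

# Join a row to the same individual observed in the following interval.
out <- merge(left, df,
             by.x = c("Tag", "Site", "Ind_ID", "next_interval"),
             by.y = c("Tag", "Site", "Ind_ID", "Interval"))

# Sort by the earlier interval for readability.
out <- out[order(out$Interval, out$Tag), ]
out
```

Because the join matches on the interval value itself rather than on list position, a gap in the intervals simply produces no rows for that pair.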

You may want to take a look at split().
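
As a rough illustration of that suggestion, split() can replace the assign() loop from the question by returning all the per-interval data frames in one named list. The list name `by_interval` is hypothetical, and the data frame is rebuilt here (with assumed column types) so the snippet runs on its own.

```r
# Example data (same values as in the question; types are assumed).
df <- data.frame(
  Tag      = c(507, 507, 552, 552, 847, 847, 847, 847, 541, 541,
               541, 541, 541, 541, 810, 810, 840, 840, 840, 840),
  Site     = "Golden",
  Interval = c(7, 8, 2, 1, 4, 6, 5, 3, 1, 3,
               4, 7, 6, 5, 7, 8, 7, 8, 3, 2),
  Ind_ID   = 1
)

# One list element per interval instead of one global variable per interval.
by_interval <- split(df, df$Interval)
names(by_interval) <- paste0("Interval_", names(by_interval))

# Access a single interval by name.
by_interval$Interval_7
```

Keeping the pieces in a list makes it straightforward to loop or apply over them later, which is harder with separately assigned variables.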