R: long-to-wide conversion that counts frequencies and drops the factor-level column (preparing a data frame for input to iNEXT Online)

I have a data frame that looks like this:

df<- data.frame(region= c("1","1","1","1","1","2","2","2","2","2","2"),loc=c("104","104","104","105","106","107","108", "109", "110", "110", "111"), interact= c("A_B", "B_C", "A_B", "B_C", "B_C", "A_B", "G_H", "I_J", "J_K", "L_M", "M_O"))
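Note that region 1 has 3 unique levels of loc and region 2 has 5, so the first row of numbers is the count of unique loc values in each region. Every following row gives the frequency of each interaction type across all loc values in that region. However, I do not want the interact column in the final data frame, so the final output should look like this: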

output<- data.frame(region1= c("3", "2", "3", "0","0","0","0","0"), 
region2= c("5", "1", "0", "1","1","1","1","1"))

We can use data.table:

library(data.table)
d1 <- dcast(setDT(df)[, .(interact = "", uniqueN(loc)), region], 
         interact ~ paste0('region', region))
rbind(d1, dcast(df, interact ~ paste0('region', region), length))
#   interact region1 region2
#1:                3       5
#2:      A_B       2       1
#3:      B_C       3       0
#4:      G_H       0       1
#5:      I_J       0       1
#6:      J_K       0       1
#7:      L_M       0       1
#8:      M_O       0       1
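
The table above still carries the interact column, while the question asks for a result without it. As a minimal sketch of one way to get there in base R (not part of the original answer; the names n_loc, freq and wide are made up for illustration), the unique loc counts and the interaction frequencies can be stacked and the labels discarded:

```r
# Example data copied from the question
df <- data.frame(
  region   = c("1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2"),
  loc      = c("104", "104", "104", "105", "106", "107", "108",
               "109", "110", "110", "111"),
  interact = c("A_B", "B_C", "A_B", "B_C", "B_C", "A_B", "G_H",
               "I_J", "J_K", "L_M", "M_O")
)

# First row of the target: number of distinct loc values per region
n_loc <- vapply(split(df$loc, df$region),
                function(x) length(unique(x)), integer(1))

# Remaining rows: frequency of each interact value per region;
# combinations that never occur are counted as 0
freq <- table(df$interact, df$region)

# Stack both pieces, then replace the row labels (the interact values)
# with nothing and give the columns the region1/region2 names
wide <- rbind(n_loc, unclass(freq))
dimnames(wide) <- list(NULL, paste0("region", colnames(freq)))
wide <- as.data.frame(wide)

print(wide)
```

This sketch returns integer columns, whereas the target in the question is written with character strings; if strings are really needed, the columns would have to be converted with as.character() afterwards.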
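library(dplyr)   # provides %>%, group_by(), summarise(), n()
library(tidyr)
# Count each interact value per region; stored under a new name so df keeps its loc column
counts <- df %>% 
  group_by(region, interact) %>% 
  summarise(freq = n()) %>% 
  ungroup()
# One column per region; combinations that never occur become NA here
data_wide <- spread(counts, region, freq)
# Drop the first column (interact)
data_wide <- data_wide[, -1]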
Alternatively, with the tidyverse:

library(tidyverse)
bind_rows(df %>%
            group_by(region = paste0('region', region)) %>% 
            summarise(interact = "", V1 = n_distinct(loc)) %>% 
            spread(region, V1),
          df %>% 
            group_by(region = paste0('region', region),
                    interact = as.character(interact)) %>%
            summarise(V1 = n()) %>% 
            spread(region, V1, fill = 0))
# A tibble: 8 x 3
#  interact region1 region2
#     <chr>   <dbl>   <dbl>
#1                3       5
#2      A_B       2       1
#3      B_C       3       0
#4      G_H       0       1
#5      I_J       0       1
#6      J_K       0       1
#7      L_M       0       1
#8      M_O       0       1
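
In current tidyr releases spread() is superseded by pivot_wider(). A rough equivalent of the tidyverse approach above, written as a sketch under that assumption (the names n_loc, freq and wide are illustrative, not from the original answer), also drops the interact column at the end, as the question requests:

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
df <- data.frame(
  region   = c("1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2"),
  loc      = c("104", "104", "104", "105", "106", "107", "108",
               "109", "110", "110", "111"),
  interact = c("A_B", "B_C", "A_B", "B_C", "B_C", "A_B", "G_H",
               "I_J", "J_K", "L_M", "M_O")
)

# First row of the target: distinct loc values per region,
# tagged with an empty interact label so it can be stacked with the counts
n_loc <- df %>%
  group_by(region) %>%
  summarise(n = n_distinct(loc), .groups = "drop") %>%
  mutate(interact = "")

# Remaining rows: frequency of each interact value per region
freq <- df %>% count(region, interact)

# Stack in long form, pivot to one column per region (missing pairs become 0),
# then drop the interact column
wide <- bind_rows(n_loc, freq) %>%
  pivot_wider(names_from = region, values_from = n,
              names_prefix = "region", values_fill = 0L) %>%
  select(-interact)

print(wide)
```

Stacking before pivoting keeps the unique-loc row first, because pivot_wider() orders rows by first appearance of the identifier values.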