Converting long format to wide format in R with multiple value variables and multiple identifying variables
I am trying to convert long format to wide format, but multiple columns together identify a unique row. In the example below, the block, den (density), and species columns identify a unique individual. Each individual has 2 or 3 rows associated with area and size. I want to convert area and size to wide format.
Here is my dataset:
block <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2)
species <- c("A","A","A","A","B","B","B","B","A","A","A","A","B","B","B","B","B")
den <- c("20","20","50","50","20","20","50","50","20","20","50","50","20","20","50","50","50")
block <- as.factor(block)
den <- as.factor(den)
species <- as.factor(species)
area <- c(1:17)
size <- c(17:33)
df <- data.frame(block, species, den, area, size)
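To check that block, species, and den together identify each individual, the group sizes can be counted before reshaping. This is a minimal base R sketch that rebuilds the dataset above; the name counts is introduced here for illustration only.

```r
# Rebuild the example dataset from the question
block   <- as.factor(c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2))
species <- as.factor(c("A","A","A","A","B","B","B","B","A","A","A","A","B","B","B","B","B"))
den     <- as.factor(c("20","20","50","50","20","20","50","50","20","20","50","50","20","20","50","50","50"))
df <- data.frame(block, species, den, area = 1:17, size = 17:33)

# Count rows per (block, species, den) combination.
# The 'area' column of the result holds the row count, not an area value.
counts <- aggregate(area ~ block + species + den, data = df, FUN = length)
print(counts)
```

Each combination should show a count of 2 or 3, which is why the wide result needs up to three area columns and three size columns.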
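Note: the other answers I have found do not use multiple columns to identify unique rows.
We can use pivot_wider after creating a sequence column by group: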
library(dplyr)
library(tidyr)
df %>%
  group_by(block, species, den) %>%
  mutate(rn = row_number()) %>%   # sequence number within each individual
  ungroup() %>%
  pivot_wider(names_from = rn, values_from = c(area, size), names_sep = ".")
# A tibble: 8 x 9
#  block species den   area.1 area.2 area.3 size.1 size.2 size.3
#  <fct> <fct>   <fct>  <int>  <int>  <int>  <int>  <int>  <int>
#1 1     A       20         1      2     NA     17     18     NA
#2 1     A       50         3      4     NA     19     20     NA
#3 1     B       20         5      6     NA     21     22     NA
#4 1     B       50         7      8     NA     23     24     NA
#5 2     A       20         9     10     NA     25     26     NA
#6 2     A       50        11     12     NA     27     28     NA
#7 2     B       20        13     14     NA     29     30     NA
#8 2     B       50        15     16     17     31     32     33
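The same idea (number the rows within each individual, then spread on that number) can also be sketched without extra packages. The following is a base R alternative using ave() and reshape(), not the method from the answer above; the names df_rn and wide are hypothetical, and the dataset is rebuilt so the block runs on its own.

```r
# Rebuild the example dataset from the question
block   <- as.factor(c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2))
species <- as.factor(c("A","A","A","A","B","B","B","B","A","A","A","A","B","B","B","B","B"))
den     <- as.factor(c("20","20","50","50","20","20","50","50","20","20","50","50","20","20","50","50","50"))
df <- data.frame(block, species, den, area = 1:17, size = 17:33)

# Sequence number within each (block, species, den) group
df_rn <- df
df_rn$rn <- ave(seq_len(nrow(df_rn)),
                df_rn$block, df_rn$species, df_rn$den,
                FUN = seq_along)

# Spread area and size across the sequence number;
# groups with only 2 rows get NA in the third position
wide <- reshape(df_rn,
                idvar     = c("block", "species", "den"),
                timevar   = "rn",
                direction = "wide")
print(wide)
```

Note that reshape() interleaves the new columns by sequence number (area.1, size.1, area.2, ...) rather than grouping all area columns first, so the column order differs from the pivot_wider result.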
Using data.table:
library(data.table); setDT(df); dcast(df, block + species + den ~ rowid(block, species, den), value.var = c("area", "size"))
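The data.table one-liner can be expanded into a self-contained sketch. It assumes the data.table package is installed; the names DT and wide are hypothetical, the group sequence is stored in an explicit rn column for readability, and as.data.table() is used instead of setDT() so the original df is left unchanged.

```r
library(data.table)

# Rebuild the example dataset from the question
block   <- as.factor(c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2))
species <- as.factor(c("A","A","A","A","B","B","B","B","A","A","A","A","B","B","B","B","B"))
den     <- as.factor(c("20","20","50","50","20","20","50","50","20","20","50","50","20","20","50","50","50"))
df <- data.frame(block, species, den, area = 1:17, size = 17:33)

# Copy into a data.table and number the rows within each individual
DT <- as.data.table(df)
DT[, rn := rowid(block, species, den)]

# Cast both value columns to wide format in one call;
# the new column names join the value name and rn with "_"
wide <- dcast(DT, block + species + den ~ rn, value.var = c("area", "size"))
print(wide)
```

The result should have one row per individual, with NA where an individual has only two measurements.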