R: How to add up and merge columns with similar names in a data frame?
I have a large data frame in which several columns need to be merged (summed) based on the first part of the column name (the part before .S*). The following code generates this example data frame:
DF1 = structure(list(taxonomy = c("cat", "dog","horse","mouse","frog", "lion"),
A = c(0L, 5L, 3L, 0L, 0L, 0L), D = c(2L, 1L, 0L, 0L, 2L, 0L), C = c(0L, 0L, 0L, 4L, 4L, 2L)),
.Names = c("taxonomy", "A.S595", "B.S596", "B.S487"),
row.names = c(NA, -6L), class = "data.frame")
The data frame looks like this:
taxonomy A.S595 B.S596 B.S487
1 cat 0 2 0
2 dog 5 1 0
3 horse 3 0 0
4 mouse 0 0 4
5 frog 0 2 4
6 lion 0 0 2
I would like the output to look like this:
taxonomy A B
1 cat 0 2
2 dog 5 1
3 horse 3 0
4 mouse 0 4
5 frog 0 6
6 lion 0 2
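One option is to split the dataset by the names of the integer columns (here the first character of each name serves as the grouping key), loop over the resulting list, get the row sums, and cbind the result with the first column: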
cbind(DF1[1], sapply(split.default(DF1[-1], substr(names(DF1)[-1], 1, 1)), rowSums))
# taxonomy A B
#1 cat 0 2
#2 dog 5 1
#3 horse 3 0
#4 mouse 0 4
#5 frog 0 6
#6 lion 0 2
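The one-liner above groups on the first character only, so it would lump together prefixes such as AB.S1 and AC.S2. A minimal sketch of the same split/rowSums/cbind idea, assuming the grouping key is everything before the first ".S" in the column name (the names prefix and res_base are illustrative, not from the original answer):

```r
# Example data from the question, written out with explicit column names
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

# Grouping key (assumption): the part of each name before the first ".S"
prefix <- sub("\\.S.*$", "", names(DF1)[-1])

# Split the value columns by prefix, sum each group row-wise,
# and bind the sums back onto the taxonomy column
res_base <- cbind(DF1[1], sapply(split.default(DF1[-1], prefix), rowSums))
res_base
```

For the example data this groups B.S596 and B.S487 under B and leaves A.S595 as A, just like the substr() version, but it should also cope with prefixes longer than one character.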
Or using tidyverse:
library(tidyverse)
rownames_to_column(DF1) %>%
gather(key, val, -taxonomy, -rowname) %>%
separate(key, into = c('key1', 'key2')) %>%
group_by(rowname, key1) %>%
summarise(val = sum(val)) %>%
spread(key1, val) %>%
ungroup %>%
select(-rowname) %>%
bind_cols(DF1[1], .)
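The gather()/spread() pipeline above can probably be expressed with the newer pivot_longer()/pivot_wider() verbs from tidyr. The following is only a sketch under stated assumptions: dplyr (>= 1.0.0) and tidyr (>= 1.0.0) are installed, the grouping key is the part of the name before ".S", and the helper column row and the result name res_tidy are made up for illustration. Using an integer row id rather than character row names keeps the original row order even when there are more than nine rows.

```r
library(dplyr)
library(tidyr)

# Example data from the question
DF1 <- data.frame(
  taxonomy = c("cat", "dog", "horse", "mouse", "frog", "lion"),
  A.S595 = c(0L, 5L, 3L, 0L, 0L, 0L),
  B.S596 = c(2L, 1L, 0L, 0L, 2L, 0L),
  B.S487 = c(0L, 0L, 0L, 4L, 4L, 2L)
)

res_tidy <- DF1 %>%
  mutate(row = row_number()) %>%                       # integer id to preserve row order
  pivot_longer(-c(row, taxonomy),
               names_to = "key", values_to = "val") %>%
  mutate(key = sub("\\.S.*$", "", key)) %>%            # keep only the prefix before ".S"
  group_by(row, taxonomy, key) %>%
  summarise(val = sum(val), .groups = "drop") %>%      # sum columns sharing a prefix
  pivot_wider(names_from = key, values_from = val) %>%
  select(-row)
res_tidy
```

The result is a tibble with the columns taxonomy, A and B; wrap it in as.data.frame() if a plain data frame is needed.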
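Another version using tidyverse, which sums the B.S* columns explicitly and then strips the .S suffix from the remaining names: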
DF1 %>%
select(matches("^B\\.S.*")) %>%
rowSums %>%
bind_cols(
select(DF1, -matches("^B\\.S.*")),
B = .
) %>%
rename_at(vars(matches("\\.S[0-9]+")), ~ gsub("\\.S[0-9]+", "", .))  # funs() is deprecated in dplyr; a formula lambda does the same job
taxonomy A B
1 cat 0 2
2 dog 5 1
3 horse 3 0
4 mouse 0 4
5 frog 0 6
6 lion 0 2