Using summarise inside mutate instead of a left_join
I am wondering whether this workflow can be improved:
library(dplyr)
library(tidyr)

dummy <- tibble(
  x = c(rep("A", 5), rep("B", 5), rep("C", 5)),
  value = c(1:15)
)

dummy %>%
  group_by(x) %>%
  summarise(rowsum = sum(value)) %>%
  mutate(s = sum(rowsum)) %>%
  left_join((dummy %>% pivot_longer(-x)), by = "x")
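Ideally, I would rather not use a left_join that calls the original data frame again. Does anyone have a better suggestion?

I am a bit confused about the purpose of the name column, but this reproduces your output without using a left join: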
library(dplyr)

dummy %>%
  group_by(x) %>%
  mutate(row_sum = sum(value)) %>%
  ungroup() %>%
  mutate(s = sum(unique(row_sum)),
         name = "value") %>%
  select(x, row_sum, s, name, value) # only to reorder the columns as you had them
# A tibble: 15 x 5
   x     row_sum     s name  value
   <chr>   <int> <int> <chr> <int>
 1 A          15   120 value     1
 2 A          15   120 value     2
 3 A          15   120 value     3
 4 A          15   120 value     4
 5 A          15   120 value     5
 6 B          40   120 value     6
 7 B          40   120 value     7
 8 B          40   120 value     8
 9 B          40   120 value     9
10 B          40   120 value    10
11 C          65   120 value    11
12 C          65   120 value    12
13 C          65   120 value    13
14 C          65   120 value    14
15 C          65   120 value    15
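
One caveat with sum(unique(row_sum)): it only matches the grand total when every group has a different sum. Here is a minimal sketch with made-up data where two groups tie. The `tied` tibble and the `s_unique` / `s_total` column names are hypothetical and not part of the question.

```r
library(dplyr)

# Hypothetical data: groups A and B both sum to 3
tied <- tibble(
  x     = c("A", "A", "B", "B", "C"),
  value = c(1, 2, 0, 3, 10)
)

res <- tied %>%
  group_by(x) %>%
  mutate(row_sum = sum(value)) %>%
  ungroup() %>%
  mutate(
    s_unique = sum(unique(row_sum)), # the duplicated group sum is counted once
    s_total  = sum(value)            # grand total taken from the raw values
  )

res
```

With this data the group sums are 3, 3 and 10, so `s_unique` and `s_total` disagree. For the `dummy` data in the question the group sums are all different, so both give 120.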
One option is to replace every element except the first one in each group with NA, and then take the sum:
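library(dplyr)
library(tidyr)

dummy %>%
  pivot_longer(-x) %>%
  group_by(x) %>%
  mutate(rowsum = sum(value),
         # keep the group sum only on the first row of each group
         s = replace(rowsum, row_number() != 1, NA)) %>%
  ungroup() %>%
  # summing the non-NA entries counts each group exactly once
  mutate(s = sum(s, na.rm = TRUE))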
# A tibble: 15 x 5
#   x     name  value rowsum     s
#   <chr> <chr> <int>  <int> <int>
# 1 A     value     1     15   120
# 2 A     value     2     15   120
# 3 A     value     3     15   120
# 4 A     value     4     15   120
# 5 A     value     5     15   120
# 6 B     value     6     40   120
# 7 B     value     7     40   120
# 8 B     value     8     40   120
# 9 B     value     9     40   120
#10 B     value    10     40   120
#11 C     value    11     65   120
#12 C     value    12     65   120
#13 C     value    13     65   120
#14 C     value    14     65   120
#15 C     value    15     65   120
Ah, sorry, the name column was a mistake. Thanks!
dummy %>%
  mutate(s = sum(value)) %>%    # grand total, computed before grouping
  group_by(x) %>%
  mutate(rowsum = sum(value))   # per-group total (result stays grouped by x)
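
To check that this mutate-only version gives the same numbers as the join-based workflow in the question, something like the following should work. It is only a sketch: the names `via_join`, `via_mutate` and `same` are made up here, the `name` column is left out since it was a mistake, and an `ungroup()` plus a `select()` are added purely so the two results have the same shape.

```r
library(dplyr)

dummy <- tibble(
  x     = rep(c("A", "B", "C"), each = 5),
  value = 1:15
)

# Workflow from the question, joining back onto the original data
via_join <- dummy %>%
  group_by(x) %>%
  summarise(rowsum = sum(value)) %>%
  mutate(s = sum(rowsum)) %>%
  left_join(dummy, by = "x") %>%
  select(x, value, rowsum, s)

# mutate-only version, with no join
via_mutate <- dummy %>%
  mutate(s = sum(value)) %>%
  group_by(x) %>%
  mutate(rowsum = sum(value)) %>%
  ungroup() %>%
  select(x, value, rowsum, s)

# Compare as plain data frames to avoid tibble-specific comparison methods
same <- isTRUE(all.equal(as.data.frame(via_join), as.data.frame(via_mutate)))
same
```

Computing `s` before `group_by()` avoids the sum(unique(...)) step entirely, because the grand total is taken straight from `value`.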