R: rbind logic
I have this data frame:
source_data <- data.frame(
  "date" = c("2018-01-01", "2018-01-01", "2018-02-01", "2018-02-01"),
  "nr" = c(0, 1, 0, 1),
  "marketing_fees" = c(500, 600, 800, 900),
  "services_paid" = c(40, 50, 10, 30),
  stringsAsFactors = FALSE)
I currently build the result like this:
library(dplyr)  # provides %>%, filter(), select() and mutate()
result <- rbind(
  source_data %>%
    filter(date == "2018-01-01") %>%
    select(date, nr, income = marketing_fees) %>%
    mutate(source = "marketing"),
  source_data %>%
    filter(date == "2018-01-01") %>%
    select(date, nr, income = services_paid) %>%
    mutate(source = "services"),
  source_data %>%
    filter(date == "2018-02-01") %>%
    select(date, nr, income = marketing_fees) %>%
    mutate(source = "marketing"),
  source_data %>%
    filter(date == "2018-02-01") %>%
    select(date, nr, income = services_paid) %>%
    mutate(source = "services")
)
The code above not only has a lot of repeated parts; I also cannot keep using it this way, because my real data frame has about 50 columns and a lot of data. How can the result data frame be produced without so much repeated code?

We can use gather to reshape the data from "wide" to "long", then separate the column names to keep only the prefix part:
library(tidyverse)
source_data %>%
  gather(source, income, marketing_fees:services_paid) %>%
  separate(source, into = c('source', 'extra')) %>%
  select(-extra) %>%
  arrange(date, nr)
# date nr source income
#1 2018-01-01 0 marketing 500
#2 2018-01-01 0 services 40
#3 2018-01-01 1 marketing 600
#4 2018-01-01 1 services 50
#5 2018-02-01 0 marketing 800
#6 2018-02-01 0 services 10
#7 2018-02-01 1 marketing 900
#8 2018-02-01 1 services 30

As far as I can see, this is just reshaping plus some basic text processing. I will post an answer as proof.
Note that the logic here is similar to the previously mentioned question that was reopened; I don't see any difference. The rules should be the same for everyone and not target particular people.

library(data.table)
library(magrittr)
result2 <- melt(
  setDT(source_data),
  id.vars = c("date", "nr"),
  value.name = "income",
  variable.name = "source"
)[, source := sub("_.*", "", source)][order(date, nr)]
date nr source income
1: 2018-01-01 0 marketing 500
2: 2018-01-01 0 services 40
3: 2018-01-01 1 marketing 600
4: 2018-01-01 1 services 50
5: 2018-02-01 0 marketing 800
6: 2018-02-01 0 services 10
7: 2018-02-01 1 marketing 900
8: 2018-02-01 1 services 30
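
The gather() and separate() answer above can probably also be written with pivot_longer(), which tidyr documents as the replacement for the superseded gather(). This is only a sketch, assuming tidyr 1.0.0 or later is installed; the name result3 is illustrative and not from the original posts, and the prefix is stripped with the same sub() pattern as in the data.table answer.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
source_data <- data.frame(
  date = c("2018-01-01", "2018-01-01", "2018-02-01", "2018-02-01"),
  nr = c(0, 1, 0, 1),
  marketing_fees = c(500, 600, 800, 900),
  services_paid = c(40, 50, 10, 30),
  stringsAsFactors = FALSE
)

# result3 is a hypothetical name used only for this sketch
result3 <- source_data %>%
  # every column except the id columns becomes a (source, income) pair
  pivot_longer(cols = -c(date, nr), names_to = "source", values_to = "income") %>%
  # keep only the part of the column name before the first underscore
  mutate(source = sub("_.*", "", source)) %>%
  arrange(date, nr)

print(result3)
```

Selecting the value columns with -c(date, nr) rather than naming them means the same call should carry over to a data frame with around 50 columns, provided date and nr are the only id columns and each value column name starts with the desired source prefix.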