Spread on multiple columns in R: dplyr/tidyr solution
I have yearly data in long format that I am trying to spread using two columns. The only examples I have seen involve a single key.
> dput(df)
structure(list(ID = c("a", "a", "a", "a", "a", "a", "a", "a",
"a", "b", "b", "b", "b", "b", "b", "b", "b", "b"), Year = c(2015L,
2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2017L, 2015L,
2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2017L), Month = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L), Value = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L)), .Names = c("ID", "Year", "Month",
"Value"), class = "data.frame", row.names = c(NA, -18L))
I'm trying to convert it to a format where the years are columns 2:5, with one row per month per ID:
ID Month 2015 2016 2017
a 1 1 2 3
a 2 1 2 3
a 3 1 2 3
b 1 6 9 12
b 2 7 10 13
b 3 8 11 14
I tried the following, but got this error:
by_month_over_years = spread(df,key = c(Year,Month), Value)
Error: `var` must evaluate to a single number or a column name, not an integer vector
library(tidyr)
library(dplyr)
df %>% group_by(ID) %>% spread(Year, Value)
# A tibble: 6 x 5
# Groups:   ID [2]
  ID    Month `2015` `2016` `2017`
1 a         1      1      2      3
2 a         2      1      2      3
3 a         3      1      2      3
4 b         1      6      9     12
5 b         2      7     10     13
6 b         3      8     11     14
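In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same reshaping with that function, assuming a recent tidyr is installed; the data frame is rebuilt here with rep() only to keep the example self-contained, and the name `wide` is illustrative.

```r
library(tidyr)

# Same data as in the question, rebuilt compactly
df <- data.frame(
  ID    = rep(c("a", "b"), each = 9),
  Year  = rep(rep(2015:2017, each = 3), times = 2),
  Month = rep(1:3, times = 6),
  Value = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 6:14),
  stringsAsFactors = FALSE
)

# Year supplies the new column names, Value fills the cells;
# the remaining columns (ID, Month) identify each output row
wide <- pivot_wider(df, names_from = Year, values_from = Value)
wide
```

As with spread(), only one column (Year) acts as the key here; ID and Month are kept as identifiers, so no grouping step is needed.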
Here is a base R option with reshape:
reshape(df, idvar = c('ID', 'Month'), direction = 'wide', timevar = 'Year')
# ID Month Value.2015 Value.2016 Value.2017
#1 a 1 1 2 3
#2 a 2 1 2 3
#3 a 3 1 2 3
#10 b 1 6 9 12
#11 b 2 7 10 13
#12 b 3 8 11 14
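Doesn't spread(df, Year, Value) do this? Did you take a look?
I don't think you need the grouping step. Doesn't spread(df, Year, Value) do the same thing?
@aosmith You're right, spread(df, Year, Value) does indeed work; group_by is just an extra step. Noted.
It's great to learn from this command, thanks for sharing. Could you explain idvar, direction and timevar? I did look them up on Google, but it would be nice if you could add more explanation here.
@RavinderSingh13 idvar is similar to what you use in melt (id.var), direction is the direction of the conversion (here we go from long to 'wide'), and timevar names the variable whose values are used to split the 'Value' column.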
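To make the roles of those three arguments concrete, here is a small annotated sketch of the same reshape() call. The two clean-up lines at the end (stripping the "Value." prefix and resetting the row names) are optional additions for illustration and are not part of the original answer.

```r
# Same data as in the question, rebuilt compactly
df <- data.frame(
  ID    = rep(c("a", "b"), each = 9),
  Year  = rep(rep(2015:2017, each = 3), times = 2),
  Month = rep(1:3, times = 6),
  Value = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 6:14),
  stringsAsFactors = FALSE
)

wide <- reshape(
  df,
  idvar     = c("ID", "Month"),  # columns that identify one output row
  timevar   = "Year",            # its values become the column suffixes
  direction = "wide"             # long -> wide
)

# Optional clean-up: turn Value.2015 into 2015, and renumber the rows
names(wide) <- sub("^Value\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

Because v.names is not supplied, reshape() spreads every column that is neither an idvar nor the timevar, which here is only Value.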
library(reshape2) # or data.table, for dcast
dcast(df, ID + Month ~ Year, value.var = "Value") # value.var given explicitly so dcast does not have to guess
# ID Month 2015 2016 2017
# 1 a 1 1 2 3
# 2 a 2 1 2 3
# 3 a 3 1 2 3
# 4 b 1 6 9 12
# 5 b 2 7 10 13
# 6 b 3 8 11 14