R: How do I reshape data to long format?
I have a .csv file that looks like this:
+-------+---------+------+-------+
| CONN | TABLE | COLS | OWNER |
+-------+---------+------+-------+
| ONE | TABLE_A | 10 | MIKE |
| ONE | TABLE_B | 9 | MIKE |
| ONE | TAB_A | 11 | KIM |
| ONE | TAB_B | 14 | KIM |
| TWO | TABLE_A | 9 | MIKE |
| TWO | TABLE_B | 9 | MIKE |
| TWO | TAB_A | 11 | KIM |
| TWO | TAB_D | 56 | KIM |
| THREE | TABLE_A | 9 | MIKE |
| THREE | TABLE_C | 3 | MIKE |
| THREE | TABLE_D | 11 | KIM |
| THREE | TAB_A | 11 | KIM |
+-------+---------+------+-------+
I want to compare TABLE and COLS by CONN and OWNER. How can I reshape this data to make that comparison? My data is as follows:
dat <- structure(list(CONN = c("ONE", "ONE", "ONE", "ONE", "TWO", "TWO",
"TWO", "TWO", "THREE", "THREE", "THREE", "THREE"), TABLE = c("TABLE_A",
"TABLE_B", "TAB_A", "TAB_B", "TABLE_A", "TABLE_B", "TAB_A", "TAB_D",
"TABLE_A", "TABLE_C", "TABLE_D", "TAB_A"), COLS = c(10L, 9L,
11L, 14L, 9L, 9L, 11L, 56L, 9L, 3L, 11L, 11L), OWNER = c("MIKE",
"MIKE", "KIM", "KIM", "MIKE", "MIKE", "KIM", "KIM", "MIKE", "MIKE",
"KIM", "KIM")), .Names = c("CONN", "TABLE", "COLS", "OWNER"), class = "data.frame", row.names = c(NA,
-12L))
All those uppercase column names left you a little sticky on the shift key: you wrote C('CONN','OWNER'). With a lowercase c it works as follows:
> reshape(dat, varying=c('TABLE', 'COLS'), v.names=c('CONN', 'OWNER'), direction='long')
CONN OWNER time id
1 TABLE_A 10 1 1
2 TABLE_B 9 1 2
3 TAB_A 11 1 3
4 TAB_B 14 1 4
5 TABLE_A 9 1 5
6 TABLE_B 9 1 6
7 TAB_A 11 1 7
8 TAB_D 56 1 8
9 TABLE_A 9 1 9
10 TABLE_C 3 1 10
11 TABLE_D 11 1 11
12 TAB_A 11 1 12
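I usually find the reshape2 package more intuitive: just put the variables you want as rows (respectively columns) before (respectively after) the ~.
library(reshape2)
dcast( CONN + OWNER ~ TABLE, data = dat, value.var="COLS" )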
# CONN OWNER TAB_A TAB_B TAB_D TABLE_A TABLE_B TABLE_C TABLE_D
# 1 ONE KIM 11 14 NA NA NA NA NA
# 2 ONE MIKE NA NA NA 10 9 NA NA
# 3 THREE KIM 11 NA NA NA NA NA 11
# 4 THREE MIKE NA NA NA 9 NA 3 NA
# 5 TWO KIM 11 NA 56 NA NA NA NA
# 6 TWO MIKE NA NA NA 9 9 NA NA
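For comparison, base R's reshape() can build a similar wide table without any extra package. This is only a sketch, assuming the same dat as in the question (rebuilt here with data.frame() so the snippet runs on its own); the COLS.<TABLE> column names are what reshape() generates and are not part of the original answer.

```r
# Rebuild the question's data so the example is self-contained
dat <- data.frame(
  CONN  = rep(c("ONE", "TWO", "THREE"), each = 4),
  TABLE = c("TABLE_A", "TABLE_B", "TAB_A", "TAB_B",
            "TABLE_A", "TABLE_B", "TAB_A", "TAB_D",
            "TABLE_A", "TABLE_C", "TABLE_D", "TAB_A"),
  COLS  = c(10L, 9L, 11L, 14L, 9L, 9L, 11L, 56L, 9L, 3L, 11L, 11L),
  OWNER = rep(c("MIKE", "MIKE", "KIM", "KIM"), times = 3),
  stringsAsFactors = FALSE
)

# One row per CONN/OWNER pair, one COLS.<TABLE> column per distinct TABLE value
wide <- reshape(dat,
                idvar     = c("CONN", "OWNER"),
                timevar   = "TABLE",
                v.names   = "COLS",
                direction = "wide")
wide
```

Combinations that do not occur in the data come out as NA, just as in the dcast() result, so each CONN/OWNER pair can be compared across tables row by row.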
You can also use the gather() function from the tidyr package:
Function: gather(data, key, value, ..., na.rm = FALSE, convert = FALSE)
Same as: data %>% gather(key, value, ..., na.rm = FALSE, convert = FALSE)
Arguments:
data: data frame
key: column name representing new variable
value: column name representing variable values
...: names of columns to gather (or not gather)
na.rm: option to remove observations with missing values (represented by NAs)
convert: if TRUE will automatically convert values to logical, integer, numeric, complex or
factor as appropriate
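As a rough illustration of those arguments, here is gather() applied to a small wide table. The wide data frame is hand-built for this example (it mirrors two rows of a CONN/OWNER by TABLE layout) and is an assumption, not data from the question. Note that current tidyr releases mark gather() as superseded by pivot_longer(), though gather() is still available.

```r
library(tidyr)

# Hypothetical wide table: one row per CONN/OWNER, one column per TABLE
wide <- data.frame(
  CONN    = c("ONE", "ONE"),
  OWNER   = c("KIM", "MIKE"),
  TAB_A   = c(11L, NA),
  TAB_B   = c(14L, NA),
  TABLE_A = c(NA, 10L),
  TABLE_B = c(NA, 9L),
  stringsAsFactors = FALSE
)

# Gather every column except CONN and OWNER into key/value pairs:
# the old column names go into TABLE, the cell values into COLS,
# and na.rm = TRUE drops the combinations that were empty.
long <- gather(wide, key = "TABLE", value = "COLS", -CONN, -OWNER, na.rm = TRUE)
long
```

The result is back in long format, with one row per CONN, OWNER and TABLE combination that actually has a value.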
Can you show us what table you want? I usually find reshape2 more intuitive: is what you want similar to dcast(CONN + OWNER ~ TABLE, data = dat, value.var = "COLS")?
Thanks @VincentZooneky, that did it: dcast(OWNER + CONN ~ TABLE, data = g, value.var = "COLS"). If you would like to post it as an answer, I can accept it.