R: problem splitting a string into long format

Starting from this data frame:

dftest <- data.frame(id = c(1), text = c("java-ee?jsf?omnifaces?jpa"), stringsAsFactors = FALSE)

I use the following commands to reshape it:

s2 <- strsplit(dftest$text, split = "?")
dftest2 <- data.frame(id = rep(dftest$id, sapply(s2, length)), text = unlist(s2))

dflike_final <- reshape(dftest2, idvar = "id", timevar = "text", direction = "wide")

How can I fix this so that I get the whole strings?

We can split `text` into separate rows, create a dummy column (`n`), and use `pivot_wider` to get the data in wide format:

library(dplyr)
library(tidyr)

dftest %>%
  separate_rows(text, sep = "\\?") %>%
  mutate(n = 1) %>%
  pivot_wider(values_from = n, names_from = text)

# A tibble: 1 x 5
#     id `java-ee`   jsf omnifaces   jpa
#  <dbl>     <dbl> <dbl>     <dbl> <dbl>
#1     1         1     1         1     1
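
With more than one `id`, tags that are missing for a given row come out as `NA` after `pivot_wider`. The sketch below shows one way to turn those into 0 with `values_fill`. The second row of the `tags` data frame is made up for illustration and is not part of the original question.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: the second row is invented to show a missing tag
tags <- data.frame(
  id = c(1, 2),
  text = c("java-ee?jsf?omnifaces?jpa", "jsf?jpa"),
  stringsAsFactors = FALSE
)

wide <- tags %>%
  separate_rows(text, sep = "\\?") %>%   # one row per tag
  mutate(n = 1) %>%                      # indicator value to spread
  pivot_wider(
    names_from = text,
    values_from = n,
    values_fill = list(n = 0)            # absent tags become 0 instead of NA
  )

wide
```

The named-list form of `values_fill` is used here because it should work in older and newer tidyr releases alike.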

`?` is a special character in regular expressions. You need to escape it, or use `strsplit(dftest$text, split = "?", fixed = TRUE)`.
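
As a minimal sketch of the options just described, the three calls below all treat `?` as a literal character. The variable names (`x`, `escaped`, `literal`, `bracket`) are only illustrative, and the character-class form `[?]` is an extra alternative not mentioned in the answer.

```r
x <- "java-ee?jsf?omnifaces?jpa"

# Option 1: escape the quantifier so the regex engine matches a literal "?"
escaped <- strsplit(x, split = "\\?")[[1]]

# Option 2: skip regex interpretation entirely
literal <- strsplit(x, split = "?", fixed = TRUE)[[1]]

# Option 3: a character class also makes "?" literal
bracket <- strsplit(x, split = "[?]")[[1]]

# All three should yield the same four tags
identical(escaped, literal) && identical(escaped, bracket)
```

`fixed = TRUE` is usually the simplest choice when the separator is a plain string and no pattern matching is needed.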

With the unescaped `split = "?"`, the original commands split the string into single characters:

   id text
1   1         j
2   1         a
3   1         v
4   1         a
5   1         -
6   1         e
7   1         e
8   1         ?
9   1         j
10  1         s
11  1         f
12  1         ?
13  1         o
14  1         m
15  1         n
16  1         i
17  1         f
18  1         a
19  1         c
20  1         e
21  1         s
22  1         ?
23  1         j
24  1         p
25  1         a

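# Escape "?" so it is matched literally instead of being parsed as a regex quantifier
s2 <- strsplit(dftest$text, split = "\\?")
# Repeat each id once per extracted tag; n = 1 is the value column that reshape() spreads
dftest2 <- data.frame(id = rep(dftest$id, lengths(s2)), text = unlist(s2), n = 1)
# One row per id, one column per tag (named n.<tag>)
dflike_final <- reshape(dftest2, idvar = "id", timevar = "text", direction = "wide")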
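
`reshape()` prefixes each new column with the value column's name, so the base R result has columns such as `n.java-ee` rather than `java-ee`. If the bare tag names are preferred, one possible cleanup is sketched below. The object name `wide` and the `sub()` step are additions for illustration, not part of the original answer.

```r
dftest <- data.frame(id = 1, text = "java-ee?jsf?omnifaces?jpa",
                     stringsAsFactors = FALSE)

# Split on a literal "?" and build the long table with an indicator column
s2 <- strsplit(dftest$text, split = "?", fixed = TRUE)
dftest2 <- data.frame(id = rep(dftest$id, lengths(s2)),
                      text = unlist(s2), n = 1,
                      stringsAsFactors = FALSE)

wide <- reshape(dftest2, idvar = "id", timevar = "text", direction = "wide")

# Drop the "n." prefix that reshape() adds to every spread column
names(wide) <- sub("^n\\.", "", names(wide))
names(wide)
```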