R: Generating columns from row values
I assume this question has already been answered somewhere, but I couldn't find it. I have the following data:
library(tidyverse)
glimpse(samp)
Observations: 5
Variables: 2
$ business_id <chr> "--6MefnULPED_I942VcFNA", "--9e1ONYQuAa-CB_Rrw7Tw", "--...
$ Ambience <chr> "romantic': False, 'intimate': False, 'classy': False, ...
But I get this error:
Error: Duplicate identifiers for rows (2, 3, 4, 5, 6, 7, 8), (10, 11, 12, 13, 14, 15, 16, 17), (19, 20, 21, 22, 23, 24, 25, 26), (28, 29, 30, 31, 32, 33, 34, 35), (37, 38, 39, 40, 41, 42, 43)
Call `rlang::last_error()` to see a backtrace
Here is the dput output:
structure(list(business_id = c("--6MefnULPED_I942VcFNA", "--9e1ONYQuAa-CB_Rrw7Tw",
"--cjBEbXMI2obtaRHNSFrA", "--cZ6Hhc9F7VkKXxHMVZSQ", "--DaPTJW3-tB1vP-PfdTEg"
), Ambience = c("romantic': False, 'intimate': False, 'classy': False, 'hipster': False, 'touristy': False, 'trendy': False, 'upscale': False, 'casual': True}",
"romantic': False, 'intimate': False, 'classy': True, 'hipster': False, 'divey': False, 'touristy': False, 'trendy': False, 'upscale': True, 'casual': False}",
"romantic': False, 'intimate': False, 'classy': False, 'hipster': False, 'divey': False, 'touristy': False, 'trendy': False, 'upscale': False, 'casual': False}",
"romantic': False, 'intimate': False, 'classy': False, 'hipster': False, 'divey': False, 'touristy': False, 'trendy': False, 'upscale': False, 'casual': True}",
"romantic': False, 'intimate': False, 'classy': False, 'hipster': False, 'touristy': False, 'trendy': False, 'upscale': False, 'casual': True}"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
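A little tidying of your data seems to do the trick. Specifically, remove the ' and } characters. There may also be a way to parse this with jsonlite, but I didn't look into it:
library(tidyverse)

samp %>%
  # drop the quotes and closing brace, then split each string at the commas
  mutate(Ambience = strsplit(str_remove_all(Ambience, '[\'|}]'), ",")) %>%
  # one row per "key: value" piece (naming the column avoids a warning in tidyr >= 1.0)
  unnest(Ambience) %>%
  mutate_at(vars(Ambience), str_trim) %>%
  # the default separator splits on the non-alphanumeric ": "
  separate(Ambience, into = c("key", "value")) %>%
  spread(key, value)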
# A tibble: 5 x 10
  business_id            casual classy divey hipster intimate romantic touristy trendy upscale
  <chr>                  <chr>  <chr>  <chr> <chr>   <chr>    <chr>    <chr>    <chr>  <chr>
1 --6MefnULPED_I942VcFNA True   False  NA    False   False    False    False    False  False
2 --9e1ONYQuAa-CB_Rrw7Tw False  True   False False   False    False    False    False  True
3 --cjBEbXMI2obtaRHNSFrA False  False  False False   False    False    False    False  False
4 --cZ6Hhc9F7VkKXxHMVZSQ True   False  False False   False    False    False    False  False
5 --DaPTJW3-tB1vP-PfdTEg True   False  NA    False   False    False    False    False  False
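The answer mentions jsonlite without trying it. One possible sketch of that route follows, assuming each string is a Python-style dict literal whose leading {' was lost (as the dput suggests). The helper to_json and the shortened sample vector ambience are hypothetical names made up for illustration, not part of the original question.

```r
library(jsonlite)
library(dplyr)

# Two shortened strings in the same shape as the Ambience column (illustrative data)
ambience <- c(
  "romantic': False, 'casual': True}",
  "romantic': False, 'divey': False, 'casual': False}"
)

# Hypothetical helper: turn a Python-style dict fragment into valid JSON text
to_json <- function(x) {
  x <- paste0("{'", x)                             # assumption: the leading {' is missing
  x <- gsub("'", "\"", x, fixed = TRUE)            # JSON requires double quotes
  x <- gsub("\\bTrue\\b", "true", x, perl = TRUE)  # Python booleans -> JSON booleans
  gsub("\\bFalse\\b", "false", x, perl = TRUE)
}

# Parse each string into a one-row data frame, then stack the rows;
# keys that are absent from a row come out as NA
parsed <- lapply(to_json(ambience), function(s) as.data.frame(fromJSON(s)))
flags <- bind_rows(parsed)
```

The resulting columns are logical rather than character, and they could be attached to the ids with something like bind_cols(samp["business_id"], flags), provided the row order has not been changed.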
If you want to treat NA as FALSE, you can always pass the fill = FALSE argument to spread().
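A minimal sketch of that idea, under the assumption that the values are first converted from the strings "True"/"False" to real logicals so that fill = FALSE has a matching type. The small samp tibble and the name wide are made up here to keep the example self-contained, and separate_rows() is swapped in for the strsplit()/unnest() step.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Illustrative stand-in for the real data (two shortened rows)
samp <- tibble(
  business_id = c("a", "b"),
  Ambience = c(
    "romantic': False, 'casual': True}",
    "romantic': False, 'divey': False, 'casual': False}"
  )
)

wide <- samp %>%
  # strip the quotes and the closing brace
  mutate(Ambience = str_remove_all(Ambience, "['}]")) %>%
  # one row per "key: value" piece
  separate_rows(Ambience, sep = ",\\s*") %>%
  separate(Ambience, into = c("key", "value"), sep = ":\\s*") %>%
  # convert the text to a logical so the fill value has the same type
  mutate(value = value == "True") %>%
  # keys missing for a business are filled with FALSE instead of NA
  spread(key, value, fill = FALSE)
```

Whether a missing key really means FALSE is a judgment about the data, so the NA version may be the safer default when that is unclear.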