Unlist and concatenate in R
I would like to unnest (flatten?) a list column in a tibble and concatenate its text strings (comma separated). Example data:
library(tidyverse)
tibble(person = c("Alice", "Bob", "Mary"),
score = list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue"))
# A tibble: 3 x 2
person score
<chr> <list>
1 Alice <chr [3]>
2 Bob <chr [3]>
3 Mary <chr [1]>
Expected output:
tibble(person = c("Alice", "Bob", "Mary"),
score = c("Red, Green, Blue", "Orange, Green, Yellow", "Blue" ))
# A tibble: 3 x 2
person score
<chr> <chr>
1 Alice Red, Green, Blue
2 Bob Orange, Green, Yellow
3 Mary Blue
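I suspect there is a very neat tidyverse solution, but after a lot of searching I have not been able to find an answer; I suspect I am using the wrong search terms (unnest/concatenate). A tidyverse solution is preferred. Thanks.

A simple approach is to unnest the data into long format and then collapse it by group: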
library(dplyr)
df %>%
tidyr::unnest(score) %>%
group_by(person) %>%
summarise(score = toString(score))
# person score
# <chr> <chr>
#1 Alice Red, Green, Blue
#2 Bob Orange, Green, Yellow
#3 Mary Blue
You can do:
library(dplyr)
library(purrr)
df %>%
mutate(score = map_chr(score, toString))
# A tibble: 3 x 2
person score
<chr> <chr>
1 Alice Red, Green, Blue
2 Bob Orange, Green, Yellow
3 Mary Blue
If there are multiple list columns, you can do:
df <- tibble(person = c("Alice", "Bob", "Mary"),
score1 = list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue"),
score2 = rev(list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue")))
df %>%
mutate_if(is.list, ~ map_chr(.x, toString))
# A tibble: 3 x 3
person score1 score2
<chr> <chr> <chr>
1 Alice Red, Green, Blue Blue
2 Bob Orange, Green, Yellow Orange, Green, Yellow
3 Mary Blue Red, Green, Blue
Base R solution:
df$score <- sapply(df$score, toString)
df$score

Here is a one-size-fits-all approach that works with (recent) tidyverse and does not affect grouping:
df %>% mutate(across(where(is.list), ~ sapply(., toString)))
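
For comparison, the same idea (find every list column, then collapse each of its elements with toString) can be sketched in base R alone. This is only a minimal sketch: the example data frame below is hypothetical and mirrors the two-list-column example from the earlier answer, and is_list_col is a helper name introduced here for illustration.

```r
# Hypothetical example data: a data frame with two list columns
df <- data.frame(person = c("Alice", "Bob", "Mary"))
df$score1 <- list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue")
df$score2 <- rev(df$score1)

# Find the list columns without knowing their names in advance
is_list_col <- vapply(df, is.list, logical(1))

# Collapse each element of every list column into one comma-separated string
df[is_list_col] <- lapply(df[is_list_col], function(col) {
  vapply(col, toString, character(1))
})
```

vapply is used instead of sapply so that the result is guaranteed to be a character vector with one string per row.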
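Thanks for the quick response, this looks good! Maybe I should turn this into a whole new question, but: how can I generalise the approach above to unnest() all columns of a tibble that are lists? E.g. if my df is 100 columns wide and 12 of those columns are lists, how can I pull out those columns and send them to toString()?

@Simon In that case, I think a rowwise solution is best: use mutate_at, select all the columns in vars, and apply the function: df %>% rowwise() %>% mutate_at(vars(starts_with('score')), toString)

Is there a way to select only the columns that are lists? In a large df it can be hard to know in advance which columns are lists (without manually searching every column and taking notes). Using starts_with assumes that you know the names of all the list columns in advance. Thanks!

@Simon Yes, with mutate_if: df %>% rowwise() %>% mutate_if(is.list, toString)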