R: How to group character values
I have a data frame with two columns. One is the group name and the other contains the values for that group, as shown below. The actual list is much longer.
MyGroup hello
MyGroup goodbye
MyGroup bonjour
YourGroup red
YourGroup blue
YourGroup green
I would like to create output like the following:
MyGroup <- c("hello", "goodbye", "bonjour")
YourGroup <- c("red", "blue", "green")
You can use nest from tidyr:
library(tidyverse)
data_out <- data %>%
  group_by(groups) %>%
  nest()
You can then access your groups with:
data_out$data
#[[1]]
# A tibble: 3 x 1
# words
# <fct>
#1 hello
#2 goodbye
#3 bonjour
#[[2]]
# A tibble: 3 x 1
# words
# <fct>
#1 red
#2 blue
#3 green
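The nested tibbles above still wrap each group's values in a data frame. If plain vectors are what you need, as in the question, one possible follow-up is to pull the words column out of each nested tibble and name the result by group. This is only a minimal sketch: the column names groups and words are taken from the output above, while the sample data frame and the name word_list are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data; the column names 'groups' and 'words' are assumed
# to match the ones used in the answer above
data <- data.frame(
  groups = c(rep("MyGroup", 3), rep("YourGroup", 3)),
  words  = c("hello", "goodbye", "bonjour", "red", "blue", "green"),
  stringsAsFactors = FALSE
)

data_out <- data %>%
  group_by(groups) %>%
  nest()

# Pull the 'words' column out of each nested tibble and
# name each list element after its group
word_list <- setNames(
  lapply(data_out$data, function(d) d$words),
  data_out$groups
)

word_list$MyGroup
word_list$YourGroup
```

Keeping the vectors in one named list, rather than as separate variables, tends to scale better when the real data has many groups.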
You can use the split function to split the data by group, and then use unique to get the unique strings or values in each group:
# 'data' is the two-column data frame with columns 'groups' and 'words'
splited <- split(data, f = data$groups)
unique(splited$MyGroup$words)
unique(splited$YourGroup$words)
#> splited <- split(data, f = data$groups)
#> unique(splited$MyGroup$words)
#[1] hello bonjour
#Levels: blue bonjour green hello red
#> unique(splited$YourGroup$words)
#[1] red blue green
#Levels: blue bonjour green hello red
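The Levels: lines in the output above indicate that words is a factor, so unique returns a factor that still carries every level of the whole column. If you would rather have plain character vectors, one option is to wrap the result in as.character. A small sketch, assuming the data frame is called data with columns groups and words (the sample values are taken from the question; my_words and your_words are illustrative names):

```r
# Sample data built as factors to reproduce the situation above
data <- data.frame(
  groups = c(rep("MyGroup", 3), rep("YourGroup", 3)),
  words  = c("hello", "goodbye", "bonjour", "red", "blue", "green"),
  stringsAsFactors = TRUE
)

# One data frame per group
splited <- split(data, f = data$groups)

# unique() keeps the factor levels; as.character() drops them
my_words   <- as.character(unique(splited$MyGroup$words))
your_words <- as.character(unique(splited$YourGroup$words))

my_words
your_words
```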
You can do this with a simple lapply:
# Your data frame. 'stringsAsFactors = FALSE' is used for the sake of making it
# more generic
df <- data.frame(
x = c(rep("MyGroup", 3), rep("YourGroup", 3)),
y = c("hello", "goodbye", "bonjour", "red", "blue", "green"),
stringsAsFactors = FALSE)
# Makes a list for each group in column 1. The answer could be this line only
res <- lapply(unique(df[,1]), function(x) df[df[,1] == x, 2])
# Setting the names accordingly, for convenience
names(res) <- unique(df[,1])
print(res)
You can use split.
Data:
df1 <- read.table(header=FALSE,stringsAsFactors=FALSE,text="
MyGroup hello
MyGroup goodbye
MyGroup bonjour
YourGroup red
YourGroup blue
YourGroup green")
Assign to two variables in the current environment:
list2env(split(df1$V2,df1$V1),envir=environment())
MyGroup
# [1] "hello" "goodbye" "bonjour"
YourGroup
# [1] "red" "blue" "green"
For reference, this is what split returns before the assignment:
split(df1$V2,df1$V1)
# $MyGroup
# [1] "hello" "goodbye" "bonjour"
#
# $YourGroup
# [1] "red" "blue" "green"
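Writing straight into the current environment can silently overwrite existing objects that happen to share a group name. As a hedged alternative, the same list2env call can target a fresh environment instead. In this sketch grp_env is a hypothetical name, and df1 is rebuilt inline with data.frame so the example runs on its own.

```r
# Same data as df1 above, rebuilt here so the snippet is self-contained
df1 <- data.frame(
  V1 = c(rep("MyGroup", 3), rep("YourGroup", 3)),
  V2 = c("hello", "goodbye", "bonjour", "red", "blue", "green"),
  stringsAsFactors = FALSE
)

# list2env() returns the environment it filled, so keep a handle on it
grp_env <- list2env(split(df1$V2, df1$V1), envir = new.env())

ls(grp_env)                        # names of the variables that were created
grp_env$MyGroup                    # access with $
get("YourGroup", envir = grp_env)  # or with get()
```

This keeps the group variables out of the way of everything else in the session while still allowing lookup by name.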