R: How to group characters


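I have a data frame with two columns. One is the group name and the other contains the group's values, as shown below. The actual list is much longer.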

MyGroup   hello
MyGroup   goodbye
MyGroup   bonjour
YourGroup red
YourGroup blue
YourGroup green
I would like to create output like this:

MyGroup<-c("hello","goodbye","bonjour")
YourGroup<-c("red","blue","green")

You can use nest from tidyr:

library(tidyverse)

data_out <- data %>% 
  group_by(groups) %>% 
  nest()
You can then access your groups with:

data_out$data

#[[1]]
# A tibble: 3 x 1
#  words  
#  <fct>  
#1 hello  
#2 goodbye
#3 bonjour

#[[2]]
# A tibble: 3 x 1
#  words
#  <fct>
#1 red  
#2 blue 
#3 green
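
If plain character vectors are wanted rather than one-column tibbles, the nested column can be unpacked into a named list. This is a minimal sketch, assuming the data frame is called data and has columns groups and words as in the snippet above; the sample data and the name vectors_by_group are illustrative only.

```r
library(dplyr)
library(tidyr)

# Sample data mirroring the question (column names are assumed)
data <- tibble(
  groups = c(rep("MyGroup", 3), rep("YourGroup", 3)),
  words  = c("hello", "goodbye", "bonjour", "red", "blue", "green")
)

data_out <- data %>%
  group_by(groups) %>%
  nest()

# Pull the 'words' column out of each nested tibble and name it by group
vectors_by_group <- setNames(
  lapply(data_out$data, function(d) d$words),
  data_out$groups
)

vectors_by_group$MyGroup
```

Each element of vectors_by_group can then be used like the vectors requested in the question, for example vectors_by_group$YourGroup.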

You can use the split function to split the data by group, and then use unique to get the unique strings or values in each group:

splited <- split(data, f = data$groups)
unique(splited$MyGroup$words)
unique(splited$YourGroup$words)

#> splited <- split(data, f = data$groups)
#> unique(splited$MyGroup$words)
#[1] hello   goodbye bonjour
#Levels: blue bonjour goodbye green hello red
#> unique(splited$YourGroup$words)
#[1] red   blue  green
#Levels: blue bonjour goodbye green hello red
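
Splitting the value column directly, rather than the whole data frame, handles every group in one step. The following is a sketch under the assumption that the data has columns groups and words stored as factors, as the Levels lines in the output suggest; as.character() is used here to drop the unused factor levels, and unique_by_group is an illustrative name.

```r
# Sample data stored as factors (column names are assumed)
data <- data.frame(
  groups = c(rep("MyGroup", 3), rep("YourGroup", 3)),
  words  = c("hello", "goodbye", "bonjour", "red", "blue", "green"),
  stringsAsFactors = TRUE
)

# Split the words by group, then keep the unique values as plain strings
unique_by_group <- lapply(
  split(as.character(data$words), data$groups),
  unique
)

unique_by_group$YourGroup
```

The result is a named list, so individual groups are available as unique_by_group$MyGroup and unique_by_group$YourGroup.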

You can achieve this with a simple lapply:

# Your data frame. 'stringsAsFactors = FALSE' is used for the sake of making it
# more generic
df <- data.frame(
    x = c(rep("MyGroup", 3), rep("YourGroup", 3)),
    y = c("hello", "goodbye", "bonjour", "red", "blue", "green"),
    stringsAsFactors = FALSE)

# Makes a list for each group in column 1. The answer could be this line only
res <- lapply(unique(df[,1]), function(x) df[df[,1] == x, 2])
# Setting the names accordingly, for convenience
names(res) <- unique(df[,1])
print(res)
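
The lapply approach can be wrapped in a small function that takes column names. The sketch below uses a hypothetical helper, group_values, which is not part of the original answer. One difference worth knowing: lapply over unique() keeps groups in order of first appearance, whereas split() orders them by sorted factor level.

```r
# Hypothetical helper: group the values of one column by another column,
# keeping groups in order of first appearance
group_values <- function(df, group_col, value_col) {
  groups <- unique(df[[group_col]])
  res <- lapply(groups, function(g) df[[value_col]][df[[group_col]] == g])
  names(res) <- groups
  res
}

# Illustrative data whose groups are not in alphabetical order
df <- data.frame(
  x = c("Zed", "Zed", "Alpha"),
  y = c("one", "two", "three"),
  stringsAsFactors = FALSE
)

by_appearance <- group_values(df, "x", "y")
by_level      <- split(df$y, df$x)

names(by_appearance)  # groups in order of first appearance
names(by_level)       # groups in sorted level order
```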
You can use split:

Data

df1 <- read.table(header=FALSE,stringsAsFactors=FALSE,text="
MyGroup   hello
MyGroup   goodbye
MyGroup   bonjour
YourGroup red
YourGroup blue
YourGroup green")
Assign the groups to two variables in the current environment:

list2env(split(df1$V2,df1$V1),envir=environment())

MyGroup
# [1] "hello"   "goodbye" "bonjour"

YourGroup
# [1] "red"   "blue"  "green"
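
Writing into the current environment can silently overwrite existing objects that share a group's name. As a hedged alternative, the same list2env call can target a dedicated environment; group_env below is an illustrative name, not something from the original answer.

```r
df1 <- read.table(header = FALSE, stringsAsFactors = FALSE, text = "
MyGroup   hello
MyGroup   goodbye
MyGroup   bonjour
YourGroup red
YourGroup blue
YourGroup green")

# Keep the group variables in their own environment
group_env <- new.env()
invisible(list2env(split(df1$V2, df1$V1), envir = group_env))

ls(group_env)      # names of the variables that were created
group_env$MyGroup  # access one group's values
```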
On its own, split returns a named list:
split(df1$V2,df1$V1)
# $MyGroup
# [1] "hello"   "goodbye" "bonjour"
# 
# $YourGroup
# [1] "red"   "blue"  "green"