How to use do() correctly in dplyr 0.3.0.9000


I am trying to reproduce the result of an SO question.

Here is the data:

person = c('Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob')
foods = c('apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana')
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)
The original code that produced the result there, shown below, no longer works:

> eaten %>% group_by(person) %>% do(function(x) combn(x$foods, m = 2))
Error: Results are not data frames at positions: 1, 2
I tried several ways of calling do(), none of which worked:

> eaten %>% group_by(person) %>% do(combn(.$foods, m = 2))
Error: Results are not data frames at positions: 1, 2

> eaten %>% group_by(person) %>% do(.$foods, combn, m =2)
Error: Arguments to do() must either be all named or all unnamed

> eaten %>% group_by(person) %>% do((combn(.$foods, m=2)))
Error: Results are not data frames at positions: 1, 2
Only the following seems to work, and it emits warning messages:

> eaten %>% group_by(person) %>% do(as.data.frame(combn(.$foods, m = 2)))
#   person        V1        V2       V3
# 1  Grace     apple     apple   banana
# 2  Grace    banana  cucumber cucumber
# 3    Rob spaghetti spaghetti cucumber
# 4    Rob  cucumber    banana   banana
# Warning messages:
# 1: In rbind_all(out[[1]]) : Unequal factor levels: coercing to character
# 2: In rbind_all(out[[1]]) : Unequal factor levels: coercing to character
I believe the behaviour of do() must have changed in the new version. What changed? What is the proper idiom for using do()? Thanks.

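EDIT: I installed the latest dplyr and ran the code suggested by @hadley:
packageVersion("dplyr")
[1] ‘0.3.0.2’
eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))
# Source: local data frame [2 x 2]
# Groups:
#
#   person x
# 1  Grace
# 2    Rob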
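EDIT2: As @hadley suggested, the column "x" needs to be extracted:
eaten2 <- eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))
eaten2[["x"]]
# [[1]]
#      [,1]     [,2]       [,3]      
# [1,] "apple"  "apple"    "banana"  
# [2,] "banana" "cucumber" "cucumber"
# 
# [[2]]
#      [,1]        [,2]        [,3]      
# [1,] "spaghetti" "spaghetti" "cucumber"
# [2,] "cucumber"  "banana"    "banana"  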

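Moving EDIT2 from the question into an answer so that the question can be closed:

For the latest dplyr (0.3.0.2+), the column "x" needs to be extracted, as @hadley suggested: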

eaten2 <- eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))
eaten2[["x"]]
# [[1]]
#      [,1]     [,2]       [,3]      
# [1,] "apple"  "apple"    "banana"  
# [2,] "banana" "cucumber" "cucumber"
# 
# [[2]]
#      [,1]        [,2]        [,3]      
# [1,] "spaghetti" "spaghetti" "cucumber"
# [2,] "cucumber"  "banana"    "banana"  
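
The error messages in the question suggest the rule at work: an unnamed argument to do() has to return a data frame for every group, while a named argument can return any object, which is then stored in a list column. The following is a minimal sketch of both forms, assuming a dplyr version in which do() is still available; the object names `unnamed`, `named` and `pairs_by_person` are made up for illustration.

```r
library(dplyr)

eaten <- data.frame(
  person = c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob"),
  foods  = c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana"),
  stringsAsFactors = FALSE
)

# Unnamed argument: the expression must return a data frame per group;
# the per-group data frames are then row-bound into one result.
unnamed <- eaten %>%
  group_by(person) %>%
  do(as.data.frame(combn(.$foods, m = 2), stringsAsFactors = FALSE))

# Named argument: the expression may return any object (here a matrix);
# each result becomes one element of a list column called "x".
named <- eaten %>%
  group_by(person) %>%
  do(x = combn(.$foods, m = 2))

# Label the extracted matrices by person so they can be looked up by name.
pairs_by_person <- setNames(named[["x"]], named[["person"]])
pairs_by_person[["Rob"]]
```

Naming the list elements is optional; it only saves having to remember which position belongs to which person.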

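Obviously this is a matter of preference and of what the data will be used for, but I think one of the attempts above is a clever way to produce a usable, tidy data frame. Using tidyr::gather, I feel this returns an object that makes clear who ate what in which meal, without having to extract anything: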

library(dplyr)
library(tidyr)

person = c('Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob')
foods = c('apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana')
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)
eaten %>% group_by(person) %>% do(as.data.frame(combn(.$foods, m = 2))) %>% gather(meal, foods, -1)
which returns:

# Groups:   person [2]
   person meal  foods    
   <chr>  <chr> <chr>    
 1 Grace  V1    apple    
 2 Grace  V1    banana   
 3 Rob    V1    spaghetti
 4 Rob    V1    cucumber 
 5 Grace  V2    apple    
 6 Grace  V2    cucumber 
 7 Rob    V2    spaghetti
 8 Rob    V2    banana   
 9 Grace  V3    banana   
10 Grace  V3    cucumber 
11 Rob    V3    cucumber 
12 Rob    V3    banana   
> 
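
For comparison, the same long layout (one row per person, meal and food) can be sketched in base R without dplyr or tidyr. This is only an illustration of the shape of the result, not the method used above; the names `by_person` and `tidy` are hypothetical.

```r
person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods  <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten  <- data.frame(person, foods, stringsAsFactors = FALSE)

# Split the foods by person, then build one long data frame per person.
by_person <- split(eaten$foods, eaten$person)

tidy <- do.call(rbind, lapply(names(by_person), function(p) {
  m <- combn(by_person[[p]], m = 2)          # 2 x 3 matrix, one column per pair
  data.frame(
    person = p,
    meal   = paste0("V", as.vector(col(m))), # column index becomes the meal label
    foods  = as.vector(m),                   # matrix read column by column
    stringsAsFactors = FALSE
  )
}))

tidy
```

The rows come out ordered by person and then by meal, which differs from the ordering of the gather() result shown above, although the content is the same.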

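I only tested this in dplyr 0.2, and I got the same warnings about unequal factor levels. To get rid of these warnings (at least in 0.2), you can change the do call to:
do(as.data.frame(combn(.$foods, m = 2), stringsAsFactors = FALSE))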
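
The warning is easier to understand by looking at what as.data.frame() produces for a single group. Below is a small base R sketch, with stringsAsFactors set explicitly in both calls because its default differs between R versions; the variable names are made up for illustration.

```r
grace <- combn(c("apple", "banana", "cucumber"), m = 2)
rob   <- combn(c("spaghetti", "cucumber", "banana"), m = 2)

# With stringsAsFactors = TRUE every column of the character matrix
# becomes a factor whose levels are the values seen in that group only.
grace_factor <- as.data.frame(grace, stringsAsFactors = TRUE)
rob_factor   <- as.data.frame(rob, stringsAsFactors = TRUE)

levels(grace_factor$V1)  # levels taken from Grace's pairs only
levels(rob_factor$V1)    # levels taken from Rob's pairs only

# With stringsAsFactors = FALSE the columns stay character vectors,
# so there are no levels to reconcile when the groups are combined.
grace_char <- as.data.frame(grace, stringsAsFactors = FALSE)
sapply(grace_char, class)
```

Because the two groups end up with different level sets for the same column, combining them forces a coercion to character, which is what the rbind_all warning reports.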
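Hope it helps.

It looks very unidiomatic and odd to have to put the stringsAsFactors argument inside do() again. Anyway, I tried it, and it does solve the problem. Still, I would like to know whether there is a proper idiomatic way to use do(), and why the behaviour changed (or actually did not change).

You need to name the argument:
eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))

@hadley, it doesn't work.

@KFB Extract the x column and you'll get what you want. With magrittr 1.5, you can also do
eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2)) %$% x

@docendodiscimus, thanks for the idea!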
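
As a side note on the %$% operator mentioned in the comments: it exposes the columns of the left-hand side to the expression on the right, and it is not attached by loading dplyr alone, so magrittr has to be loaded explicitly. A minimal sketch on a made-up data frame (`scores` and its columns are hypothetical):

```r
library(magrittr)

scores <- data.frame(
  name  = c("a", "b", "c"),
  value = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# The right-hand side is evaluated with the columns of `scores` in scope.
total  <- scores %$% sum(value)

# A bare column name simply extracts that column.
values <- scores %$% value
```

Appending `%$% x` to the do() pipeline therefore plays the same role as indexing the result with [["x"]].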