dplyr 0.3.0.9000: how to use do() correctly
I am trying to reproduce the results of an SO question. Here is the data:
person = c('Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob')
foods = c('apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana')
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)
The original code that produced the result in that question is shown below; it no longer works:
> eaten %>% group_by(person) %>% do(function(x) combn(x$foods, m = 2))
Error: Results are not data frames at positions: 1, 2
I tried several ways of calling do(), but none of them worked:
> eaten %>% group_by(person) %>% do(combn(.$foods, m = 2))
Error: Results are not data frames at positions: 1, 2
> eaten %>% group_by(person) %>% do(.$foods, combn, m =2)
Error: Arguments to do() must either be all named or all unnamed
> eaten %>% group_by(person) %>% do((combn(.$foods, m=2)))
Error: Results are not data frames at positions: 1, 2
Only the following seems to work, and even then it produces warning messages:
> eaten %>% group_by(person) %>% do(as.data.frame(combn(.$foods, m = 2)))
# person V1 V2 V3
# 1 Grace apple apple banana
# 2 Grace banana cucumber cucumber
# 3 Rob spaghetti spaghetti cucumber
# 4 Rob cucumber banana banana
# Warning messages:
# 1: In rbind_all(out[[1]]) : Unequal factor levels: coercing to character
# 2: In rbind_all(out[[1]]) : Unequal factor levels: coercing to character
I believe the behaviour of do() must have changed in the new version. What changed, and what is the correct, idiomatic way to use do()? Thanks.
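Edit: I installed the latest dplyr and ran the code suggested by @hadley:
packageVersion("dplyr")
[1] '0.3.0.2'
eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))
# Source: local data frame [2 x 2]
# Groups:
#
#   person x
# 1 Grace
# 2 Rob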
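EDIT2: The column "x" needs to be extracted, as @hadley suggested:
eaten2 <- eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))
eaten2[["x"]]
# [[1]]
# [,1] [,2] [,3]
# [1,] "apple" "apple" "banana"
# [2,] "banana" "cucumber" "cucumber"
#
# [[2]]
# [,1] [,2] [,3]
# [1,] "spaghetti" "spaghetti" "cucumber"
# [2,] "cucumber" "banana" "banana"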
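Moving EDIT2 from the question into an answer so that the question can be closed:
With the latest dplyr (0.3.0.2+), the column "x" needs to be extracted, as @hadley suggested: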
eaten2 <- eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))
eaten2[["x"]]
# [[1]]
# [,1] [,2] [,3]
# [1,] "apple" "apple" "banana"
# [2,] "banana" "cucumber" "cucumber"
#
# [[2]]
# [,1] [,2] [,3]
# [1,] "spaghetti" "spaghetti" "cucumber"
# [2,] "cucumber" "banana" "banana"
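To make clear what the list column "x" holds, here is a minimal base R sketch that computes the same per-person matrices without do(). It swaps in split() and lapply() for the dplyr pipeline, so it is an illustration of the result rather than of do() itself; the name by_person is made up for this example.

```r
# Same data as in the question
person <- c('Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob')
foods <- c('apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana')
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)

# Split the foods by person, then build all pairs within each person.
# by_person is a hypothetical name used only in this sketch.
by_person <- lapply(split(eaten$foods, eaten$person), combn, m = 2)

# Each element is a character matrix, not a data frame
is.data.frame(by_person$Grace)
dim(by_person$Grace)
by_person$Rob
```

Because combn() returns a matrix rather than a data frame, it is consistent with the "Results are not data frames" error that the unnamed do() calls in the question produced, while the named form stores each matrix in a list column.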
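Obviously this depends on preference and on what the data is for, but I think one of the approaches above is a clever way to produce a usable, tidy data frame. With tidyr::gather, I feel this returns an object that makes clear who ate what in which meal, without having to extract anything:
# dplyr provides group_by(), do() and %>%; tidyr provides gather()
library(dplyr)
library(tidyr)
person = c( 'Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob' )
foods = c( 'apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana' )
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)
# Reshape the V1..V3 columns into long form; -1 keeps the first column (person) as is
eaten %>% group_by(person) %>% do(as.data.frame(combn(.$foods, m = 2))) %>% gather(meal, foods, -1)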
This returns:
# Groups: person [2]
person meal foods
<chr> <chr> <chr>
1 Grace V1 apple
2 Grace V1 banana
3 Rob V1 spaghetti
4 Rob V1 cucumber
5 Grace V2 apple
6 Grace V2 cucumber
7 Rob V2 spaghetti
8 Rob V2 banana
9 Grace V3 banana
10 Grace V3 cucumber
11 Rob V3 cucumber
12 Rob V3 banana
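I only tested this in dplyr 0.2, and I got the same warnings about unequal factor levels. To get rid of these warnings (at least in 0.2), you can change the do() call to:
do(as.data.frame(combn(.$foods, m = 2), stringsAsFactors = FALSE))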
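The effect of stringsAsFactors on the converted matrix can be checked in base R alone. The sketch below is only an illustration under that assumption; the names pairs, with_factors and as_strings are made up for the example. Note that in R 4.0.0 and later the default is already stringsAsFactors = FALSE, so the explicit argument mainly matters on older R versions.

```r
# All pairs of one person's foods, as a 2 x 3 character matrix
pairs <- combn(c('apple', 'banana', 'cucumber'), m = 2)

# Explicitly request factor columns (the old default behaviour)
with_factors <- as.data.frame(pairs, stringsAsFactors = TRUE)

# Keep the columns as plain character vectors
as_strings <- as.data.frame(pairs, stringsAsFactors = FALSE)

sapply(with_factors, class)
sapply(as_strings, class)

# With factors, each column carries its own set of levels
levels(with_factors$V1)
levels(with_factors$V2)
```

With factor columns, the levels differ from column to column and from group to group, which matches the "Unequal factor levels" warning seen when the per-group results are bound together.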
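- Hope it helps.
Putting the stringsAsFactors argument inside do() again looks very unidiomatic and odd. Anyway, I tried it, and it does solve the problem. Still, I wonder whether there is a proper, idiomatic way to use do(), and why the behaviour changed (or actually did not change)?
You need to name the argument: eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2))
@hadley, it doesn't work.
@KFB Extract the x column and you'll get what you want. With magrittr 1.5 you can also do eaten %>% group_by(person) %>% do(x = combn(.$foods, m = 2)) %$% x
@docendodiscimus, thanks for the idea!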