tidyverse dplyr summarise not working as expected
We are analyzing columns in a SQL Server environment. We extract the column names and data types, then run a simple pipeline to see whether the same column name has mixed data types across different tables.
library(tidyverse)
DF = data.frame(COLUMN_NAME = c("PARTYID","PARTYID","AGE","AGE","SALESID","SALES"),
DATA_TYPE = c("char","tinyint","int","smallint","varchar","numeric"))
DF %>% group_by(COLUMN_NAME) %>%
summarise(mixedTypes = (any(grepl("char", DATA_TYPE)) &
!(all(grepl("char", DATA_TYPE)))))
All I get back is:
mixedTypes
1 TRUE
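But I expected to get back a subset of the data.frame, with both columns plus a new column named mixedTypes.
UPDATE: It was suggested that I use conflicts(). I don't know enough to interpret the detail=TRUE output: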
$.GlobalEnv
[1] "df"
$`package:forcats`
[1] "%>%" "%>%" "%>%" "%>%" "%>%"
$`package:purrr`
[1] "%>%" "%>%" "compact" "%>%" "%>%" "set_names" "%>%"
$`package:tidyr`
[1] "%>%" "%>%" "%>%" "%>%" "extract" "%>%"
$`package:plyr`
[1] "compact" "arrange" "count" "desc" "failwith" "id" "mutate" "rename" "summarise"
[10] "summarize" "is.discrete" "summarize"
$`package:stringr`
[1] "%>%" "%>%" "%>%" "%>%" "%>%"
$`package:tibble`
[1] "add_row" "as_data_frame" "as_tibble" "data_frame" "data_frame_" "frame_data" "glimpse" "lst"
[9] "lst_" "tbl_sum" "tibble" "tribble" "trunc_mat" "type_sum"
$`package:magrittr`
[1] "%>%" "%>%" "%>%" "%>%" "extract" "set_names" "%>%"
$`package:dplyr`
[1] "%>%" "%>%" "%>%" "%>%" "%>%" "add_row" "arrange" "as_data_frame"
[9] "as_tibble" "count" "data_frame" "data_frame_" "desc" "failwith" "frame_data" "glimpse"
[17] "id" "lst" "lst_" "mutate" "rename" "summarise" "summarize" "tbl_sum"
[25] "tibble" "tribble" "trunc_mat" "type_sum" "src" "summarize" "coalesce" "filter"
[33] "lag" "intersect" "setdiff" "setequal" "union"
$`package:Hmisc`
[1] "summarize" "is.discrete" "src" "summarize" "format.pval" "units"
$`package:ggplot2`
[1] "Position"
$`package:MyPackage`
[1] "coalesce" "HeatMap"
$`package:stats`
[1] "df" "filter" "lag"
$`package:methods`
[1] "body<-" "kronecker"
$`package:base`
[1] "body<-" "format.pval" "HeatMap" "intersect" "kronecker" "Position" "setdiff" "setequal" "union"
[10] "units"
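A note on reading this output: conflicts(detail = TRUE) returns a list with one element per search-path entry, in search order, and each element holds the names that also exist somewhere else on the path. For any given name, the entry nearest the top is the one a bare call uses. Here is a minimal sketch for narrowing the list to a single name. It assumes only dplyr is attached, so it looks up filter (which dplyr and stats both define); swap in "summarise" when plyr is attached as well.

```r
library(dplyr)

# Name to look up; "filter" is used here because dplyr and stats both define it.
# Replace with "summarise" in a session where plyr is also attached.
nm <- "filter"

# One list element per search-path entry, in search order.
cf <- conflicts(detail = TRUE)

# Keep only the entries that define `nm`; the first one is what a bare call uses.
hits <- Filter(function(nms) nm %in% nms, cf)
names(hits)
```

In the output above, package:plyr is listed before package:dplyr, so a bare summarise resolves to plyr's version.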
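As noted in the comments, the problem is that plyr's version of summarise was loaded after dplyr's, so when you call summarise you get the wrong one. You should try loading plyr first (or better, try not to load it at all), but you can also play it safe by explicitly specifying which version of summarise you want: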
library(tidyverse)
DF = data.frame(COLUMN_NAME = c("PARTYID","PARTYID","AGE","AGE","SALESID","SALES"),
DATA_TYPE = c("char","tinyint","int","smallint","varchar","numeric"))
# bad:
DF %>% group_by(COLUMN_NAME) %>%
plyr::summarise(mixedTypes = (any(grepl("char", DATA_TYPE)) &
!(all(grepl("char", DATA_TYPE)))))
# good:
DF %>% group_by(COLUMN_NAME) %>%
dplyr::summarise(mixedTypes = (any(grepl("char", DATA_TYPE)) &
!(all(grepl("char", DATA_TYPE)))))
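If it is unclear which version a bare summarise call will pick up in the current session, one way to check is to ask R which namespace owns the function. This is only a sketch and assumes dplyr is installed; the variable names owner and same are purely illustrative.

```r
library(dplyr)

# Namespace that owns the function a bare `summarise` call resolves to.
# "dplyr" is what the grouped pipeline needs; "plyr" means dplyr's version is masked.
owner <- environmentName(environment(summarise))
owner

# TRUE when the bare name and the namespaced function are the same object.
same <- identical(summarise, dplyr::summarise)
same
```

If owner comes back as "plyr", either restart R and attach the packages in a different order, or keep using the dplyr:: prefix.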
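If you really need to load plyr as well as dplyr, then doing this is a good idea, and you will also need to handle the other key conflicts, such as mutate. But it is best to avoid loading both at the same time.
What version of dplyr are you using? If I copy/paste the code above, that is not the result I get. Did you run this in a fresh R session? Is there anything suspicious in your conflicts()?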
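You are right. It works fine in a fresh R session. I don't know how to interpret the conflicts() output... there are multiple pipes (%>%) in forcats, purrr, tidyr, stringr, magrittr, and dplyr.
You have both package:plyr and package:dplyr loaded, which is generally not a good idea. But if you really do need both, make sure to load plyr before dplyr.
Are you loading plyr before dplyr? If so, this is possibly a duplicate.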
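The load-order advice from the answer and comments could look like the sketch below. It assumes both plyr and dplyr are installed and that the session is fresh; res is just an illustrative name.

```r
# Attach plyr first so that dplyr, attached second, masks the shared names
# (summarise, mutate, arrange, ...).
library(plyr)
library(dplyr)

DF <- data.frame(
  COLUMN_NAME = c("PARTYID", "PARTYID", "AGE", "AGE", "SALESID", "SALES"),
  DATA_TYPE   = c("char", "tinyint", "int", "smallint", "varchar", "numeric")
)

# A bare summarise now resolves to dplyr's version, so grouping is respected
# and the result has one row per COLUMN_NAME.
res <- DF %>%
  group_by(COLUMN_NAME) %>%
  summarise(mixedTypes = any(grepl("char", DATA_TYPE)) &
                         !all(grepl("char", DATA_TYPE)))
res
```

With this order the result should have four rows, and only PARTYID is flagged as mixed, since it is the only name that combines a char type with a non-char type.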