R: Off-by-one difference between count distinct in sqldf() and tidy code

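I am trying to count the number of distinct names in a dataset in R using the sqldf package, and I wanted to check my answer with tidy code. The two answers differ by one, and I can't tell what is causing it. Here is my code: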

library(readr) # read_csv() comes from readr
mayors <- read_csv(file="https://raw.githubusercontent.com/jmontgomery/jmontgomery.github.io/master/PDS/Datasets/Mayors.csv")
mayorsDF <- as.data.frame(mayors)

library(sqldf)
sqldf("select count(distinct FullName) from mayorsDF") # gives me 1406

allNamesDF <- sqldf("select distinct FullName from mayorsDF")
length(allNamesDF$FullName) # gives me 1407

library(tidyverse)
mayors %>% 
    select("FullName") %>%
    unique() %>%
    count() # gives me 1407

What am I missing? I'm new to the sqldf package, but not new to SQL.

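SQL does not count NULL as a distinct value in count(distinct ...), and your data contains NULLs. `select distinct` and `unique()`, by contrast, keep the missing value as one extra row, which accounts for the difference of one.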

sqldf::sqldf("select count(*) as n from mayorsDF where FullName is null")
   n
1 36
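
To see the same effect without downloading the mayors file, here is a minimal sketch on a made-up data frame. The `toy` data frame and its names are hypothetical, not taken from the original dataset; it assumes sqldf is installed with its default SQLite backend, where an R `NA` is stored as a SQL NULL.

```r
library(sqldf)

# Hypothetical toy data: two distinct names plus one missing value
toy <- data.frame(
  FullName = c("Ann Lee", "Bob Ray", "Ann Lee", NA),
  stringsAsFactors = FALSE
)

# COUNT(DISTINCT ...) skips NULL, so only the two real names are counted
n_sql <- sqldf("select count(distinct FullName) as n from toy")$n

# SELECT DISTINCT keeps NULL as its own row, so three rows come back
n_rows <- nrow(sqldf("select distinct FullName from toy"))

# unique() in R also treats NA as a value of its own
n_unique <- length(unique(toy$FullName))

# Dropping NA before deduplicating makes the R count match COUNT(DISTINCT ...)
n_no_na <- length(unique(toy$FullName[!is.na(toy$FullName)]))

print(c(n_sql = n_sql, n_rows = n_rows, n_unique = n_unique, n_no_na = n_no_na))
```

On the tidyverse side, `dplyr::n_distinct()` takes an `na.rm` argument, so `n_distinct(mayors$FullName, na.rm = TRUE)` should line up with the count(distinct ...) result.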
