R warning messages: pairwise_count function
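Tags: r, tidyverse, tidytext

I am trying to use the pairwise_count function from the widyr package. In particular, consider this line of code, where data is a tibble that includes the columns "word" and "section":

data %>% pairwise_count(word, section, sort = TRUE)

However, I receive the following warning messages:

distinct_() is deprecated as of dplyr 0.7.0. Please use distinct() instead.
tbl_df() is deprecated as of dplyr 1.0.0. Please use tibble::as_tibble() instead.

I suspect that the pairwise_count function in the widyr package calls some outdated functions, and that this causes the warnings. Is there a more up-to-date package or function in the tidyverse that I can use instead? Otherwise, is there a way to use this function without triggering these warnings?

The code in the widyr section of Chapter 4 of Text Mining with R generates deprecation messages for its use of the distinct_() and tbl_df() functions. Since Chapter 4 of the book contains over 100 lines of code, we'll cut it down to the relevant section, along with the minimum set of packages needed to reproduce the warning messages.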
library(dplyr)
library(janeaustenr)
library(tidytext)

austen_section_words <- austen_books() %>%
  filter(book == "Pride & Prejudice") %>%
  mutate(section = row_number() %/% 10) %>%
  filter(section > 0) %>%
  unnest_tokens(word, text) %>%
  filter(!word %in% stop_words$word)

austen_section_words

library(widyr)

# count words co-occurring within sections
word_pairs <- austen_section_words %>%
  pairwise_count(word, section, sort = TRUE)

word_pairs
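To make concrete what "counting words co-occurring within sections" means, here is a minimal base R sketch on a small, made-up data frame. The toy data and the variable names (toy, incidence, cooccur, pairs) are assumptions for illustration only, and this is not widyr's implementation: it simply counts, for each pair of words, how many sections contain both.

```r
# Toy data (hypothetical): one row per word occurrence, tagged with its section
toy <- data.frame(
  section = c(1, 1, 1, 2, 2, 3),
  word    = c("darcy", "elizabeth", "darcy", "darcy", "elizabeth", "jane"),
  stringsAsFactors = FALSE
)

# Word-by-section incidence matrix: 1 if the word appears in the section at all,
# so repeated occurrences within one section are counted once
incidence <- 1 * (unclass(table(toy$word, toy$section)) > 0)

# Entry [i, j] is the number of sections that contain both word i and word j
cooccur <- incidence %*% t(incidence)

# Drop the diagonal (a word paired with itself) and list the pairs in long form
diag(cooccur) <- 0
pairs <- as.data.frame(as.table(cooccur), stringsAsFactors = FALSE)
names(pairs) <- c("item1", "item2", "n")
pairs <- pairs[pairs$n > 0, ]
pairs <- pairs[order(-pairs$n), ]
pairs
```

Each pair shows up twice, once in each order, which is also the pattern visible in the word_pairs output for the real data.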
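When we print the warning messages with lifecycle::last_warnings(), we can see the source of the warnings.

…and the output when the call is wrapped in suppressWarnings():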
> suppressWarnings(
+   # count words co-occurring within sections
+   word_pairs <- austen_section_words %>%
+     pairwise_count(word, section, sort = TRUE))
>
> word_pairs
# A tibble: 796,008 x 3
   item1     item2         n
   <chr>     <chr>     <dbl>
 1 darcy     elizabeth   144
 2 elizabeth darcy       144
 3 miss      elizabeth   110
 4 elizabeth miss        110
 5 elizabeth jane        106
 6 jane      elizabeth   106
 7 miss      darcy        92
 8 darcy     miss         92
 9 elizabeth bingley      91
10 bingley   elizabeth    91
# … with 795,998 more rows
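suppressWarnings() silences every warning raised by the wrapped call, including ones you might want to see. As a hedged alternative, the base R sketch below muffles only warnings that carry a particular class and lets everything else through. The helper name suppress_deprecated and the stand-in function noisy_call are hypothetical, and the class name "lifecycle_warning_deprecated" is an assumption about how the lifecycle package tags its deprecation warnings; check it against your installed version.

```r
# Stand-in (hypothetical) for a call that emits a deprecation warning plus an
# unrelated warning. The class name below is an assumption, not a confirmed API.
noisy_call <- function() {
  warning(structure(
    class = c("lifecycle_warning_deprecated", "warning", "condition"),
    list(message = "`old_fn()` is deprecated.", call = NULL)
  ))
  warning("something else went wrong")
  "result"
}

# Muffle only warnings that inherit from `class`; other warnings pass through
suppress_deprecated <- function(expr, class = "lifecycle_warning_deprecated") {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (inherits(w, class)) invokeRestart("muffleWarning")
    }
  )
}

# Record the warnings that still get through, to show the filter is selective
seen <- character()
value <- withCallingHandlers(
  suppress_deprecated(noisy_call()),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

value
seen
```

With the real data, the same helper could in principle wrap the pipeline, e.g. suppress_deprecated(austen_section_words %>% pairwise_count(word, section, sort = TRUE)), assuming the class name matches.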
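It would be really helpful if you included the data and code in this post instead of sending people to another website.

@Ronaksha I have updated the question with the relevant lines of code.

@Ronaksha means that you should include a reproducible example in the question, not just the line of code that generates the warning messages. That said, my answer includes one.

See Julia Silge's comment in the linked GitHub issue.

@phiver, if I understand the GitHub issue correctly, the linked issue refers to an error that occurs when trying to compute distinct() on a variable not found in the data, which was resolved by tidytext 0.1.9.9. As of September 20, 2020, the latest source code for pairwise_count() on GitHub still uses distinct_(), as I noted in my answer, so updating to the development version of widyr will not eliminate the warning messages.

Correct. Julia said you will still see the warning about distinct_().

The tidyverse packages are nice, but there are too many interdependencies :-(.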
#' @rdname pairwise_count
#' @export
pairwise_count_ <- function(tbl, item, feature, wt = NULL, ...) {
  if (is.null(wt)) {
    func <- squarely_(function(m) m %*% t(m), sparse = TRUE, ...)
    wt <- "..value"
  } else {
    func <- squarely_(function(m) m %*% t(m > 0), sparse = TRUE, ...)
  }

  tbl %>%
    distinct_(.dots = c(item, feature), .keep_all = TRUE) %>%
    mutate(..value = 1) %>%
    func(item, feature, wt) %>%
    rename(n = value)
}
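For comparison, here is a sketch of how the deprecated distinct_(.dots = ...) call in the source above could be written with the current distinct() when the column names arrive as strings. It is an illustration on made-up data, not a patch taken from widyr; the toy tibble and the deduped name are hypothetical.

```r
library(dplyr)

# Hypothetical toy input with a duplicated (word, section) combination
toy <- tibble(
  word    = c("darcy", "darcy", "elizabeth"),
  section = c(1, 1, 1),
  extra   = c("a", "b", "c")
)

# Column names held as strings, as item and feature are in pairwise_count_()
item <- "word"
feature <- "section"

# Non-deprecated counterpart of
#   distinct_(.dots = c(item, feature), .keep_all = TRUE)
# rlang::syms() turns the strings into symbols, and !!! splices them in
deduped <- toy %>%
  distinct(!!!rlang::syms(c(item, feature)), .keep_all = TRUE)

deduped
```

The duplicated ("darcy", 1) row collapses to its first occurrence, and .keep_all = TRUE retains the other columns.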
<deprecated>
message: `tbl_df()` is deprecated as of dplyr 1.0.0.
Please use `tibble::as_tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.
backtrace:
9. widyr::pairwise_count(., word, section, sort = TRUE)
10. widyr::pairwise_count_(...)
3. dplyr::distinct_(., .dots = c(item, feature), .keep_all = TRUE)
3. dplyr::mutate(., ..value = 1)
10. widyr:::func(., item, feature, wt)
19. widyr:::new_f(tbl, item, feature, value, ...)
7. widyr:::custom_melt(.)
15. dplyr::tbl_df(.)
>
library(widyr)

suppressWarnings(
  # count words co-occurring within sections
  word_pairs <- austen_section_words %>%
    pairwise_count(word, section, sort = TRUE))
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] tidytext_0.2.5 janeaustenr_0.1.5 widyr_0.1.3 tidyr_1.1.1
[5] dplyr_1.0.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 rstudioapi_0.11 magrittr_1.5 tidyselect_1.1.0
[5] lattice_0.20-41 R6_2.4.1 rlang_0.4.7 fansi_0.4.1
[9] stringr_1.4.0 tools_4.0.2 grid_4.0.2 packrat_0.5.0
[13] broom_0.7.0 utf8_1.1.4 cli_2.0.2 ellipsis_0.3.1
[17] assertthat_0.2.1 tibble_3.0.3 lifecycle_0.2.0 crayon_1.3.4
[21] Matrix_1.2-18 purrr_0.3.4 vctrs_0.3.2 tokenizers_0.2.1
[25] SnowballC_0.7.0 glue_1.4.1 stringi_1.4.6 compiler_4.0.2
[29] pillar_1.4.6 generics_0.0.2 backports_1.1.8 pkgconfig_2.0.3