Error in R when connecting sparklyr to local Spark
I am trying to run sparklyr from my local environment in order to replicate the production setup. However, I cannot even get started. I successfully installed the latest version of Spark using spark_install(), but when I try to run spark_connect() I get a vague and unhelpful error:
> library(sparklyr)
> spark_installed_versions()
spark hadoop dir
1 2.3.1 2.7 C:\\Users\\...\\AppData\\Local/spark/spark-2.3.1-bin-hadoop2.7
> spark_connect(master = "local")
Error in if (is.na(a)) return(-1L) : argument is of length zero
Here is my session info:
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sparklyr_0.8.4.9003
loaded via a namespace (and not attached):
[1] Rcpp_0.12.17 dbplyr_1.2.1 compiler_3.5.0 pillar_1.2.3 later_0.7.3
[6] plyr_1.8.4 bindr_0.1.1 base64enc_0.1-3 tools_3.5.0 digest_0.6.15
[11] jsonlite_1.5 tibble_1.4.2 nlme_3.1-137 lattice_0.20-35 pkgconfig_2.0.1
[16] rlang_0.2.1 psych_1.8.4 shiny_1.1.0 DBI_1.0.0 rstudioapi_0.7
[21] yaml_2.1.19 parallel_3.5.0 bindrcpp_0.2.2 stringr_1.3.1 dplyr_0.7.5
[26] httr_1.3.1 rappdirs_0.3.1 rprojroot_1.3-2 grid_3.5.0 tidyselect_0.2.4
[31] glue_1.2.0 R6_2.2.2 foreign_0.8-70 reshape2_1.4.3 purrr_0.2.5
[36] tidyr_0.8.1 magrittr_1.5 backports_1.1.2 promises_1.0.1 htmltools_0.3.6
[41] assertthat_0.2.0 mnormt_1.5-5 mime_0.5 xtable_1.8-2 httpuv_1.4.3
[46] config_0.3 stringi_1.1.7 lazyeval_0.2.1 broom_0.4.4
Well, after a bit of guessing, I finally solved my problem. I had to manually set the "SPARK_HOME" environment variable:
spark_installed_versions()[1, 3] %>% spark_home_set()
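For reference, here is a fuller sketch of the same fix, assuming the Spark 2.3.1 / Hadoop 2.7 install shown by spark_installed_versions() above. Setting SPARK_HOME directly with Sys.setenv() should be equivalent to calling sparklyr's spark_home_set() helper:

```r
library(sparklyr)

# Resolve the install directory of the Spark build found earlier
# (2.3.1 / Hadoop 2.7, per spark_installed_versions() above).
spark_dir <- spark_installed_versions()$dir[1]

# Either set the environment variable directly...
Sys.setenv(SPARK_HOME = spark_dir)

# ...or use sparklyr's helper, which records the same path:
spark_home_set(spark_dir)

# With SPARK_HOME resolved, the local connection should succeed:
sc <- spark_connect(master = "local")
spark_disconnect(sc)
```

Alternatively, passing the version explicitly, e.g. spark_connect(master = "local", version = "2.3.1"), lets sparklyr locate the matching installed build without touching SPARK_HOME.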