Segmentation fault when scheduling an R script with Airflow

I use Airflow (in a Docker container) to run an R script, and I get the following error:

[2019-09-19 07:03:26,500] {{bash_operator.py:127}} INFO -  *** caught segfault ***
[2019-09-19 07:03:26,500] {{bash_operator.py:127}} INFO - address 0x55cf00000000, cause 'memory not mapped'
[2019-09-19 07:03:26,501] {{bash_operator.py:127}} INFO - 
[2019-09-19 07:03:26,501] {{bash_operator.py:127}} INFO - Traceback:
[2019-09-19 07:03:26,501] {{bash_operator.py:127}} INFO -  1: is.data.frame(x)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO -  2: FUN(X[[i]], ...)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO -  3: lapply(.x, .f, ...)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO -  4: map(result, subset_rows, i)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO -  5: `[.tbl_df`(x, ind, , drop = FALSE)
[2019-09-19 07:03:26,503] {{bash_operator.py:127}} INFO -  6: x[ind, , drop = FALSE]
[2019-09-19 07:03:26,503] {{bash_operator.py:127}} INFO -  7: FUN(X[[i]], ...)
[2019-09-19 07:03:26,504] {{bash_operator.py:127}} INFO -  8: lapply(split(x = seq_len(nrow(x)), f = f, drop = drop, ...),     function(ind) x[ind, , drop = FALSE])
[2019-09-19 07:03:26,505] {{bash_operator.py:127}} INFO -  9: split.data.frame(es6, (0:(nrow(es6) - 1)%/%50))
[2019-09-19 07:03:26,505] {{bash_operator.py:127}} INFO - 10: split(es6, (0:(nrow(es6) - 1)%/%50))
[2019-09-19 07:03:26,506] {{bash_operator.py:127}} INFO - An irrecoverable exception occurred. R is aborting now ...
[2019-09-19 07:03:27,087] {{bash_operator.py:127}} INFO - /tmp/airflowtmpuj8lcw3e/web_etl_bf_10_days7wpo7bvb: line 1:  1140 Segmentation fault      (core dumped) Rscript /usr/local/airflow/dags/scripts/r/etl_web_api_by_create_time.R -d "2019-09-05 00:00:00+00:00"
[2019-09-19 07:03:27,088] {{bash_operator.py:131}} INFO - Command exited with return code 139

The code that fails is
split(es6, (0:(nrow(es6)-1)%/%50))
The data frame es6 has about 1096 rows and 20 columns.
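
For reference, here is a minimal sketch of what that expression does, using a synthetic stand-in for es6 (the real data is not shown in the post, so the values below are made up). Integer division of the zero-based row index by 50 gives one group id per row, and split() then cuts the data frame into chunks of at most 50 rows.

```r
# Synthetic stand-in for es6: 1096 rows, 20 numeric columns (hypothetical data).
set.seed(1)
es6 <- as.data.frame(matrix(rnorm(1096 * 20), nrow = 1096, ncol = 20))

# Zero-based row index %/% 50 yields group ids 0, 0, ..., 1, 1, ..., 21.
# `:` binds tighter than `%/%`, so the extra parentheses only aid readability.
group_id <- (0:(nrow(es6) - 1)) %/% 50

# split() dispatches to split.data.frame and returns a named list of data frames.
chunks <- split(es6, group_id)

n_chunks <- length(chunks)
chunk_rows <- vapply(chunks, nrow, integer(1))
print(n_chunks)
print(unname(chunk_rows))
```

With 1096 rows this gives 22 chunks: 21 chunks of 50 rows and a final chunk of 46 rows.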

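I cannot reproduce the error reliably: sometimes the task succeeds and sometimes it fails. (When I run the same code through RStudio Server, it works.)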
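
Because the same code works under RStudio Server but fails intermittently under Airflow, one thing worth ruling out is a difference between the two environments. The following is a minimal sketch, assuming it is placed near the top of the script: it prints the R version and the versions of a few packages so the two logs can be compared. The package list is a guess based on the traceback (`[.tbl_df` belongs to tibble, and the map() call looks like purrr); adjust it to whatever the script actually loads.

```r
# Packages to report; this list is an assumption, not taken from the script.
pkgs <- c("tibble", "purrr")

# Return the installed version as a string, or NA if the package is missing.
pkg_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Named character vector: R version, platform, then one entry per package.
env_info <- c(
  R = R.version.string,
  platform = R.version$platform,
  vapply(pkgs, pkg_version, character(1))
)

# Print one "name: value" line per entry to stdout.
cat(paste0(names(env_info), ": ", env_info), sep = "\n")
```

If the versions differ between the container and the RStudio Server session, that difference is a candidate explanation to investigate, not a confirmed cause.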

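I suspect that the server running out of memory may be the cause. My Linux server has 8 GB of RAM in total. When I checked memory while the task was running, about 1700 MB was free (according to the free -m command).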
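
One way to test the memory hypothesis from inside the script is to log how much memory R itself is using just before the failing call. This is a minimal sketch, with a small synthetic data frame standing in for es6 (an assumption, since the real object is not shown):

```r
# Synthetic stand-in for es6 (hypothetical data).
es6 <- as.data.frame(matrix(0, nrow = 1096, ncol = 20))

# Size of the object that is about to be split, in bytes.
es6_bytes <- as.numeric(object.size(es6))

# gc() returns a matrix; its second column is the memory currently in use,
# in Mb, for cons cells (Ncells) and the vector heap (Vcells).
mem <- gc()
r_used_mb <- sum(mem[, 2])

cat(sprintf("es6: %.2f MB, R memory in use: %.1f MB\n",
            es6_bytes / 1024^2, r_used_mb))
```

A 1096 x 20 data frame of plain numbers is only about 0.17 MB (the real es6 may be larger if it holds strings). If the figures logged under Airflow are of that order, exhausting the 1700 MB reported by free -m seems unlikely to be the cause on its own.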

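I searched the internet, and some people suggest that this kind of error may be caused by a bug in the function being called, in this case split.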

Edit:

After changing the call to
split(as.data.frame(es6), (0:(nrow(es6)-1)%/%50))
the new log is:

[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO -  *** caught segfault ***
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO - address 0x55f600000000, cause 'memory not mapped'
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO - 
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO - Traceback:
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO -  1: dim(xj)
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO -  2: `[.data.frame`(x, ind, , drop = FALSE)
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO -  3: x[ind, , drop = FALSE]
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO -  4: FUN(X[[i]], ...)
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO -  5: lapply(split(x = seq_len(nrow(x)), f = f, drop = drop, ...),     function(ind) x[ind, , drop = FALSE])
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO -  6: split.data.frame(as.data.frame(es6), (0:(nrow(es6) - 1)%/%50))
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO -  7: split(as.data.frame(es6), (0:(nrow(es6) - 1)%/%50))
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - An irrecoverable exception occurred. R is aborting now ...
[2019-09-19 09:17:23,179] {{bash_operator.py:127}} INFO - /tmp/airflowtmp9jy2rurg/web_etl_bf_7_days4x9d7xqz: line 1:  1220 Segmentation fault      (core dumped) Rscript /usr/local/airflow/dags/scripts/r/etl_web_api_by_create_time.R -d "2019-09-10 00:00:00+00:00"
[2019-09-19 09:17:23,179] {{bash_operator.py:131}} INFO - Command exited with return code 139

The traceback suggests that the segfault occurs when is.data.frame is called. That function is extremely unlikely to be the culprit. However, it looks like you are actually working with a tibble? If there is a bug, it is most likely related to that. Can you try split(as.data.frame(es6), (0:(nrow(es6)-1)%/%50))?

@Roland After adding as.data.frame, one run succeeded and one failed. I edited the post to add the new log.