Apache Spark: Why does Spark standalone worker node 1 terminate after "RECEIVED SIGNAL 15: SIGTERM"?

Note: this error is raised before Spark executes any components.

Logs

Worker node 1:

17/05/18 23:12:52 INFO Worker: Successfully registered with master spark://spark-master-1.com:7077  
17/05/18 23:58:41 ERROR Worker: RECEIVED SIGNAL 15: SIGTERM

Master node:

17/05/18 23:12:52 INFO Master: Registering worker spark-worker-1com:56056 with 2 cores, 14.5 GB RAM
17/05/18 23:14:20 INFO Master: Registering worker spark-worker-2.com:53986 with 2 cores, 14.5 GB RAM
17/05/18 23:59:42 WARN Master: Removing spark-worker-1com-56056 because we got no heartbeat in 60 seconds
17/05/18 23:59:42 INFO Master: Removing spark-worker-2.com:56056
17/05/19 00:00:03 ERROR Master: RECEIVED SIGNAL 15: SIGTERM

Worker node 2:

17/05/18 23:14:20 INFO Worker: Successfully registered with master spark://spark-master-node-2.com:7077
17/05/18 23:59:40 ERROR Worker: RECEIVED SIGNAL 15: SIGTERM

TL;DR I think someone explicitly ran the kill command or sbin/stop-worker.sh (named sbin/stop-slave.sh in Spark releases before 3.1).

"RECEIVED SIGNAL 15: SIGTERM" is logged by a signal handler that Spark registers on UNIX-like systems to report TERM, HUP, and INT signals:

  /** Register a signal handler to log signals on UNIX-like systems. */
  def registerLogger(log: Logger): Unit = synchronized {
    if (!loggerRegistered) {
      Seq("TERM", "HUP", "INT").foreach { sig =>
        SignalUtils.register(sig) {
          log.error("RECEIVED SIGNAL " + sig)
          false
        }
      }
      loggerRegistered = true
    }
  }

In your case, this means the process received SIGTERM, a request to stop itself:

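The SIGTERM signal is a generic signal used to cause program termination. Unlike SIGKILL, this signal can be blocked, handled, and ignored. It is the normal way to politely ask a program to terminate.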
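
To make the "can be handled" part concrete, here is a minimal shell sketch, not taken from Spark. A child process traps TERM, writes a log line that mimics the one in the question, and exits cleanly. The names `log_file`, `child_pid`, `child_status`, and `logged` are made up for this illustration.

```shell
#!/bin/sh
# Sketch: a child process traps SIGTERM, logs it, and exits on its own terms.
log_file=$(mktemp)

(
  # Handler runs when TERM arrives; the message only imitates Spark's log line.
  trap 'echo "RECEIVED SIGNAL 15: SIGTERM" >> "$log_file"; exit 0' TERM
  while :; do sleep 1; done
) &
child_pid=$!

sleep 1                 # give the child time to install its trap
kill "$child_pid"       # plain kill sends SIGTERM (signal 15) by default
wait "$child_pid"
child_status=$?         # exit status chosen by the trap handler

logged=$(cat "$log_file")
rm -f "$log_file"
echo "child logged: $logged (exit status $child_status)"
```

Had the parent used `kill -9` (SIGKILL) instead, the trap would never run and nothing would be logged, which is why a SIGTERM line in the Worker log points to an ordinary `kill` or a stop script rather than a forced kill.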

That is the signal sent when you run kill or use the sbin/stop-master.sh and sbin/stop-worker.sh shell scripts, which in turn call sbin/spark-daemon.sh with the stop command:


kill "$TARGET_ID" && rm -f "$pid"

Thanks, Jacek :)
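
For illustration, the PID-file pattern behind the `kill "$TARGET_ID" && rm -f "$pid"` line can be sketched as follows. This is a simplified stand-in, not a copy of Spark's script. `stop_daemon`, `pid_file`, and the background `sleep` acting as the daemon are hypothetical, and the `kill -0` liveness check is an assumption chosen for brevity.

```shell
#!/bin/sh
# Hypothetical stand-in for a daemon: a background sleep whose PID is recorded in a file.
pid_file=$(mktemp)
sleep 60 &
echo $! > "$pid_file"

# Hypothetical helper (not part of Spark): stop the process named in a PID file.
stop_daemon() {
  if [ -f "$1" ]; then
    target_id=$(cat "$1")
    # kill -0 delivers no signal; it only checks that the process exists.
    if kill -0 "$target_id" 2>/dev/null; then
      echo "stopping $target_id"
      # Plain kill sends SIGTERM; the PID file is removed only if kill succeeded.
      kill "$target_id" && rm -f "$1"
    else
      echo "no process $target_id to stop"
    fi
  else
    echo "no PID file at $1"
  fi
}

stop_daemon "$pid_file"
wait 2>/dev/null        # reap the terminated background process
```

Any script or operator following this pattern makes the target process log the SIGTERM line, so checking shell history, cron jobs, or cluster-management tooling for such a call around 23:58 to 00:00 is a reasonable next step.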