Akka remote actor system timeout: "The remote system has quarantined this system"


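I have 3 Akka nodes (Akka 2.4.8 actor systems) that use remoting, not clustering. When a remote actor is created and runs a long task (30 minutes or more), I get this error from the remote actor system (on the remote machine): "The remote system has quarantined this system".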

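From the local system:

2016-08-11 03:29:12.748UTC WARN [PLM-akka.actor.default-dispatcher-27] RemoteWatcher | akka.tcp://PLM@flowsvr02:46407/system/remote-watcher | Detected unreachable: [akka.tcp://AS1@lxsvr01g:9500]
2016-08-11 03:29:12.787UTC WARN [PLM-akka.actor.default-dispatcher-14] Remoting | akka.remote.Remoting | Association to [akka.tcp://AS1@lxsvr01g:9500] with UID [1284261532] irrecoverably failed. UID is now quarantined and all messages to this UID will be delivered to dead letters. Remote actorsystem must be restarted to recover from this situation.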

From the remote system:

00:41:05.169UTC WARN [AS2-akka.actor.default-dispatcher-5] ReliableDeliverySupervisor | reliableEndpointWriter-akka.tcp%3A%2F%2FPLM%40flowsvr02%3A36210-0 | Association with remote system [akka.tcp://PLM@flowsvr02:36210] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
01:06:23.138UTC ERROR [AS2-akka.actor.default-dispatcher-17] EndpointWriter | /EndpointWriter-akka.tcp%3A%2F%2FPLM%40ftflowsvr02%3A36210-1 | AssociationError [akka.tcp://AS2@lxsvr02g:9500]

Can you post some of the code you are using, both local and remote? Without it, it is hard to say exactly what is going wrong.

@Amit Yadav Thanks, updated with code.

Please post application.conf as well.

Updated with the local system's application.conf.

Is it possible that the code you run in the actor's receive block is blocking the remote actor? Could you try running the computation in a Future block, so that it uses the executor pool and does not block the actor's heartbeats? Remember to bind val send = sender and, in future.onComplete, do send ! result. If this works for you, I can provide a full example.

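The suggestion in the last comment can be sketched roughly as follows. This is a minimal sketch, not the asker's code: `RunTask`, `TaskResult`, `LongTaskWorker` and `longComputation` are hypothetical names, and the fixed thread pool of size 2 is an arbitrary assumption. It uses `pipeTo` from `akka.pattern.pipe` in place of a hand-written `onComplete` callback; the sender is still captured in a local `val` first, because `sender()` must not be called from another thread.

```scala
import java.util.concurrent.Executors

import akka.actor.{Actor, ActorLogging}
import akka.pattern.pipe

import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService, Future}

// Hypothetical message types, not taken from the original post.
final case class RunTask(id: String)
final case class TaskResult(id: String, value: Long)

class LongTaskWorker extends Actor with ActorLogging {
  // Assumption: a small dedicated pool, so long tasks do not occupy
  // the threads of the actor system's default dispatcher.
  private val pool: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))
  private implicit val ec: ExecutionContext = pool

  def receive: Receive = {
    case RunTask(id) =>
      // Capture the sender before leaving the actor's thread.
      val replyTo = sender()
      log.info("Starting task {} off the actor thread", id)
      // pipeTo sends the result (or a Status.Failure) to replyTo on completion.
      Future(TaskResult(id, longComputation())).pipeTo(replyTo)
  }

  // Stand-in for the real 30-minute job.
  private def longComputation(): Long = { Thread.sleep(100); 42L }

  // Release the pool's threads when the actor stops.
  override def postStop(): Unit = pool.shutdown()
}
```

With this shape the actor returns from `receive` immediately and can keep processing other messages while the task runs. Whether that is enough to avoid the quarantine in this setup is not confirmed by the thread.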
      val remoteConfig = new RemotingConfig("application.conf")
      val plmRmRepo = new ResourceManagerDBHandler(config.getString("database.txs_db"))
      val remotingManager: ActorRef = system.actorOf(Props(new RemotingManager(plmRmRepo, remoteConfig, system)), name="RemotingManager")
      val rmWorker: ActorRef = createRemoteActor(request, rm)
      requestActor ! ResourceResponse(request.id, request.taskType, request.originalSender, Some(rmWorker))
      log.info(s"remote actor is created: $rmWorker")
      // Deploys an instance of the requested implementation class on the remote actor system.
      def createRemoteActor(request: ResourceRequest, rm: ResourceManagerClass): ActorRef = {
          log.info(s"RemotingManager: @${rm.nodeName} to create remote actor... ${request.implementation}")
          val delegateClass = Class.forName(request.implementation)
          val remoteASAddress = Address(rm.protocol, rm.nodeName, rm.host, rm.port)
          system.actorOf(Props(delegateClass).
               withDeploy(Deploy(scope = RemoteScope(remoteASAddress))))
      }
      import akka.actor.ActorSystem
      import com.typesafe.config.ConfigFactory

      // Entry point of the remote actor system.
      object RemoteMain extends App {
        //val config = ConfigFactory.load("remotesystem.conf")
        val config = ConfigFactory.load()
        val remoteSystemName = config.getString("RemoteSystem.nodeName")

        //create an actor system with that config
        val system = ActorSystem(remoteSystemName, config)
        implicit val executor = system.dispatcher

        //val defaultActor = system.actorOf(Props[RemoteActorSystem], remoteConfig.className)
        system.log.info("## Remote Manager Is Started ##")
      }
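Since the local log says the remote actor system must be restarted after a quarantine, it may help to make remoting events visible on each node. The following is a sketch under assumptions: `RemotingEventListener` and the actor name `remoting-event-listener` are hypothetical, and the listener only logs what it sees. It subscribes to `akka.remote.RemotingLifecycleEvent` on the event stream, which is how classic remoting publishes association and quarantine events.

```scala
import akka.actor.{Actor, ActorLogging, ActorRef, ActorSystem, Props}
import akka.remote.{QuarantinedEvent, RemotingLifecycleEvent}

// Hypothetical listener, not part of the original post.
class RemotingEventListener extends Actor with ActorLogging {
  // Receive every remoting lifecycle event published by this actor system.
  override def preStart(): Unit =
    context.system.eventStream.subscribe(self, classOf[RemotingLifecycleEvent])

  override def postStop(): Unit =
    context.system.eventStream.unsubscribe(self)

  def receive: Receive = {
    case q: QuarantinedEvent =>
      // Published on the system that quarantined a remote peer.
      log.error("Remote system quarantined: {}", q)
    case other: RemotingLifecycleEvent =>
      log.info("Remoting event: {}", other)
  }
}

object RemotingEventListener {
  // Starts the listener as a top-level actor of the given system.
  def start(system: ActorSystem): ActorRef =
    system.actorOf(Props[RemotingEventListener], "remoting-event-listener")
}
```

In `RemoteMain` this could be started with `RemotingEventListener.start(system)` right after the actor system is created. Reacting to the events (for example by triggering a restart) is left out here, because the thread does not say how the nodes are supervised.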
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = DEBUG
  #logging-filter = "akka.event.slf4j.Slf4jLoggingFilter"
  #log-config-on-start = on
  log-dead-letters = 10
  log-dead-letters-during-shutdown = on
  logger-startup-timeout = 30s
  actor {
    serializers {
      akka-containers = "akka.remote.serialization.MessageContainerSerializer"
      akka-misc = "akka.remote.serialization.MiscMessageSerializer"
      proto = "akka.remote.serialization.ProtobufSerializer"
      daemon-create = "akka.remote.serialization.DaemonMsgCreateSerializer"
    }

    serialization-bindings {
      "akka.actor.ActorSelectionMessage" = akka-containers
      # The classes akka.actor.Identify and akka.actor.ActorIdentity serialization/deserialization are required by
      # the cluster client to work.
      # For the purpose of preserving protocol backward compatibility, akka.actor.Identify and akka.actor.ActorIdentity
      # are still using java serialization by default.
      # Should java serialization be disabled, uncomment the following lines
      # "akka.actor.Identify" = akka-misc
      # "akka.actor.ActorIdentity" = akka-misc
      # Should java serialization be disabled, uncomment the following lines
      # "scala.Some" = akka-misc
      # "scala.None$" = akka-misc
      "akka.remote.DaemonMsgCreate" = daemon-create

      # Since akka.protobuf.Message does not extend Serializable but
      # GeneratedMessage does, need to use the more specific one here in order
      # to avoid ambiguity.
      "akka.protobuf.GeneratedMessage" = proto

      # Since com.google.protobuf.Message does not extend Serializable but
      # GeneratedMessage does, need to use the more specific one here in order
      # to avoid ambiguity.
      # This com.google.protobuf serialization binding is only used if the class can be loaded,
      # i.e. com.google.protobuf dependency has been added in the application project.
      "com.google.protobuf.GeneratedMessage" = proto

    }

    serialization-identifiers {
      "akka.remote.serialization.ProtobufSerializer" = 2
      "akka.remote.serialization.DaemonMsgCreateSerializer" = 3
      "akka.remote.serialization.MessageContainerSerializer" = 6
      "akka.remote.serialization.MiscMessageSerializer" = 16
    }

    debug {
      receive = on
      # enable DEBUG logging of all AutoReceiveMessages (Kill, PoisonPill, etc.)
      autoreceive = on
      # enable DEBUG logging of actor lifecycle changes
      lifecycle = on
      # enable DEBUG logging of unhandled messages
      unhandled = on
    }

    warn-about-java-serializer-usage = false

    provider = "akka.remote.RemoteActorRefProvider"
  }

  remote {
    # If this is "on", Akka will log all outbound messages at DEBUG level
    log-sent-messages = on
    # If this is "on", Akka will log all inbound messages at DEBUG level
    log-received-messages = on

    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = "ftflowsvr02"
      port = 9888
      tcp-keepalive = on      
    }
    transport-failure-detector {
      implementation-class = "akka.remote.DeadlineFailureDetector"
      heartbeat-interval = 5 s
      acceptable-heartbeat-pause = 300 s      
    }   

    watch-failure-detector {
      # FQCN of the failure detector implementation.
      # It must implement akka.remote.FailureDetector and have
      # a public constructor with a com.typesafe.config.Config and
      # akka.actor.EventStream parameter.
      implementation-class = "akka.remote.PhiAccrualFailureDetector"

      # How often keep-alive heartbeat messages should be sent to each connection.
      heartbeat-interval = 5 s

      # Defines the failure detector threshold.
      # A low threshold is prone to generate many wrong suspicions but ensures
      # a quick detection in the event of a real crash. Conversely, a high
      # threshold generates fewer mistakes but needs more time to detect
      # actual crashes.
      threshold = 300.0

      # Number of the samples of inter-heartbeat arrival times to adaptively
      # calculate the failure timeout for connections.
      max-sample-size = 200

      # Minimum standard deviation to use for the normal distribution in
      # AccrualFailureDetector. Too low standard deviation might result in
      # too much sensitivity for sudden, but normal, deviations in heartbeat
      # inter arrival times.
      min-std-deviation = 100 ms

      # Number of potentially lost/delayed heartbeats that will be
      # accepted before considering it to be an anomaly.
      # This margin is important to be able to survive sudden, occasional,
      # pauses in heartbeat arrivals, due to for example garbage collect or
      # network drop.
      acceptable-heartbeat-pause = 300 s


      # How often to check for nodes marked as unreachable by the failure
      # detector
      unreachable-nodes-reaper-interval = 5s

      # After the heartbeat request has been sent the first failure detection
      # will start after this period, even though no heartbeat message has
      # been received.
      expected-response-after = 5 s

    }

    retry-gate-closed-for = 60 s

    quarantine-after-silence = 5 d

    resend-interval = 5 s   

    resend-limit = 200  

    default-remote-dispatcher {
      type = Dispatcher
      executor = "fork-join-executor"
      fork-join-executor {
        # Min number of threads to cap factor-based parallelism number to
        parallelism-min = 2
        parallelism-max = 2
      }
    }   

  }
}
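The configuration above is the local system's file; each remote node loads its own application.conf, so the failure detector values in effect there may differ. A small helper like the following could print the values a node actually loaded. It is a sketch: `FailureDetectorSettings` and `describe` are hypothetical names, and only the three keys shown are read.

```scala
import java.util.concurrent.TimeUnit

import com.typesafe.config.Config

// Hypothetical helper, not part of the original post.
object FailureDetectorSettings {
  // Summarises the watch-failure-detector values found in a loaded Config.
  def describe(config: Config): String = {
    val fd = config.getConfig("akka.remote.watch-failure-detector")
    val interval = fd.getDuration("heartbeat-interval", TimeUnit.SECONDS)
    val pause = fd.getDuration("acceptable-heartbeat-pause", TimeUnit.SECONDS)
    val threshold = fd.getDouble("threshold")
    s"heartbeat-interval=${interval}s acceptable-heartbeat-pause=${pause}s threshold=$threshold"
  }
}
```

On a running node it could be called as `system.log.info(FailureDetectorSettings.describe(system.settings.config))`, once on the local system and once on each remote system, to compare the effective settings side by side.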