How do I resume RMI communication in Java when one program shuts down and restarts while the other keeps running?


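I have a lot of code at the moment, and it is hard to boil it all down to an SSCCE, but I may try to do that later if necessary.

Anyway, here is the gist: I have two processes communicating via RMI, and that works. However, I want to be able to continue communicating if the host process (JobViewer) exits and then comes back, all within the lifetime of the client process (Job).

Currently, whenever a Job starts, I save its bound name to a file, and the JobViewer opens this file on startup. That part works, and the bound names it reads are correct. However, every time I try to resume communication with a Job that I know is still running when the JobViewer restarts, I get a NotBoundException.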

My JobViewer implements an interface that extends Remote with the following methods:

public void registerClient(String bindedName, JobStateSummary jobSummary) throws RemoteException, NotBoundException;
public void giveJobStateSummary(JobStateSummary jobSummary) throws RemoteException;
public void signalEndOfClient(JobStateSummary jobSummary) throws RemoteException;
public JobStateSummary getJobStateSummary() throws RemoteException;
public void killRemoteJob() throws RemoteException;
public void stopRemoteJob() throws RemoteException;
public void resumeRemoteJob() throws RemoteException;
My Job also implements a different interface that extends Remote with the following methods:

public void registerClient(String bindedName, JobStateSummary jobSummary) throws RemoteException, NotBoundException;
public void giveJobStateSummary(JobStateSummary jobSummary) throws RemoteException;
public void signalEndOfClient(JobStateSummary jobSummary) throws RemoteException;
public JobStateSummary getJobStateSummary() throws RemoteException;
public void killRemoteJob() throws RemoteException;
public void stopRemoteJob() throws RemoteException;
public void resumeRemoteJob() throws RemoteException;
How can I accomplish this? Here is some of my current code that initializes RMI, in case it helps.

JobViewer side:

private Registry _registry;
// Set up RMI
_registry = LocateRegistry.createRegistry(2002);
_registry.rebind("JOBVIEWER_SERVER", this);
Job side:

private NiceRemoteJobMonitor _server;

Registry registry = LocateRegistry.getRegistry(hostName, port);
registry.rebind(_bindedClientName, this);
Remote remoteServer = registry.lookup(masterName);

_server = (NiceRemoteJobMonitor)remoteServer;
_server.registerClient(_bindedClientName, _jobStateSummary);
"Every time I try to resume communication with a Job that I know is still running when the JobViewer restarts, I get a NotBoundException."

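That can only happen if the JobViewer did not rebind when it started up. The more common failure is a NoSuchObjectException, which you get when you use a stale stub, i.e. a stub whose remote object has exited. In that case you should reacquire the stub, i.e. redo the lookup.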
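As a rough illustration of "redo the lookup", here is a minimal sketch of a wrapper that caches a stub and fetches a fresh one when the cached stub turns out to be stale. The names Pingable, StubLookup, and RefreshingStub are hypothetical and not part of the question's code. In real use the lookup would presumably be something like `LocateRegistry.getRegistry(host, port).lookup(name)` cast to the remote interface.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical remote interface with a single ping() method.
interface Pingable extends Remote {
    void ping() throws RemoteException;
}

// Hypothetical abstraction over "look the remote object up again".
interface StubLookup {
    Pingable lookup() throws RemoteException, NotBoundException;
}

class RefreshingStub {
    private final StubLookup lookup;
    private Pingable stub; // cached stub; may become stale when the server restarts

    RefreshingStub(StubLookup lookup) {
        this.lookup = lookup;
    }

    void ping() throws RemoteException, NotBoundException {
        if (stub == null) {
            stub = lookup.lookup();
        }
        try {
            stub.ping();
        } catch (NoSuchObjectException stale) {
            // The remote object behind the cached stub is gone:
            // look it up again and retry once with the fresh stub.
            stub = lookup.lookup();
            stub.ping();
        }
    }
}
```

If the second attempt also fails, the exception propagates to the caller, which can then decide whether to wait and try again.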


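Why is the client binding itself to the registry? If you want to register a callback, just pass it to the registerClient method instead of the bound name, and adjust the method's signature accordingly, using the client's remote interface as the parameter type. There is no need for the server to look the client up in a registry. There is no need for a client registry at all.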
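A minimal sketch of the callback approach described above, under the assumption that the job exports itself and hands its own stub to the viewer. All type names here (JobCallback, CallbackJobMonitor, CallbackJobViewer, SimpleJob, CallbackDemo) and the describe() method are hypothetical stand-ins for the question's interfaces. The main method runs both sides in one JVM over loopback purely for demonstration.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface implemented by each job.
interface JobCallback extends Remote {
    String describe() throws RemoteException;
}

// Hypothetical variant of the viewer's interface: it takes the callback itself,
// not a name to be looked up in a registry.
interface CallbackJobMonitor extends Remote {
    void registerClient(JobCallback client) throws RemoteException;
}

class CallbackJobViewer implements CallbackJobMonitor {
    final List<JobCallback> clients = new CopyOnWriteArrayList<>();

    @Override
    public void registerClient(JobCallback client) {
        // When called over RMI, 'client' is a stub that points back at the job.
        clients.add(client);
    }
}

class SimpleJob implements JobCallback {
    private final String name;

    SimpleJob(String name) {
        this.name = name;
    }

    @Override
    public String describe() {
        return "job " + name;
    }
}

class CallbackDemo {
    public static void main(String[] args) throws Exception {
        CallbackJobViewer viewer = new CallbackJobViewer();
        // Only the viewer has to be findable by name; here we use its stub directly.
        CallbackJobMonitor viewerStub =
                (CallbackJobMonitor) UnicastRemoteObject.exportObject(viewer, 0);

        SimpleJob job = new SimpleJob("A");
        // The job exports itself but never binds itself in any registry.
        JobCallback jobStub = (JobCallback) UnicastRemoteObject.exportObject(job, 0);

        viewerStub.registerClient(jobStub); // the job passes its own stub as the callback
        System.out.println(viewer.clients.get(0).describe());

        // Unexport so the JVM can exit.
        UnicastRemoteObject.unexportObject(job, true);
        UnicastRemoteObject.unexportObject(viewer, true);
    }
}
```

With this shape, a restarted viewer has no client names to look up. Each running job re-registers by calling registerClient again once it can reach the viewer.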

My solution was to have the Job ping the JobViewer every so often:

  while (true) {
    try {
      _server.ping();
      // If control reaches here, we successfully pinged the job monitor.
    } catch (Exception e) {
      System.out.println("Job lost contact with the job monitor at " + new Date() + " ...");

      // If control reaches here, we were unable to ping the job monitor.
      // Now we loop until it presumably comes back to life.
      boolean foundServer = false;
      while (!foundServer) {
        try {
          // Attempt to register again.
          Registry registry = LocateRegistry.getRegistry(_hostName, _port);
          registry.rebind(_bindedClientName, NiceSupervisor.this);
          Remote remoteServer = registry.lookup(_masterName);
          _server = (NiceRemoteJobMonitor) remoteServer;
          _server.registerClient(_bindedClientName, _jobStateSummary);

          // Ping the server for good measure.
          _server.ping();

          System.out.println("Job reconnected with the job monitor at " + new Date() + " ...");

          // If control reaches here, we reconnected to the job monitor and pinged it again.
          foundServer = true;
        } catch (Exception x) {
          System.out.println("Job still cannot contact the job monitor at " + new Date() + " ...");
        }

        // Sleep for 1 minute before we try to locate the registry again.
        try {
          Thread.sleep(PING_WAIT_TIME);
        } catch (InterruptedException x) {
          // Interruption is ignored here; the loop simply continues.
        }
      } // End of loop that runs until we find the server again.
    }

    // Sleep for 1 minute after we ping the server before we try again.
    try {
      Thread.sleep(PING_WAIT_TIME);
    } catch (InterruptedException e) {
      // Interruption is ignored here; the loop simply continues.
    }
  } // End of endless loop that we never exit.

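Indiscriminately retrying on arbitrary exceptions cannot be considered a solution. You should retry on most but not all RemoteExceptions. Various other exceptions (such as NOE) indicate a bug in your code, not a need to retry. And I repeat: the client does not need to bind itself to the registry at all.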
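One way to make the retry selective is sketched below. Which exceptions count as "the viewer is down or restarting" is an assumption made here for illustration, not something the comment specifies: this sketch retries only on ConnectException, ConnectIOException, NoSuchObjectException, and NotBoundException, and rethrows everything else immediately. ReconnectAction and SelectiveRetry are hypothetical names.

```java
import java.rmi.ConnectException;
import java.rmi.ConnectIOException;
import java.rmi.NoSuchObjectException;
import java.rmi.NotBoundException;
import java.rmi.RemoteException;

// Hypothetical functional interface for "look up the viewer and register with it".
interface ReconnectAction {
    void run() throws RemoteException, NotBoundException;
}

class SelectiveRetry {
    // Assumption: these failures usually mean the other side is down or restarting.
    static boolean isRetryable(Exception e) {
        return e instanceof ConnectException
                || e instanceof ConnectIOException
                || e instanceof NoSuchObjectException
                || e instanceof NotBoundException;
    }

    // Runs the action until it succeeds and returns the number of attempts used.
    // Non-retryable exceptions, and the last failure once attempts run out, are rethrown.
    static int reconnect(ReconnectAction action, int maxAttempts, long waitMillis)
            throws RemoteException, NotBoundException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                action.run();
                return attempt;
            } catch (RemoteException | NotBoundException e) {
                if (!isRetryable(e) || attempt == maxAttempts) {
                    throw e;
                }
                Thread.sleep(waitMillis);
            }
        }
    }
}
```

Runtime exceptions such as NullPointerException are not caught at all, so a programming error surfaces immediately instead of being retried forever. InterruptedException is also propagated, which lets the caller shut the loop down.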