How do I resume RMI communication in Java when one program is shut down and restarted while the other keeps running?
I have a lot of code right now and it would be hard to boil it all down to an SSCCE, though I may try to do that later if necessary.
Here is the gist: I have two processes communicating over RMI, and that works. However, I want communication to continue if the host process (JobViewer) exits and comes back, all within the lifetime of the client process (Job).
Currently, whenever a Job starts I save its bound name to a file, and the JobViewer opens this file on startup. That part works: the correct bound name is read back. However, every time I try to resume communication with a Job that I know is still running when the JobViewer restarts, I get a NotBoundException.
My JobViewer implements an interface that extends Remote with the following methods:
public void registerClient(String bindedName, JobStateSummary jobSummary) throws RemoteException, NotBoundException;
public void giveJobStateSummary(JobStateSummary jobSummary) throws RemoteException;
public void signalEndOfClient(JobStateSummary jobSummary) throws RemoteException;
My Job also implements a different interface that extends Remote with the following methods:
public JobStateSummary getJobStateSummary() throws RemoteException;
public void killRemoteJob() throws RemoteException;
public void stopRemoteJob() throws RemoteException;
public void resumeRemoteJob() throws RemoteException;
How can I do this? Below is some of the code I currently use to initialize RMI, in case it helps.
JobViewer side:
private Registry _registry;
// Set up RMI
_registry = LocateRegistry.createRegistry(2002);
_registry.rebind("JOBVIEWER_SERVER", this);
Job side:
private NiceRemoteJobMonitor _server;
Registry registry = LocateRegistry.getRegistry(hostName, port);
registry.rebind(_bindedClientName, this);
Remote remoteServer = registry.lookup(masterName);
_server = (NiceRemoteJobMonitor)remoteServer;
_server.registerClient(_bindedClientName, _jobStateSummary);
"Every time I try to resume communication with a Job that I know is still running when the JobViewer restarts, I get a NotBoundException."
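That can only happen if the JobViewer did not rebind itself when it started up. More commonly, you get a NoSuchObjectException when you use a stale stub, i.e. a stub whose remote object has exited. In that case you should reacquire the stub, i.e. redo the lookup.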
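
The "redo the lookup" advice can be sketched roughly as follows. This is only an illustration under stated assumptions: `StubRefresher`, `StubSource`, and `RemoteCall` are hypothetical names that do not appear in the question, and treating `ConnectException` as a second sign of a stale stub is an assumption about how a restarted JobViewer might show up on the Job side.

```java
import java.rmi.ConnectException;
import java.rmi.NoSuchObjectException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;

/** Hypothetical helper: caches a stub and re-fetches it once when it proves stale. */
class StubRefresher<T extends Remote> {

    /** Supplies a fresh stub, typically through a registry lookup. */
    interface StubSource<T> {
        T fetch() throws RemoteException, NotBoundException;
    }

    /** A single remote call made against the stub. */
    interface RemoteCall<T, R> {
        R apply(T stub) throws RemoteException;
    }

    private final StubSource<T> source;
    private T stub;

    StubRefresher(StubSource<T> source) {
        this.source = source;
    }

    /** Builds a refresher whose source is a lookup in the registry at host:port. */
    static <T extends Remote> StubRefresher<T> fromRegistry(
            String host, int port, String name, Class<T> type) {
        return new StubRefresher<T>(() -> {
            Registry registry = LocateRegistry.getRegistry(host, port);
            return type.cast(registry.lookup(name));
        });
    }

    /** Runs the call; on a stale-stub exception, looks the stub up again and retries once. */
    synchronized <R> R call(RemoteCall<T, R> remoteCall)
            throws RemoteException, NotBoundException {
        if (stub == null) {
            stub = source.fetch();
        }
        try {
            return remoteCall.apply(stub);
        } catch (NoSuchObjectException | ConnectException stale) {
            // The remote object behind the cached stub is gone: discard it and look up again.
            stub = null;
            stub = source.fetch();
            return remoteCall.apply(stub);
        }
    }
}
```

With the names from the question, the Job side might create the helper with `StubRefresher.fromRegistry(hostName, port, masterName, NiceRemoteJobMonitor.class)` and route its remote calls through `call(...)`. Any other exception, including a second failure after the fresh lookup, propagates to the caller.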
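Why does the client bind itself to the registry at all? If you want to register a callback, just pass it to the registerClient method instead of a bound name, and adjust that method's signature accordingly, using the client's remote interface as the parameter type. There is no need for the server to look the client up in a registry. The client does not need a registry entry at all.
My solution was to have the Job ping the JobViewer every so often: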
while (true) {
    try {
        _server.ping();
        // If control reaches here, we successfully pinged the job monitor.
    } catch (Exception e) {
        System.out.println("Job lost contact with the job monitor at " + new Date().toString() + " ...");
        // If control reaches here, we were unable to ping the job monitor. Now we loop until it presumably comes back to life.
        boolean foundServer = false;
        while (!foundServer) {
            try {
                // Attempt to register again.
                Registry registry = LocateRegistry.getRegistry(_hostName, _port);
                registry.rebind(_bindedClientName, NiceSupervisor.this);
                Remote remoteServer = registry.lookup(_masterName);
                _server = (NiceRemoteJobMonitor) remoteServer;
                _server.registerClient(_bindedClientName, _jobStateSummary);
                // Ping the server for good measure.
                _server.ping();
                System.out.println("Job reconnected with the job monitor at " + new Date().toString() + " ...");
                // If control reaches here, we reconnected to the job monitor and pinged it again.
                foundServer = true;
            } catch (Exception x) {
                System.out.println("Job still cannot contact the job monitor at " + new Date().toString() + " ...");
            }
            // Sleep for 1 minute before we try to locate the registry again.
            try {
                Thread.sleep(PING_WAIT_TIME); // sleep is static; no need for currentThread()
            } catch (InterruptedException x) {
                // Ignored: keep retrying.
            }
        } // End of loop that runs until we find the server again.
    }
    // Sleep for 1 minute after we ping the server before we try again.
    try {
        Thread.sleep(PING_WAIT_TIME);
    } catch (InterruptedException e) {
        // Ignored: keep pinging.
    }
} // End of endless loop that we never exit.
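Indiscriminately retrying on arbitrary exceptions is not a solution. You should retry on most, but not all, RemoteExceptions. Various other exceptions (such as NOE) indicate a bug in your code rather than a need to retry. I repeat: the client does not need to bind itself to the registry at all.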
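
The callback approach suggested in these comments might look roughly like the sketch below. The names `JobCallback`, `JobMonitor`, `JobMonitorImpl`, `JobImpl`, and `collectSummaries` are hypothetical, and a plain `String` stands in for the question's `JobStateSummary`. The only things taken from the thread are the idea of passing the client's remote interface to `registerClient` and the advice not to bind the client in a registry.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical remote interface of the Job: this is the callback. */
interface JobCallback extends Remote {
    String getJobStateSummary() throws RemoteException;
}

/** Hypothetical remote interface of the JobViewer: it receives the callback itself, not a bound name. */
interface JobMonitor extends Remote {
    void registerClient(JobCallback job) throws RemoteException;
}

class JobMonitorImpl implements JobMonitor {
    private final List<JobCallback> jobs = new CopyOnWriteArrayList<>();

    @Override
    public void registerClient(JobCallback job) {
        jobs.add(job);
    }

    /** Asks every registered job for its summary and forgets jobs that cannot be reached. */
    List<String> collectSummaries() {
        List<String> summaries = new ArrayList<>();
        for (JobCallback job : jobs) {
            try {
                summaries.add(job.getJobStateSummary());
            } catch (RemoteException unreachable) {
                jobs.remove(job);
            }
        }
        return summaries;
    }
}

class JobImpl implements JobCallback {
    private final String summary;
    private JobCallback exportedStub; // created on first connect, reused when reconnecting

    JobImpl(String summary) {
        this.summary = summary;
    }

    @Override
    public String getJobStateSummary() {
        return summary;
    }

    /** Looks up the monitor and hands it this job's stub; the job is never bound in a registry. */
    synchronized void connect(String host, int port, String monitorName)
            throws RemoteException, NotBoundException {
        if (exportedStub == null) {
            // Port 0 lets RMI choose an anonymous port for the callback object.
            exportedStub = (JobCallback) UnicastRemoteObject.exportObject(this, 0);
        }
        Registry registry = LocateRegistry.getRegistry(host, port);
        JobMonitor monitor = (JobMonitor) registry.lookup(monitorName);
        monitor.registerClient(exportedStub);
    }
}
```

In this sketch only the JobViewer is bound in the registry. After a JobViewer restart the Job would call `connect` again, which repeats the lookup and re-registers the same exported stub.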