Connection to a remote Apache Ignite cluster fails


I can successfully join, and stay joined to, a single-node Apache Ignite 2.8.1 topology that runs as a Docker container on a local Docker server.

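When I run exactly the same program against a remote Docker server, I can see my program join the cluster topology, but before the connection completes I get the following connection error: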

SEVERE: Failed to send message to remote node [node=TcpDiscoveryNode [id=a239f009-bddd-4a06-845f-abb304850849, consistentId=127.0.0.1,172.17.0.13:42002, addrs=ArrayList [127.0.0.1, 172.17.0.13], sockAddrs=HashSet [/172.17.0.13:42002, /127.0.0.1:42002], discPort=42002, order=1, intOrder=1, lastExchangeTime=1605015503009, loc=false, ver=2.8.1#20200521-sha1:86422096, isClient=false], msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtPartitionsSingleMessage [parts=null, partCntrs=null, partsSizes=null, partHistCntrs=null, err=null, client=true, exchangeStartTime=106333448635300, finishMsg=null, super=GridDhtPartitionsAbstractMessage [exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=dc9a3700-5377-4095-ac2b-31a2cea3d9a5, consistentId=dc9a3700-5377-4095-ac2b-31a2cea3d9a5, addrs=ArrayList [0:0:0:0:0:0:0:1, 10.91.7.30, 127.0.0.1, 192.168.1.81, 192.168.38.1], sockAddrs=HashSet [host.docker.internal/192.168.1.81:0, /0:0:0:0:0:0:0:1:0, GBLG7Y7GH2.mshome.net/192.168.38.1:0, /127.0.0.1:0, GBLG7Y7GH2.enterprisenet.org/10.91.7.30:0], discPort=0, order=2, intOrder=0, lastExchangeTime=1605015498538, loc=true, ver=2.8.1#20200521-sha1:86422096, isClient=true], topVer=2, nodeId8=dc9a3700, msg=null, type=NODE_JOINED, tstamp=1605015505481], nodeId=dc9a3700, evt=NODE_JOINED], lastVer=GridCacheVersion [topVer=0, order=1605015496511, nodeOrder=0], super=GridCacheMessage [msgId=1, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], err=null, skipPrepare=false]]]]]
class org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and cache Transaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=a239f009-bddd-4a06-845f-abb304850849, addrs=[/172.17.0.13:42003, /127.0.0.1:42003]]
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3738)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3458)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createCommunicationClient(TcpCommunicationSpi.java:3198)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3078)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2918)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2877)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2035)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2132)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendLocalPartitions(GridDhtPartitionsExchangeFuture.java:2020)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.clientOnlyExchange(GridDhtPartitionsExchangeFuture.java:1436)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:903)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3214)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3063)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and cache Transaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=a239f009-bddd-4a06-845f-abb304850849, addrs=[/172.17.0.13:42003, /127.0.0.1:42003]]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3740)
        ... 15 more
    Caused by: java.net.SocketTimeoutException
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:129)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3584)
        ... 15 more
Caused by: java.net.SocketTimeoutException
    at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:129)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioSession(TcpCommunicationSpi.java:3584)
    ... 15 more

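It looks to me as if the problem is related to the client connection settings, so I tried increasing the client discovery SPI's "joinTimeout", "networkTimeout" and "socketTimeout" settings, as well as the "connectionTimeout" and "socketWriteTimeout" settings, without success.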

Are ports 47500 and 45100 open in both directions between Docker and the remote node?
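One way to answer the port question from the client machine is a plain TCP connect with a timeout. The sketch below uses only the JDK. The class name `PortProbe`, the 3-second timeout and the default host are illustrative assumptions (203.0.113.10 is a documentation-only placeholder), and the default ports are the discovery and communication ports that appear in the log above (42002 and 42003). Note that the stack trace shows the client trying 172.17.0.13:42003 and 127.0.0.1:42003, so a port that is reachable on the Docker host's public address does not by itself mean that the address the server advertises is reachable.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Timeout, connection refused, unknown host and similar failures end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder host: pass the address the client should use as the first argument.
        String host = args.length > 0 ? args[0] : "203.0.113.10";
        int[] ports = {42002, 42003}; // discovery and communication ports from the log
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable=" + isReachable(host, port, 3000));
        }
    }
}
```

Running it once with the Docker host's public address and once with 172.17.0.13 should show whether the client can reach the container-internal address at all.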

You must set an AddressResolver for the node that runs inside the remote Docker container.

If you are using Spring configuration, your configuration should look like this:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Map the node's internal address to the address clients can reach -->
    <property name="addressResolver">
        <bean class="org.apache.ignite.configuration.BasicAddressResolver">
            <constructor-arg>
                <map>
                    <entry key="172.31.59.27" value="3.93.186.198"/>
                </map>
            </constructor-arg>
        </bean>
    </property>

    <!-- other properties -->

    <!-- Discovery configuration -->
</bean>


Here 172.31.59.27 is the internal IP and 3.93.186.198 is the external IP that you are connecting to.
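To make the effect of such a mapping concrete, here is a simplified, JDK-only model of a host-to-host mapping. It is not Ignite's implementation: the class `AddressMappingSketch` and the method `withExternal` are hypothetical names, and the assumption that the mapped external address is offered in addition to the internal one reflects my understanding of how the resolver is used rather than anything stated in this thread. Port 42003 is the communication port from the question.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class AddressMappingSketch {
    /**
     * Returns the internal addresses plus, for every internal host present in hostMap,
     * the same port on the mapped external host.
     */
    static List<InetSocketAddress> withExternal(List<InetSocketAddress> internal,
                                                Map<String, String> hostMap) {
        List<InetSocketAddress> result = new ArrayList<>(internal);
        for (InetSocketAddress addr : internal) {
            String external = hostMap.get(addr.getHostString());
            if (external != null) {
                // Unresolved addresses avoid DNS lookups; this is only an illustration.
                result.add(InetSocketAddress.createUnresolved(external, addr.getPort()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<InetSocketAddress> advertised = Collections.singletonList(
            InetSocketAddress.createUnresolved("172.31.59.27", 42003));
        Map<String, String> hostMap = Collections.singletonMap("172.31.59.27", "3.93.186.198");
        for (InetSocketAddress a : withExternal(advertised, hostMap)) {
            System.out.println(a.getHostString() + ":" + a.getPort());
        }
    }
}
```

In this model a client that cannot reach 172.31.59.27 still has 3.93.186.198 on the same port to try, which is the situation the Spring configuration above aims for.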

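It looks like the communication SPI cannot reach node a239f009-bddd-4a06-845f-abb304850849. What communication configuration do you have? Did you map the communication port correctly in Docker?

Yes, all ports are mapped on the remote Docker server. However, I think I have figured it out. This seems to be an impossible scenario, because my remote Docker server runs on a public VPS and the communication SPI cannot bind itself to my VPS's public IP. So the client discovers the cluster successfully, but then tries to talk to the communication SPI at the address it is bound to, which is not the public internet IP of the VPS that the Docker server runs on. Is my understanding correct?

I use port 42002 for discovery and port 42003 for communication. Both ports are reachable from the client machine. However, I fear that the communication SPI on the remote Docker server can only bind itself to its local addresses and not to the public IP address that the client can see. I fear this is why the same experiment works on a local Docker server.

The client discovers the cluster successfully, but then tries to talk to the communication SPI at its bound address, which is not the public internet IP of the remote Docker server, the only address visible to the client. Is there some Docker switch that changes this binding behaviour?

I knew it would not work, but I tried binding the Ignite Docker container's communication port to the Docker server's public IP anyway. It did not work. The remote Ignite client node negotiates discovery successfully, but then the remote Ignite server node appears to send back the coordinates of its communication port, and those are unreachable from the remote network.

I'm afraid all of that sounds correct and is in line with Ignite's design. I am not aware of any Docker setting that would allow this particular kind of binding :-)

I think the problem is in the Ignite comm protocol. You can set localAddress in the TcpCommunicationSpi configuration to specify the correct address.

Unfortunately, as I said, I did exactly that even though I knew it would not work. I used the Docker server's public internet IP and got a bind error. I increasingly get the feeling that the Ignite communication protocol is designed for private networks only.

Thanks, I had a feeling an AddressResolver was needed, but I did not know how to use it. I'll try this.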
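The bind error mentioned in the comments is what one would normally expect: a socket can only be bound to an address that is assigned to one of the machine's (or container's) own network interfaces, and a public IP that is mapped from outside is usually not one of them. The JDK-only sketch below illustrates this. `BindCheck` is a hypothetical name, and 192.0.2.1 is a documentation-only address standing in for the public IP.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

final class BindCheck {
    /** Returns true if a server socket can be bound to the given local address on an ephemeral port. */
    static boolean canBind(String address) {
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress(address, 0));
            return true;
        } catch (IOException e) {
            // Typically a java.net.BindException when no local interface owns the address.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("127.0.0.1 bindable: " + canBind("127.0.0.1"));
        // Stand-in for a public IP that is not assigned to any local interface.
        System.out.println("192.0.2.1 bindable: " + canBind("192.0.2.1"));
    }
}
```

This is why mapping the internal address to the external one, as in the AddressResolver answer, is a different approach from asking the node to bind to the public IP directly.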