Service Fabric actors or services randomly become inaccessible after upgrading to SDK 2.3.301
After upgrading from Service Fabric SDK 2.0.135 to 2.3.301, we started hitting situations where a Service Fabric actor or service becomes inaccessible even though Service Fabric Explorer shows it as healthy. Once it is in this state, any call to the actor or service through ActorProxy or ServiceProxy hangs for 5 minutes and then finally throws a TimeoutException. Once in this state, the actor or service never recovers on its own, even after being left alone for an hour. The only fixes are to reset the node the actor or service lives on, redeploy the actor or service (the exact same EXE), reset the whole cluster, or reboot all of the cluster machines.

It usually enters this state after the SF application has been deployed or redeployed.

In the last year of using Service Fabric (since SDK v1.3) we never ran into this problem. It only started after we moved to 2.3.301.

It appears to be random and inconsistent. Which of the 13 SF applications in our solution is affected is also random.

Does anyone have any ideas on how we can resolve this? It looks like a bug in the latest version of Service Fabric, but perhaps we are doing something wrong.

Any help is appreciated. Below is a lot of additional information that I hope will help in understanding what we are facing.

Thanks very much.

Steps

I really have no way to reproduce this problem consistently. This is what I sometimes observe:
- Visual Studio reports that the deployment succeeded
- Service Fabric Explorer shows everything as healthy
- Task Manager shows two running copies of the EXE
"exception": {
"ClassName": "System.TimeoutException",
"Message": "This can happen if message is dropped when service is busy or its long running operation and taking more time than configured Operation Timeout.",
"Data": null,
"InnerException": null,
"HelpURL": null,
"StackTraceString": " at Microsoft.ServiceFabric.Services.Communication.Client.ServicePartitionClient`1.<InvokeWithRetryAsync>d__7`1.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at Microsoft.ServiceFabric.Services.Remoting.Client.ServiceRemotingPartitionClient.<InvokeAsync>d__8.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at Microsoft.ServiceFabric.Services.Remoting.Builder.ProxyBase.<InvokeAsync>d__0.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at Microsoft.ServiceFabric.Services.Remoting.Builder.ProxyBase.<ContinueWithResult>d__7`1.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n at RenderingCachingEngine.RenderingCachingEngine.<Render>d__10.MoveNext() in C:\\Code\\Ink\\Dev\\Current\\Source\\Rendering Service Fabric\\RenderingCachingEngine\\RenderingCachingEngine.cs:line 381",
"RemoteStackTraceString": null,
"RemoteStackIndex": 0,
"ExceptionMethod": "8\nMoveNext\nMicrosoft.ServiceFabric.Services, Version=5.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35\nMicrosoft.ServiceFabric.Services.Communication.Client.ServicePartitionClient`1+<InvokeWithRetryAsync>d__7`1\nVoid MoveNext()",
"HResult": -2146233083,
"Source": "Microsoft.ServiceFabric.Services",
"WatsonBuckets": null
}
"serviceFabricInfo": {
"serviceFabricServiceName": "fabric:/Rendering/RenderingCachingEngine",
"serviceFabricServiceTypeName": "RenderingCachingEngineType",
"serviceFabricReplicaId": 131225099453058851,
"serviceFabricPartitionId": "e400087d-8a08-4dab-bcdd-1f5ce82f374f",
"serviceFabricApplicationName": "fabric:/Rendering",
"serviceFabricApplicationTypeName": "RenderingType",
"serviceFabricNodeName": "_Node_4"
}
Log Name: Microsoft-ServiceFabric/Admin
Source: Microsoft-ServiceFabric
Date: 11/2/2016 2:38:53 PM
Event ID: 256
Task Category: Common
Level: Error
Keywords: Default
User: NETWORK SERVICE
Computer: shayward10.ovx.local
Description:
WriteNode failed. HRESULT=-2147467259, Output=CustomOutput
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-ServiceFabric" Guid="{CBD93BC2-71E5-4566-B3A7-595D8EECA6E8}" />
<EventID>256</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>1</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000001</Keywords>
<TimeCreated SystemTime="2016-11-02T18:38:53.678587200Z" />
<EventRecordID>7620</EventRecordID>
<Correlation />
<Execution ProcessID="4440" ThreadID="7360" />
<Channel>Microsoft-ServiceFabric/Admin</Channel>
<Computer>shayward10.ovx.local</Computer>
<Security UserID="S-1-5-20" />
</System>
<EventData>
<Data Name="id">
</Data>
<Data Name="type">XmlLiteWriter</Data>
<Data Name="text">WriteNode failed. HRESULT=-2147467259, Output=CustomOutput</Data>
</EventData>
</Event>
Log Name: Microsoft-ServiceFabric/Admin
Source: Microsoft-ServiceFabric
Date: 11/2/2016 2:38:54 PM
Event ID: 23073
Task Category: Hosting
Level: Warning
Keywords: Default
User: SYSTEM
Computer: shayward10.ovx.local
Description:
ServiceHostProcess: DataBinding.exe for ApplicationId 805915c7-456c-49d3-af95-62cc44650664 terminated unexpectedly with exit code 3221225786 on node id bf865279ba277deb864a976fbf4c200e
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-ServiceFabric" Guid="{CBD93BC2-71E5-4566-B3A7-595D8EECA6E8}" />
<EventID>23073</EventID>
<Version>0</Version>
<Level>3</Level>
<Task>90</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000001</Keywords>
<TimeCreated SystemTime="2016-11-02T18:38:54.820567800Z" />
<EventRecordID>7621</EventRecordID>
<Correlation />
<Execution ProcessID="6944" ThreadID="3812" />
<Channel>Microsoft-ServiceFabric/Admin</Channel>
<Computer>shayward10.ovx.local</Computer>
<Security UserID="S-1-5-18" />
</System>
<EventData>
<Data Name="id">bf865279ba277deb864a976fbf4c200e</Data>
<Data Name="AppId">805915c7-456c-49d3-af95-62cc44650664</Data>
<Data Name="ReturnCode">3221225786</Data>
<Data Name="ProcessName">DataBinding.exe</Data>
</EventData>
</Event>
Log Name: Microsoft-ServiceFabric/Admin
Source: Microsoft-ServiceFabric
Date: 11/2/2016 2:38:56 PM
Event ID: 256
Task Category: Common
Level: Error
Keywords: Default
User: NETWORK SERVICE
Computer: shayward10.ovx.local
Description:
WriteNode failed. HRESULT=-2147467259, Output=CustomOutput
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-ServiceFabric" Guid="{CBD93BC2-71E5-4566-B3A7-595D8EECA6E8}" />
<EventID>256</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>1</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000001</Keywords>
<TimeCreated SystemTime="2016-11-02T18:38:56.261857600Z" />
<EventRecordID>7627</EventRecordID>
<Correlation />
<Execution ProcessID="4440" ThreadID="8564" />
<Channel>Microsoft-ServiceFabric/Admin</Channel>
<Computer>shayward10.ovx.local</Computer>
<Security UserID="S-1-5-20" />
</System>
<EventData>
<Data Name="id">
</Data>
<Data Name="type">XmlLiteWriter</Data>
<Data Name="text">WriteNode failed. HRESULT=-2147467259, Output=CustomOutput</Data>
</EventData>
</Event>
Log Name: Microsoft-ServiceFabric/Admin
Source: Microsoft-ServiceFabric
Date: 11/2/2016 2:44:55 PM
Event ID: 44289
Task Category: FabricTransport
Level: Warning
Keywords: Default
User: NETWORK SERVICE
Computer: shayward10.ovx.local
Description:
Error While Sending Message : FABRIC_E_TIMEOUT
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-ServiceFabric" Guid="{CBD93BC2-71E5-4566-B3A7-595D8EECA6E8}" />
<EventID>44289</EventID>
<Version>0</Version>
<Level>3</Level>
<Task>173</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000001</Keywords>
<TimeCreated SystemTime="2016-11-02T18:44:55.349048200Z" />
<EventRecordID>7629</EventRecordID>
<Correlation />
<Execution ProcessID="18600" ThreadID="8076" />
<Channel>Microsoft-ServiceFabric/Admin</Channel>
<Computer>shayward10.ovx.local</Computer>
<Security UserID="S-1-5-20" />
</System>
<EventData>
<Data Name="id">
</Data>
<Data Name="type">ServiceCommunicationClient</Data>
<Data Name="text">Error While Sending Message : FABRIC_E_TIMEOUT</Data>
</EventData>
</Event>
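The numeric codes in the exception and event log entries above are easier to search for in hexadecimal. The following is a minimal sketch that converts the three values from the logs; the `CodeDecoder` class and its method names are hypothetical helpers for illustration, not part of any Service Fabric API. The symbolic names in the comments are the standard Windows/.NET names for those values and say nothing by themselves about the root cause.

```csharp
using System;

public static class CodeDecoder
{
    // Reinterprets a signed 32-bit HRESULT as its unsigned hex form.
    public static string HResultToHex(int hresult)
    {
        return "0x" + unchecked((uint)hresult).ToString("X8");
    }

    // Formats an unsigned process exit code (NTSTATUS-style) as hex.
    public static string ExitCodeToHex(uint exitCode)
    {
        return "0x" + exitCode.ToString("X8");
    }

    public static void Main()
    {
        // HResult of the TimeoutException: 0x80131505 (COR_E_TIMEOUT)
        Console.WriteLine(HResultToHex(-2146233083));

        // HRESULT in "WriteNode failed": 0x80004005 (E_FAIL, unspecified failure)
        Console.WriteLine(HResultToHex(-2147467259));

        // Exit code of DataBinding.exe: 0xC000013A (STATUS_CONTROL_C_EXIT)
        Console.WriteLine(ExitCodeToHex(3221225786));
    }
}
```

STATUS_CONTROL_C_EXIT is the status a process reports when it is terminated by a Ctrl+C or console close signal, so event 23073 may simply reflect the host process being shut down during redeployment rather than a crash.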
<Endpoint Name="Actor1ActorServiceEndpoint" Port="0" />
[assembly: FabricTransportServiceRemotingProvider(OperationTimeoutInSeconds = 3600)]
[assembly: FabricTransportActorRemotingProvider(OperationTimeoutInSeconds = 3600)]
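While the root cause is unknown, one way to keep callers from blocking for the full remoting timeout is to bound the wait on the caller side. The following is a minimal sketch in plain .NET with no Service Fabric dependency; `CallGuard`, `WithTimeout`, and the simulated hung call are hypothetical names used only for illustration. It does not fix or cancel the underlying call, it only stops waiting for it.

```csharp
using System;
using System.Threading.Tasks;

public static class CallGuard
{
    // Waits for 'call' up to 'limit'; throws TimeoutException if the limit
    // elapses first. The underlying call is abandoned, not cancelled.
    public static async Task<T> WithTimeout<T>(Task<T> call, TimeSpan limit)
    {
        Task delay = Task.Delay(limit);
        Task first = await Task.WhenAny(call, delay).ConfigureAwait(false);
        if (first != call)
        {
            throw new TimeoutException("Call did not complete within " + limit);
        }
        // The call already finished; awaiting it returns the result
        // or rethrows its exception.
        return await call.ConfigureAwait(false);
    }

    private static async Task Run()
    {
        // Stand-in for a proxy call that never returns (assumption for the demo).
        Task<int> hung = new TaskCompletionSource<int>().Task;
        try
        {
            await WithTimeout(hung, TimeSpan.FromMilliseconds(200));
        }
        catch (TimeoutException ex)
        {
            Console.WriteLine("Gave up: " + ex.Message);
        }
    }

    public static void Main()
    {
        Run().GetAwaiter().GetResult();
    }
}
```

In real code the `hung` task would be replaced by the task returned from the actor or service proxy method. The guard is only useful for failing fast and logging which service stopped responding; the stuck replica still has to be recovered separately.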