Know when to retry or fail when calling SQL Server from C#?

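I have a C# application that fetches data from a SQL Server hosted in a somewhat flaky environment. There is nothing I can do to address the environmental issues, so I need to handle them as gracefully as possible.

To do so, I want to retry operations that fail because of infrastructure problems, such as network glitches, SQL Server going offline while it is being rebooted, or query timeouts. At the same time, I don't want to retry queries that fail because of logical errors. Those should simply bubble the exception up to the client.

My question is this: what is the best way to distinguish between environmental problems (lost connections, timeouts) and other kinds of exceptions (such as logical errors that would have happened even in a stable environment)?

Is there a commonly used pattern in C# for dealing with this? For example, is there a property I can check on the SqlConnection object to detect failed connections? If not, what is the best way to approach the problem?

For what it is worth, my code isn't anything special: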

using (SqlConnection connection = new SqlConnection(myConnectionString))
using (SqlCommand command = connection.CreateCommand())
{
  command.CommandText = mySelectCommand;
  connection.Open();

  using (SqlDataReader reader = command.ExecuteReader())
  {
    while (reader.Read())
    {
      // Do something with the returned data.
    }
  }
}

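I don't know of any standard, but here is a list of SQL Server exceptions that I have generally regarded as retriable, with some DTC-related ones added as well: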

catch (SqlException sqlEx)
{
    canRetry = ((sqlEx.Number == 1205) // 1205 = Deadlock
        || (sqlEx.Number == -2) // -2 = TimeOut
        || (sqlEx.Number == 3989) // 3989 = New request is not allowed to start because it should come with valid transaction descriptor
        || (sqlEx.Number == 3965) // 3965 = The PROMOTE TRANSACTION request failed because there is no local transaction active.
        || (sqlEx.Number == 3919) // 3919 = Cannot enlist in the transaction because the transaction has already been committed or rolled back
        || (sqlEx.Number == 3903)); // 3903 = The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION.
}
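As for the retries themselves, it is advisable to add a random-ish delay between attempts, to reduce the chance that the same two transactions deadlock again.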
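
As a minimal sketch of that random-ish delay, exponential backoff with jitter is one common way to do it. The `RetryDelay` class, its parameter names, and the default values are illustrative assumptions, not part of the answer above:

```csharp
using System;

public static class RetryDelay
{
    // Exponential backoff with random jitter: the base delay doubles on each
    // attempt (capped at maxDelayMs), then a random extra of up to 50% is
    // added so that two competing callers are unlikely to retry in lockstep.
    public static TimeSpan ForAttempt(int attempt, Random random,
        int baseDelayMs = 100, int maxDelayMs = 5000)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));
        if (random == null) throw new ArgumentNullException(nameof(random));

        // Clamp the exponent so the shift cannot overflow.
        int exponent = Math.Min(attempt - 1, 20);
        long backoff = Math.Min((long)baseDelayMs << exponent, maxDelayMs);

        // random.NextDouble() returns a value in [0.0, 1.0).
        long jitter = (long)(backoff * 0.5 * random.NextDouble());
        return TimeSpan.FromMilliseconds(backoff + jitter);
    }
}
```

Inside the catch block, a caller could then wait with `Thread.Sleep(RetryDelay.ForAttempt(tries, random))`, or with `await Task.Delay(...)` in async code, before the next attempt.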


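For some of the DTC-related errors you may need to discard the connection (or, in the worst case, call SqlClient.SqlConnection.ClearAllPools()); otherwise a dud connection is returned to the pool.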

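In the spirit of separation of concerns, I picture three logical layers in this case:

  • The application layer, which calls the "flaky dependency handler" layer
  • The "flaky dependency handler" layer, which calls the data access layer
  • The data access layer, which knows nothing of flakiness

All of the retry logic lives in the handler layer, so that the data access layer isn't polluted with anything other than talking to the database. (Your data access code therefore doesn't need to change, and if it does have to change for new functionality, it doesn't need to worry about flakiness.)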

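    A retry pattern can be built around catching specific exceptions inside a counter loop. (The counter is only there to prevent retrying indefinitely.) Something like this: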

    public SomeReturnValue GetSomeData(SomeIdentifier someIdentifier)
    {
        var tries = 0;
        while (tries < someConfiguredMaximum)
        {
            try
            {
                tries++;
                return someDataAccessObject.GetSomeData(someIdentifier);
            }
            catch (SqlException e)
            {
                someLogger.LogError(e);
                // maybe wait for some number of milliseconds?  make the method async if possible
            }
        }
        throw new CustomException("Maximum number of tries has been reached.");
    }
    
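    This loops a configured number of times, retrying until the call either works or reaches the maximum. After that maximum, a custom exception is thrown for the application to handle. You can fine-tune the exception handling further by inspecting the specific SqlException that was caught: depending on the error message, you may want to continue the loop or throw a CustomException.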


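    You can refine this logic further by catching other exception types, examining them, and so on. The main point is that this responsibility is isolated in one specific logical layer of the application and is as transparent as possible to the other layers. Ideally, the handler layer and the data access layer implement the same interface. That way, if you move the code to a more stable environment and no longer need the handler layer, removing it is trivial and requires no changes to the application layer.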
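
One way to picture the shared-interface idea is a decorator. The sketch below is only an illustration: `ICustomerStore`, `RetryingCustomerStore`, and `FlakyCustomerStore` are hypothetical names, and the `isTransient` predicate stands in for whatever SqlException inspection you settle on.

```csharp
using System;

// Hypothetical contract shared by the data access layer and the handler layer.
public interface ICustomerStore
{
    string GetCustomerName(int id);
}

// Handler layer: wraps any ICustomerStore and retries failures that the
// supplied predicate classifies as transient. Everything else propagates.
public sealed class RetryingCustomerStore : ICustomerStore
{
    private readonly ICustomerStore inner;
    private readonly Func<Exception, bool> isTransient;
    private readonly int maxTries;

    public RetryingCustomerStore(ICustomerStore inner,
        Func<Exception, bool> isTransient, int maxTries)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
        this.isTransient = isTransient ?? throw new ArgumentNullException(nameof(isTransient));
        if (maxTries < 1) throw new ArgumentOutOfRangeException(nameof(maxTries));
        this.maxTries = maxTries;
    }

    public string GetCustomerName(int id)
    {
        for (int tries = 1; ; tries++)
        {
            try
            {
                return inner.GetCustomerName(id);
            }
            // The filter lets non-transient errors and the final failed
            // attempt escape to the caller with their original stack trace.
            catch (Exception e) when (tries < maxTries && isTransient(e))
            {
                // Log and optionally wait here before the next attempt.
            }
        }
    }
}

// Test double for illustration: fails a fixed number of times, then succeeds.
public sealed class FlakyCustomerStore : ICustomerStore
{
    private int failuresLeft;
    public int Calls { get; private set; }

    public FlakyCustomerStore(int failures) { failuresLeft = failures; }

    public string GetCustomerName(int id)
    {
        Calls++;
        if (failuresLeft-- > 0) throw new TimeoutException("simulated outage");
        return "customer-" + id;
    }
}
```

The application layer only sees `ICustomerStore`, so replacing `new RetryingCustomerStore(realStore, isTransient, 3)` with `realStore` is a one-line change wherever the objects are wired together.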

    A SqlException (may) wrap multiple SQL Server errors. You can iterate over them with the Errors property. Each error is a SqlError:

    foreach (SqlError error in exception.Errors)
    
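    Each SqlError has a Class property that you can use to roughly determine whether you can retry (and, if you retry, whether you have to recreate the connection too). From the documentation: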

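    • Class < 10 is for errors in the information you passed, so you (probably) can't retry unless you first correct the input
    • Class from 11 to 16 is "generated by the user", so again there is probably nothing you can do unless the user first corrects the input. Note that class 16 includes many temporary errors and class 13 is for deadlocks (thanks to EvZ), so you may want to exempt these classes from the no-retry rule if you handle them one by one
    • Class from 17 to 24 is for generic hardware/software errors, and you may retry. When Class is 20 or higher you have to recreate the connection too. 22 and 23 may be serious hardware/software errors, and 24 indicates a media error (something the user should be warned about, but you may retry in case it was just a "temporary" error)

    A more detailed description of each class is given in the SQL Server documentation on error severities.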
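
The severity ranges above could be turned into a coarse decision helper along these lines. This is a sketch under assumptions: `RetryAdvice` and `SeverityAdvisor` are made-up names, and the method takes the raw `byte` value of `SqlError.Class` so that it does not depend on SqlClient.

```csharp
public enum RetryAdvice
{
    DoNotRetry,            // fix the input or the query first
    Retry,                 // retry on the same connection
    RetryOnNewConnection   // the connection is unusable; open a new one first
}

public static class SeverityAdvisor
{
    // Maps a SqlError.Class value (the severity) to a coarse retry decision,
    // following the ranges described above. Severity 13 is treated as
    // retryable because it covers deadlocks; 16 is left out here because it
    // mixes transient and permanent errors and is better handled by number.
    public static RetryAdvice ForSeverity(byte severity)
    {
        if (severity >= 20) return RetryAdvice.RetryOnNewConnection;
        if (severity >= 17 || severity == 13) return RetryAdvice.Retry;
        return RetryAdvice.DoNotRetry;
    }
}
```

A stricter policy might refuse to retry severities 22 and 23 at all; the mapping here simply follows the "17 to 24 may be retried" rule of thumb.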

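    In general, if you handle errors by their class you don't need to know each error exactly (via the error.Number property, or exception.Number, which is just a shortcut to the first SqlError in the list). The drawback is that you may retry when it is useless (or when the error can't be recovered). I'd suggest a two-step approach: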

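    • Check for known error codes (list them with SELECT * FROM master.sys.messages) to see what you want to handle (and know how to handle). That view contains messages in all supported languages, so you may need to filter them by the language column, language_id in sys.messages or msglangid in the legacy sysmessages view (for e…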
      private static readonly int[] RetriableClasses = { 13, 16, 17, 18, 19, 20, 21, 22, 24 };
      
      private static bool CanRetry(SqlError error) {
          // Use this switch if you want to handle only well-known errors,
          // remove it if you want to always retry. A "blacklist" approach may
          // also work: return false when you're sure you can't recover from one
          // error and rely on Class for anything else.
          switch (error.Number) {
              // Handle well-known error codes, 
          }
      
          // Fall back on severity for unknown errors. RetriableClasses covers
          // 13 (deadlocks), 16 and 17 to 22; 23 is left out as a serious error
          // that needs to be fixed manually. 24 indicates media errors: they
          // are serious (and should also be reported) but we may retry...
          return RetriableClasses.Contains(error.Class); // LINQ...
      }
      
      public static void Try(
          Func<SqlConnection> connectionFactory,
          Action<SqlCommand> performer);
      
      Try(
          () => new SqlConnection(connectionString),
          cmd => {
                   cmd.CommandText = "SELECT * FROM master.sys.messages";
                   using (var reader = cmd.ExecuteReader()) {
                       // Do stuff
               }
          });
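
The body of `Try` is not shown above. A minimal sketch of how such a helper might work follows. It assumes a generic resource type instead of `SqlConnection`/`SqlCommand` so the example stays self-contained; the `Retry` class, the `canRetry` predicate, and the `maxAttempts` parameter are assumptions, not the original author's implementation.

```csharp
using System;

public static class Retry
{
    // Runs performer against a fresh resource from factory on every attempt,
    // so a connection broken by one failure is never reused for the next.
    public static void Try<TResource>(
        Func<TResource> factory,
        Action<TResource> performer,
        Func<Exception, bool> canRetry,
        int maxAttempts = 3) where TResource : IDisposable
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                using (TResource resource = factory())
                {
                    performer(resource);
                    return;
                }
            }
            // Non-retryable errors and the last failed attempt are rethrown
            // to the caller unchanged.
            catch (Exception e) when (attempt < maxAttempts && canRetry(e))
            {
                // Transient failure: the using block has already disposed the
                // resource, so the next iteration builds a new one.
            }
        }
    }
}
```

With SqlClient, `factory` would return a new `SqlConnection`, `performer` would open it and run the command, and `canRetry` would inspect the `SqlException` (for example with a check like `CanRetry` above).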
      
      /// <summary>
      /// Helps to extract useful information from SQLExceptions, particularly in SQL Azure
      /// </summary>
      public class SqlExceptionDetails
      {
          public ResourcesThrottled SeriouslyExceededResources { get; private set; }
          public ResourcesThrottled SlightlyExceededResources { get; private set; }
          public OperationsThrottled OperationsThrottled { get; private set; }
          public IList<SqlErrorCode> Errors { get; private set; }
          public string ThrottlingMessage { get; private set; }
      
          public bool ShouldRetry { get; private set; }
          public bool ShouldRetryImmediately { get; private set; }
      
          private SqlExceptionDetails()
          {
              this.ShouldRetryImmediately = false;
              this.ShouldRetry = true;
              this.SeriouslyExceededResources = ResourcesThrottled.None;
              this.SlightlyExceededResources = ResourcesThrottled.None;
              this.OperationsThrottled = OperationsThrottled.None;
              Errors = new List<SqlErrorCode>();
          }
      
          public SqlExceptionDetails(SqlException exception) :this(exception.Errors.Cast<SqlError>())
          {
          }
      
          public SqlExceptionDetails(IEnumerable<SqlError> errors) : this()
          {
              List<ISqlError> errorWrappers = (from err in errors
                                               select new SqlErrorWrapper(err)).Cast<ISqlError>().ToList();
              this.ParseErrors(errorWrappers);
          }
      
          public SqlExceptionDetails(IEnumerable<ISqlError> errors) : this()
          {
              ParseErrors(errors);
          }
      
          private void ParseErrors(IEnumerable<ISqlError> errors)
          {
              foreach (ISqlError error in errors)
              {
                  SqlErrorCode code = GetSqlErrorCodeFromInt(error.Number);
                  this.Errors.Add(code);
      
                  switch (code)
                  {
                      case SqlErrorCode.ServerBusy:
                          ParseServerBusyError(error);
                          break;
                      case SqlErrorCode.ConnectionFailed:
                          //This is a very non-specific error, can happen for almost any reason
                          //so we can't make any conclusions from it
                          break;
                      case SqlErrorCode.DatabaseUnavailable:
                          ShouldRetryImmediately = false;
                          break;
                      case SqlErrorCode.EncryptionNotSupported:
                          //this error code is sometimes sent by the client when it shouldn't be
                          //Therefore we need to retry it, even though it seems this problem wouldn't fix itself
                          ShouldRetry = true;
                          ShouldRetryImmediately = true;
                          break;
                      case SqlErrorCode.DatabaseWorkerThreadThrottling:
                      case SqlErrorCode.ServerWorkerThreadThrottling:
                          ShouldRetry = true;
                          ShouldRetryImmediately = false;
                          break;
      
      
                      //The following errors are probably not going to be resolved in 10 seconds
                      //They're mostly related to poor query design, broken DB configuration, or too much data
                      case SqlErrorCode.ExceededDatabaseSizeQuota:
                      case SqlErrorCode.TransactionRanTooLong:
                      case SqlErrorCode.TooManyLocks:
                      case SqlErrorCode.ExcessiveTempDBUsage:
                      case SqlErrorCode.ExcessiveMemoryUsage:
                      case SqlErrorCode.ExcessiveTransactionLogUsage:
                      case SqlErrorCode.BlockedByFirewall:
                      case SqlErrorCode.TooManyFirewallRules:
                      case SqlErrorCode.CannotOpenServer:
                      case SqlErrorCode.LoginFailed:
                      case SqlErrorCode.FeatureNotSupported:
                      case SqlErrorCode.StoredProcedureNotFound:
                      case SqlErrorCode.StringOrBinaryDataWouldBeTruncated:
                          this.ShouldRetry = false;
                          break;
                  }
              }
      
              if (this.ShouldRetry && Errors.Count == 1)
              {
                  SqlErrorCode code = this.Errors[0];
                  if (code == SqlErrorCode.TransientServerError)
                  {
                      this.ShouldRetryImmediately = true;
                  }
              }
      
              if (IsResourceThrottled(ResourcesThrottled.Quota) ||
                  IsResourceThrottled(ResourcesThrottled.Disabled))
              {
                  this.ShouldRetry = false;
              }
      
              if (!this.ShouldRetry)
              {
                  this.ShouldRetryImmediately = false;
              }
      
              SetThrottlingMessage();
          }
      
          private void SetThrottlingMessage()
          {
              if (OperationsThrottled == Sql.OperationsThrottled.None)
              {
                  ThrottlingMessage = "No throttling";
              }
              else
              {
                  string opsThrottled = OperationsThrottled.ToString();
                  string seriousExceeded = SeriouslyExceededResources.ToString();
                  string slightlyExceeded = SlightlyExceededResources.ToString();
      
                  ThrottlingMessage = "SQL Server throttling encountered. Operations throttled: " + opsThrottled
                              + ", Resources Seriously Exceeded: " + seriousExceeded
                              + ", Resources Slightly Exceeded: " + slightlyExceeded;
              }
          }
      
          private bool IsResourceThrottled(ResourcesThrottled resource)
          {
              return ((this.SeriouslyExceededResources & resource) > 0 ||
                      (this.SlightlyExceededResources & resource) > 0);
          }
      
          private SqlErrorCode GetSqlErrorCodeFromInt(int p)
          {
              switch (p)
              {
                  case 40014:
                  case 40054:
                  case 40133:
                  case 40506:
                  case 40507:
                  case 40508:
                  case 40512:
                  case 40516:
                  case 40520:
                  case 40521:
                  case 40522:
                  case 40523:
                  case 40524:
                  case 40525:
                  case 40526:
                  case 40527:
                  case 40528:
                  case 40606:
                  case 40607:
                  case 40636:
                      return SqlErrorCode.FeatureNotSupported;
              }
      
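              // A cast from int to an enum never throws in C#, so check
              // explicitly whether the value is a named member.
              return Enum.IsDefined(typeof(SqlErrorCode), p)
                  ? (SqlErrorCode)p
                  : SqlErrorCode.Unknown;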
          }
      
          /// <summary>
          /// Parse out the reason code from a ServerBusy error. 
          /// </summary>
          /// <remarks>Basic idea extracted from http://msdn.microsoft.com/en-us/library/gg491230.aspx
          /// </remarks>
          /// <param name="error"></param>
          private void ParseServerBusyError(ISqlError error)
          {
              int idx = error.Message.LastIndexOf("Code:");
              if (idx < 0)
              {
                  return;
              }
      
              string reasonCodeString = error.Message.Substring(idx + "Code:".Length);
              int reasonCode;
              if (!int.TryParse(reasonCodeString, out reasonCode))
              {
                  return;
              }
      
              int opsThrottledInt = (reasonCode & 3);
              this.OperationsThrottled = (OperationsThrottled)(Math.Max((int)OperationsThrottled, opsThrottledInt));
      
      
              int slightResourcesMask = reasonCode >> 8;
              int seriousResourcesMask = reasonCode >> 16;
              foreach (ResourcesThrottled resourceType in Enum.GetValues(typeof(ResourcesThrottled)))
              {
                  if ((seriousResourcesMask & (int)resourceType) > 0)
                  {
                      this.SeriouslyExceededResources |= resourceType;
                  }
                  if ((slightResourcesMask & (int)resourceType) > 0)
                  {
                      this.SlightlyExceededResources |= resourceType;
                  }
              }
          }
      }
      
      public interface ISqlError
      {
          int Number { get; }
          string Message { get; }
      }
      
      public class SqlErrorWrapper : ISqlError
      {
          public SqlErrorWrapper(SqlError error)
          {
              this.Number = error.Number;
              this.Message = error.Message;
          }
      
          public SqlErrorWrapper()
          {
          }
      
          public int Number { get; set; }
          public string Message { get; set; }
      }
      
      /// <summary>
      /// Documents some of the ErrorCodes from SQL/SQL Azure. 
      /// I have not included all possible errors, only the ones I thought useful for modifying runtime behaviors
      /// </summary>
      /// <remarks>
      /// Comments come from: http://social.technet.microsoft.com/wiki/contents/articles/sql-azure-connection-management-in-sql-azure.aspx
      /// </remarks>
      public enum SqlErrorCode : int
      {
          /// <summary>
          /// We don't recognize the error code returned
          /// </summary>
          Unknown = 0,
      
          /// <summary>
          /// A SQL feature/function used in the query is not supported. You must fix the query before it will work.
          /// This is a rollup of many more-specific SQL errors
          /// </summary>
          FeatureNotSupported = 1,
      
          /// <summary>
          /// Probable cause is server maintenance/upgrade. Retry connection immediately.
          /// </summary>
          TransientServerError = 40197,
      
          /// <summary>
          /// The server is throttling one or more resources. Reasons may be available from other properties
          /// </summary>
          ServerBusy = 40501,
      
          /// <summary>
          /// You have reached the per-database cap on worker threads. Investigate long running transactions and reduce server load. 
          /// http://social.technet.microsoft.com/wiki/contents/articles/1541.windows-azure-sql-database-connection-management.aspx#Throttling_Limits
          /// </summary>
          DatabaseWorkerThreadThrottling = 10928,
      
          /// <summary>
          /// The per-server worker thread cap has been reached. This may be partially due to load from other databases in a shared hosting environment (eg, SQL Azure).
          /// You may be able to alleviate the problem by reducing long running transactions.
          /// http://social.technet.microsoft.com/wiki/contents/articles/1541.windows-azure-sql-database-connection-management.aspx#Throttling_Limits
          /// </summary>
          ServerWorkerThreadThrottling = 10929,
      
          ExcessiveMemoryUsage = 40553,
      
          BlockedByFirewall = 40615,
      
          /// <summary>
          /// The database has reached the maximum size configured in SQL Azure
          /// </summary>
          ExceededDatabaseSizeQuota = 40544,
      
          /// <summary>
          /// A transaction ran for too long. This timeout seems to be 24 hours.
          /// </summary>
          /// <remarks>
          /// 24 hour limit taken from http://social.technet.microsoft.com/wiki/contents/articles/sql-azure-connection-management-in-sql-azure.aspx
          /// </remarks>
          TransactionRanTooLong = 40549,
      
          TooManyLocks = 40550,
      
          ExcessiveTempDBUsage = 40551,
      
          ExcessiveTransactionLogUsage = 40552,
      
          DatabaseUnavailable = 40613,
      
          CannotOpenServer = 40532,
      
          /// <summary>
          /// SQL Azure databases can have at most 128 firewall rules defined
          /// </summary>
          TooManyFirewallRules = 40611,
      
          /// <summary>
          /// Theoretically means the DB doesn't support encryption. However, this can be indicated incorrectly due to an error in the client library. 
          /// Therefore, even though this seems like an error that won't fix itself, it's actually a retryable error.
          /// </summary>
          /// <remarks>
          /// http://social.msdn.microsoft.com/Forums/en/ssdsgetstarted/thread/e7cbe094-5b55-4b4a-8975-162d899f1d52
          /// </remarks>
          EncryptionNotSupported = 20,
      
          /// <summary>
          /// User failed to connect to the database. This is probably not recoverable.
          /// </summary>
          /// <remarks>
          /// Some good info on more-specific debugging: http://blogs.msdn.com/b/sql_protocols/archive/2006/02/21/536201.aspx
          /// </remarks>
          LoginFailed = 18456,
      
          /// <summary>
          /// Failed to connect to the database. Could be due to configuration issues, network issues, bad login... hard to tell
          /// </summary>
          ConnectionFailed = 4060,
      
          /// <summary>
          /// Client tried to call a stored procedure that doesn't exist
          /// </summary>
          StoredProcedureNotFound = 2812,
      
          /// <summary>
          /// The data supplied is too large for the column
          /// </summary>
          StringOrBinaryDataWouldBeTruncated = 8152
      }