Multithreaded C# application with SQL Server database calls


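I have a SQL Server database with 500,000 records in table main. There are also three other tables called child1, child2, and child3. The many-to-many relationships between child1, child2, child3, and main are implemented via three relationship tables: main_child1_relationship, main_child2_relationship, and main_child3_relationship. I need to read the records in main, update main, and also insert new rows into the relationship tables as well as new records into the child tables. The records in the child tables have uniqueness constraints, so the pseudocode for the actual calculation (CalculateDetails) would be something like: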

for each record in main
{
   find its child1 like qualities
   for each one of its child1 qualities
   {
      find the record in child1 that matches that quality
      if found
      {
          add a record to main_child1_relationship to connect the two records
      }
      else
      {
          create a new record in child1 for the quality mentioned
          add a record to main_child1_relationship to connect the two records
      }
   }
   ...repeat the above for child2
   ...repeat the above for child3 
}
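This works fine as a single-threaded app, but it is too slow. The processing in C# is pretty heavy and takes too long. I want to turn this into a multithreaded app.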

What is the best way to do this? We are using LINQ to SQL.

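So far my approach has been to create a new DataContext object for each batch of records from main and use ThreadPool.QueueUserWorkItem to process it. However, these batches are stepping on each other's toes: one thread adds a record, then the next thread tries to add the same one, and so on. I am getting all kinds of interesting SQL Server deadlocks.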

Here is the code:

    int skip = 0;
    List<int> thisBatch;
    Queue<List<int>> allBatches = new Queue<List<int>>();
    do
    {
        thisBatch = allIds
                .Skip(skip)
                .Take(numberOfRecordsToPullFromDBAtATime).ToList();
        allBatches.Enqueue(thisBatch);
        skip += numberOfRecordsToPullFromDBAtATime;

    } while (thisBatch.Count() > 0);

    while (allBatches.Count() > 0)
    {
        RRDataContext rrdc = new RRDataContext();

        var currentBatch = allBatches.Dequeue();
        lock (locker)  
        {
            runningTasks++;
        }
        System.Threading.ThreadPool.QueueUserWorkItem(x =>
                    ProcessBatch(currentBatch, rrdc));

        lock (locker) 
        {
            while (runningTasks > MAX_NUMBER_OF_THREADS)
            {
                 Monitor.Wait(locker);
                 UpdateGUI();
            }
        }
    }
And here is ProcessBatch:

    private static void ProcessBatch( 
        List<int> currentBatch, RRDataContext rrdc)
    {
        var topRecords = GetTopRecords(rrdc, currentBatch);
        CalculateDetails(rrdc, topRecords);
        rrdc.Dispose();

        lock (locker)
        {
            runningTasks--;
            Monitor.Pulse(locker);
        };
    }

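And here is GetTopRecords:

    private static List<Record> GetTopRecords(RecipeRelationshipsDataContext rrdc, 
                                              List<int> thisBatch)
    {
        List<Record> topRecords;

        topRecords = rrdc.Records
                    .Where(x => thisBatch.Contains(x.Id))
                    .OrderBy(x => x.OrderByMe).ToList();
        return topRecords;
    }
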
CalculateDetails is best explained by the pseudocode at the top.

I think there must be a better way to do this. Please help. Many thanks!

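Summary

The root of your problem is that the L2S DataContext, like the Entity Framework's ObjectContext, is not thread-safe. Support for asynchronous operations in the .NET ORM solutions is still pending as of .NET 4.0; you will have to roll your own solution, which, as you have discovered, is not always easy to do when the framework assumes single-threaded use.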

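I'll take this opportunity to note that L2S is built on top of ADO.NET, which itself fully supports asynchronous operation. Personally, I would prefer to deal with that lower layer directly and write the SQL myself, just to make sure I fully understand what is happening over the network.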

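SQL Server Solution?

That being said, I have to ask: must this be a C# solution? If you can compose your solution out of a set of insert/update statements, you can send the SQL over directly and your threading and performance problems vanish.* It seems to me that your problems are related not to the actual data transformations to be made, but to making them perform well from .NET. If .NET is removed from the equation, your task becomes simpler. After all, the best solution is often the one that has you writing the least code, right? ;)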

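Even if your update/insert logic cannot be expressed in a strictly set-based, relational manner, SQL Server does have a built-in mechanism for iterating over records and performing logic. Cursors are justly maligned for many use cases, but they may in fact be appropriate for your task.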

If this is a task that has to happen repeatedly, you could benefit greatly from coding it as a stored procedure.

*Of course, long-running SQL brings its own problems, such as lock escalation and index usage, that you will have to contend with.

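C# Solution

Of course, doing this in SQL may be out of the question. Your code's decisions may depend on data that comes from elsewhere, for example, or your project may have a strict "no SQL allowed" convention. You mention some typical multithreading bugs, but without seeing your code I can't really help with them specifically.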

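Doing this from C# is obviously viable, but you need to deal with the fact that a fixed amount of latency will exist for each and every call you make. You can mitigate the effects of network latency by using pooled connections, enabling multiple active result sets (MARS), and using the asynchronous Begin/End methods for executing your queries. Even with all of those, you will still have to accept that there is a cost to shipping data from SQL Server to your application.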
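Two of the mitigations above, connection pooling and multiple active result sets, are connection string settings. A minimal sketch of how they might be switched on with SqlConnectionStringBuilder; the ConnectionSettings class, the server and database arguments, and the pool size of 16 are assumptions for illustration, not values from the question:

```csharp
using System.Data.SqlClient;

public static class ConnectionSettings
{
    // 'server' and 'database' are placeholders supplied by the caller.
    public static string Build(string server, string database)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = server,
            InitialCatalog = database,
            IntegratedSecurity = true,
            Pooling = true,                  // reuse physical connections (enabled by default)
            MaxPoolSize = 16,                // hypothetical cap; tune to the number of workers
            MultipleActiveResultSets = true  // MARS: more than one open reader per connection
        };
        return builder.ConnectionString;
    }
}
```

On .NET Framework versions before 4.5, the Begin/End methods on SqlCommand additionally require `Asynchronous Processing=true` in the connection string.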
private IList<int> GetMainIds()
{
    using (var context = new MyDataContext())
        return context.Main.Select(m => m.Id).ToList();
}

private void FixUpSingleRecord(int mainRecordId)
{
    using (var localContext = new MyDataContext())
    {
        var main = localContext.Main.FirstOrDefault(m => m.Id == mainRecordId);

        if (main == null)
            return;

        foreach (var childOneQuality in main.ChildOneQualities)
        {
            // If child one is not found, create it
            // Create the relationship if needed
        }

        // Repeat for ChildTwo and ChildThree

        localContext.SaveChanges();
    }
}

public void FixUpMain()
{
    var ids = GetMainIds();
    foreach (var id in ids)
    {
        var localId = id; // Avoid closing over an iteration member
        ThreadPool.QueueUserWorkItem(delegate { FixUpSingleRecord(localId); });
    }
}
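FixUpMain returns as soon as the work items are queued, so the caller cannot tell when the run has finished, and an exception thrown on a pool thread would go unhandled. A minimal sketch of one way to wait for completion and collect failures, assuming .NET 4.0's CountdownEvent is available; WorkQueue, RunAll and processOne are hypothetical names, with processOne standing in for FixUpSingleRecord:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class WorkQueue
{
    // Queues one work item per id, blocks until all have finished,
    // and returns the exceptions thrown by failed items (empty if none).
    public static IList<Exception> RunAll(IList<int> ids, Action<int> processOne)
    {
        var errors = new ConcurrentQueue<Exception>();

        // A CountdownEvent created with 0 is already signaled, so an empty list is fine.
        using (var done = new CountdownEvent(ids.Count))
        {
            foreach (var id in ids)
            {
                var localId = id; // capture a per-iteration copy
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try { processOne(localId); }
                    catch (Exception e) { errors.Enqueue(e); } // keep the pool thread alive
                    finally { done.Signal(); }                 // count the item as finished
                });
            }

            done.Wait();
        }

        return errors.ToArray();
    }
}
```

This sketch does not throttle anything itself; the thread pool decides how many items run at once.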
BEGIN TRAN
DECLARE @mutex_result int;
EXEC @mutex_result = sp_getapplock @Resource = 'CheckSetFileTransferLock',
 @LockMode = 'Exclusive';

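IF (@mutex_result < 0)
BEGIN
    -- A negative result means the lock was not granted; stop here
    -- instead of falling through to the protected section.
    ROLLBACK TRAN
    RETURN
END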

-- do some stuff

EXEC @mutex_result = sp_releaseapplock @Resource = 'CheckSetFileTransferLock'
COMMIT TRAN  
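The application lock above serializes a critical section on the server. Within a single process, the "find the child record, else create it" race from the question could also be narrowed before it reaches the database. A minimal sketch of that idea, which is a swapped-in technique rather than anything from the original answers: ChildIdCache and the findOrInsert callback are hypothetical, and the callback stands in for the real lookup-or-insert against a child table.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Per-process cache of child ids, keyed by the child's unique quality.
public sealed class ChildIdCache
{
    private readonly ConcurrentDictionary<string, Lazy<int>> _ids =
        new ConcurrentDictionary<string, Lazy<int>>(StringComparer.OrdinalIgnoreCase);

    private readonly Func<string, int> _findOrInsert;

    // 'findOrInsert' is a hypothetical callback that returns the id of the
    // child row for a quality, inserting the row first if it does not exist.
    public ChildIdCache(Func<string, int> findOrInsert)
    {
        _findOrInsert = findOrInsert;
    }

    public int GetId(string quality)
    {
        // GetOrAdd may build several Lazy objects under contention, but only one
        // is stored, and ExecutionAndPublication runs its callback exactly once.
        var lazy = _ids.GetOrAdd(quality,
            q => new Lazy<int>(() => _findOrInsert(q),
                               LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value;
    }
}
```

This only protects threads inside one process. The unique constraint in the database remains the final guard, and a callback that throws has its exception cached by the Lazy for that key.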
using (var dc = new TestDataContext())
{
    // Get all the ids of interest.
    // I assume you mark successfully updated rows in some way
    // in the update transaction.
    List<int> ids = dc.TestItems.Where(...).Select(item => item.Id).ToList();

    var problematicIds = new List<ErrorType>();

    // Either allow the TaskParallel library to select what it considers
    // as the optimum degree of parallelism by omitting the 
    // ParallelOptions parameter, or specify what you want.
    Parallel.ForEach(ids, new ParallelOptions {MaxDegreeOfParallelism = 8},
                        id => CalculateDetails(id, problematicIds));
}
private static void CalculateDetails(int id, List<ErrorType> problematicIds)
{
    try
    {
        // Handle deadlocks
        DeadlockRetryHelper.Execute(() => CalculateDetails(id));
    }
    catch (Exception e)
    {
        // Too many deadlock retries (or other exception). 
        // Record so we can diagnose problem or retry later
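        // List<T> is not thread-safe and Parallel.ForEach calls this
        // method concurrently, so guard the shared list.
        lock (problematicIds)
        {
            problematicIds.Add(new ErrorType(id, e));
        }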
    }
}
private static void CalculateDetails(int id)
{
    // Creating a new DataContext is not expensive.
    // No need to create outside of this method.
    using (var dc = new TestDataContext())
    {
        // TODO: adjust IsolationLevel to minimize deadlocks
        // If you don't need to change the isolation level 
        // then you can remove the TransactionScope altogether
        using (var scope = new TransactionScope(
            TransactionScopeOption.Required,
            new TransactionOptions {IsolationLevel = IsolationLevel.Serializable}))
        {
            TestItem item = dc.TestItems.Single(i => i.Id == id);

            // work done here

            dc.SubmitChanges();
            scope.Complete();
        }
    }
}
public static class DeadlockRetryHelper
{
    private const int MaxRetries = 4;
    private const int SqlDeadlock = 1205;

    public static void Execute(Action action, int maxRetries = MaxRetries)
    {
        if (HasAmbientTransaction())
        {
            // Deadlock blows out containing transaction
            // so no point retrying if already in tx.
            action();
            return; // do not fall through and run the action a second time
        }

        int retries = 0;

        while (retries < maxRetries)
        {
            try
            {
                action();
                return;
            }
            catch (Exception e)
            {
                if (IsSqlDeadlock(e))
                {
                    retries++;
                    // Delay subsequent retries - not sure if this helps or not
                    Thread.Sleep(100 * retries);
                }
                else
                {
                    throw;
                }
            }
        }

        action();
    }

    private static bool HasAmbientTransaction()
    {
        return Transaction.Current != null;
    }

    private static bool IsSqlDeadlock(Exception exception)
    {
        if (exception == null)
        {
            return false;
        }

        var sqlException = exception as SqlException;

        if (sqlException != null && sqlException.Number == SqlDeadlock)
        {
            return true;
        }

        if (exception.InnerException != null)
        {
            return IsSqlDeadlock(exception.InnerException);
        }

        return false;
    }
}
CREATE TABLE closet (id int PRIMARY KEY, xmldoc ntext) 
CREATE TABLE shoe(id int PRIMARY KEY IDENTITY, color nvarchar(20))
CREATE TABLE closet_shoe_relationship (
    closet_id int REFERENCES closet(id),
    shoe_id int REFERENCES shoe(id)
)
INSERT INTO closet(id, xmldoc) VALUES (1, '<ROOT><shoe><color>blue</color></shoe></ROOT>')
INSERT INTO closet(id, xmldoc) VALUES (2, '<ROOT><shoe><color>red</color></shoe></ROOT>')
INSERT INTO shoe(color) SELECT DISTINCT CAST(CAST(xmldoc AS xml).query('//shoe/color/text()') AS nvarchar) AS color from closet
INSERT INTO closet_shoe_relationship(closet_id, shoe_id) SELECT closet.id, shoe.id FROM shoe JOIN closet ON CAST(CAST(closet.xmldoc AS xml).query('//shoe/color/text()') AS nvarchar) = shoe.color
-- If xmldoc is declared as xml rather than ntext, the inner CAST is not needed:
INSERT INTO shoe(color)
    SELECT DISTINCT CAST(xmldoc.query('//shoe/color/text()') AS nvarchar)
    FROM closet
INSERT INTO closet_shoe_relationship(closet_id, shoe_id)
    SELECT closet.id, shoe.id
    FROM shoe JOIN closet
        ON CAST(xmldoc.query('//shoe/color/text()') AS nvarchar) = shoe.color