Multithreaded C# application with SQL Server database calls
I have a SQL Server database with 500,000 records in a table called main. There are three other tables, called child1, child2, and child3. The many-to-many relationships between child1, child2, child3 and main are implemented through three relationship tables: main_child1_relationship, main_child2_relationship, and main_child3_relationship. I need to read the records in main, update main, insert new rows into the relationship tables, and insert new records into the child tables. The records in the child tables have uniqueness constraints, so the pseudocode for the real calculation (CalculateDetails) looks something like this:
for each record in main
{
    find its child1 like qualities
    for each one of its child1 qualities
    {
        find the record in child1 that matches that quality
        if found
        {
            add a record to main_child1_relationship to connect the two records
        }
        else
        {
            create a new record in child1 for the quality mentioned
            add a record to main_child1_relationship to connect the two records
        }
    }
    ...repeat the above for child2
    ...repeat the above for child3
}
This works as a single-threaded application, but it is too slow. The processing in C# is quite heavy and takes too long, so I want to turn it into a multithreaded application.
What is the best way to do this? We are using LINQ to SQL.
So far my approach has been to create a new DataContext object for each batch of records from main and to use ThreadPool.QueueUserWorkItem to process it. However, these batches are stepping on each other's toes: one thread adds a record, then the next thread tries to add the same one, and ... I get all kinds of interesting SQL Server deadlocks.
Here is the code:
int skip = 0;
List<int> thisBatch;
Queue<List<int>> allBatches = new Queue<List<int>>();
do
{
    thisBatch = allIds
        .Skip(skip)
        .Take(numberOfRecordsToPullFromDBAtATime).ToList();
    allBatches.Enqueue(thisBatch);
    skip += numberOfRecordsToPullFromDBAtATime;
} while (thisBatch.Count() > 0);

while (allBatches.Count() > 0)
{
    RRDataContext rrdc = new RRDataContext();
    var currentBatch = allBatches.Dequeue();
    lock (locker)
    {
        runningTasks++;
    }
    System.Threading.ThreadPool.QueueUserWorkItem(x =>
        ProcessBatch(currentBatch, rrdc));
    lock (locker)
    {
        while (runningTasks > MAX_NUMBER_OF_THREADS)
        {
            Monitor.Wait(locker);
            UpdateGUI();
        }
    }
}
Here is ProcessBatch:

private static void ProcessBatch(
    List<int> currentBatch, RRDataContext rrdc)
{
    var topRecords = GetTopRecords(rrdc, currentBatch);
    CalculateDetails(rrdc, topRecords);
    rrdc.Dispose();
    lock (locker)
    {
        runningTasks--;
        Monitor.Pulse(locker);
    }
}
And GetTopRecords:

private static List<Record> GetTopRecords(RecipeRelationshipsDataContext rrdc,
    List<int> thisBatch)
{
    List<Record> topRecords;
    topRecords = rrdc.Records
        .Where(x => thisBatch.Contains(x.Id))
        .OrderBy(x => x.OrderByMe).ToList();
    return topRecords;
}
CalculateDetails is best explained by the pseudocode at the top.
I figure there must be a better way to do this. Please help. Many thanks!

Overview
The root of your problem is that the L2S DataContext, like Entity Framework's ObjectContext, is not thread-safe. As noted elsewhere, support for asynchronous operations in .NET ORM solutions was still pending as of .NET 4.0; you will have to roll your own solution, which, as you have discovered, is not always easy to do when your framework assumes single-threadedness.
I will take this opportunity to note that L2S is built on top of ADO.NET, which itself fully supports asynchronous operation; personally, I would rather deal directly with that underlying layer and write the SQL myself, just to make sure I fully understand what is happening over the network.
A SQL Server solution?
All that said, I have to ask: does this have to be a C# solution? If you can compose your solution out of a set of insert/update statements, you can just send the SQL over directly, and your threading and performance problems vanish.* It seems to me that your problem is not really about the actual data transformations, but centers around making them performant from .NET. If .NET is removed from the equation, your task becomes simpler. After all, the best solution is often the one that has you writing the smallest amount of code, right? ;)
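To make that concrete, here is a set-based sketch of the child1 step from the pseudocode at the top. It assumes a hypothetical main_quality(main_id, quality) view or table exposing each main record's child1-like qualities, and a quality column on child1; every name here is illustrative, not taken from the real schema.

```sql
-- Sketch only: main_quality, quality, main_id, and child1_id are
-- illustrative names, not part of the original schema.

-- 1) Create any child1 rows that don't exist yet, in one statement.
INSERT INTO child1 (quality)
SELECT DISTINCT mq.quality
FROM main_quality mq
WHERE NOT EXISTS (SELECT 1 FROM child1 c WHERE c.quality = mq.quality);

-- 2) Connect main rows to the (now guaranteed to exist) child1 rows.
INSERT INTO main_child1_relationship (main_id, child1_id)
SELECT mq.main_id, c.id
FROM main_quality mq
JOIN child1 c ON c.quality = mq.quality
WHERE NOT EXISTS (
    SELECT 1 FROM main_child1_relationship r
    WHERE r.main_id = mq.main_id AND r.child1_id = c.id);
```

Repeated for child2 and child3, this replaces the entire per-record loop with six statements, and the uniqueness constraints on the child tables are never racing against concurrent threads.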
Even if your update/insert logic cannot be expressed in a strictly set-relational manner, SQL Server does have a built-in mechanism for iterating over records and performing logic: although they are justly maligned for many use cases, cursors may in fact be an appropriate fit for your task.
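For logic that genuinely must go row by row, a cursor-based sketch of the same child1 step could look like the following (again, main_quality and the column names are illustrative stand-ins):

```sql
-- Sketch only; table and column names are illustrative.
DECLARE @main_id int, @quality nvarchar(20), @child1_id int;

DECLARE quality_cursor CURSOR FAST_FORWARD FOR
    SELECT main_id, quality FROM main_quality;

OPEN quality_cursor;
FETCH NEXT FROM quality_cursor INTO @main_id, @quality;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Find the matching child1 record, creating it if necessary.
    SET @child1_id = NULL;
    SELECT @child1_id = id FROM child1 WHERE quality = @quality;
    IF @child1_id IS NULL
    BEGIN
        INSERT INTO child1 (quality) VALUES (@quality);
        SET @child1_id = SCOPE_IDENTITY();
    END

    -- Connect the two records.
    INSERT INTO main_child1_relationship (main_id, child1_id)
    VALUES (@main_id, @child1_id);

    FETCH NEXT FROM quality_cursor INTO @main_id, @quality;
END
CLOSE quality_cursor;
DEALLOCATE quality_cursor;
```

Because everything runs inside the server, there is no per-row network round trip and no cross-thread contention.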
If this is a task you will have to perform repeatedly, you would benefit greatly from coding it as a stored procedure.

*Of course, long-running SQL brings its own problems, like lock escalation and index usage, that you will have to contend with.
A C# solution

Of course, it may be impossible to do this in SQL: maybe your code's decisions depend on data that comes from elsewhere, for example, or your project has a strict "no SQL allowed" convention. You mention some typical multithreading bugs, but without seeing the code I can't really help with them specifically.

Doing this from C# is obviously viable, but you need to deal with the fact that a fixed amount of latency will exist for each and every call you make. You can mitigate the effects of network latency by using pooled connections, enabling multiple active result sets (MARS), and using the asynchronous Begin/End methods to execute your queries. Even with all of those, you will still have to accept that there is a cost to shipping data from SQL Server to your application.
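As a rough illustration of those three mitigations together, here is a sketch using classic ADO.NET: a pooled connection with MARS enabled in the connection string, and a query executed via the asynchronous Begin/End methods. The connection string and query are placeholders, not real values from the question.

```csharp
using System;
using System.Data.SqlClient;

class AsyncQueryExample
{
    // Illustrative connection string. "MultipleActiveResultSets" enables MARS;
    // "Asynchronous Processing" is required for Begin/End on pre-4.5 .NET.
    const string ConnStr =
        "Data Source=.;Initial Catalog=MyDb;Integrated Security=true;" +
        "MultipleActiveResultSets=true;" +
        "Asynchronous Processing=true";

    static void Main()
    {
        using (var conn = new SqlConnection(ConnStr)) // pooled by default
        using (var cmd = new SqlCommand("SELECT id FROM main", conn))
        {
            conn.Open();
            IAsyncResult ar = cmd.BeginExecuteReader();
            // ... do other useful work while the query runs on the server ...
            using (SqlDataReader reader = cmd.EndExecuteReader(ar))
            {
                while (reader.Read())
                {
                    // process each row
                }
            }
        }
    }
}
```

None of this removes the per-call latency; it only overlaps it with other work, which is why the set-based SQL approach above is still worth considering first.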
private IList<int> GetMainIds()
{
    using (var context = new MyDataContext())
        return context.Main.Select(m => m.Id).ToList();
}

private void FixUpSingleRecord(int mainRecordId)
{
    using (var localContext = new MyDataContext())
    {
        var main = localContext.Main.FirstOrDefault(m => m.Id == mainRecordId);
        if (main == null)
            return;
        foreach (var childOneQuality in main.ChildOneQualities)
        {
            // If child one is not found, create it
            // Create the relationship if needed
        }
        // Repeat for ChildTwo and ChildThree
        localContext.SubmitChanges();
    }
}

public void FixUpMain()
{
    var ids = GetMainIds();
    foreach (var id in ids)
    {
        var localId = id; // Avoid closing over the iteration variable
        ThreadPool.QueueUserWorkItem(delegate { FixUpSingleRecord(localId); });
    }
}
BEGIN TRAN

DECLARE @mutex_result int;
EXEC @mutex_result = sp_getapplock @Resource = 'CheckSetFileTransferLock',
    @LockMode = 'Exclusive';

IF (@mutex_result < 0)
BEGIN
    ROLLBACK TRAN
    RETURN -- bail out; the lock could not be acquired
END

-- do some stuff

EXEC @mutex_result = sp_releaseapplock @Resource = 'CheckSetFileTransferLock'

COMMIT TRAN
using (var dc = new TestDataContext())
{
    // Get all the ids of interest.
    // I assume you mark successfully updated rows in some way
    // in the update transaction.
    List<int> ids = dc.TestItems.Where(...).Select(item => item.Id).ToList();
    var problematicIds = new List<ErrorType>();
    // Either allow the Task Parallel Library to select what it considers
    // the optimum degree of parallelism by omitting the
    // ParallelOptions parameter, or specify what you want.
    Parallel.ForEach(ids, new ParallelOptions {MaxDegreeOfParallelism = 8},
        id => CalculateDetails(id, problematicIds));
}
private static void CalculateDetails(int id, List<ErrorType> problematicIds)
{
    try
    {
        // Handle deadlocks
        DeadlockRetryHelper.Execute(() => CalculateDetails(id));
    }
    catch (Exception e)
    {
        // Too many deadlock retries (or other exception).
        // Record so we can diagnose the problem or retry later.
        // List<T> is not thread-safe, so guard the Add with a lock
        // (or use a concurrent collection instead).
        lock (problematicIds)
        {
            problematicIds.Add(new ErrorType(id, e));
        }
    }
}
private static void CalculateDetails(int id)
{
    // Creating a new DataContext is not expensive.
    // No need to create it outside of this method.
    using (var dc = new TestDataContext())
    {
        // TODO: adjust the IsolationLevel to minimize deadlocks.
        // If you don't need to change the isolation level
        // then you can remove the TransactionScope altogether.
        using (var scope = new TransactionScope(
            TransactionScopeOption.Required,
            new TransactionOptions {IsolationLevel = IsolationLevel.Serializable}))
        {
            TestItem item = dc.TestItems.Single(i => i.Id == id);
            // work done here
            dc.SubmitChanges();
            scope.Complete();
        }
    }
}
public static class DeadlockRetryHelper
{
    private const int MaxRetries = 4;
    private const int SqlDeadlock = 1205;

    public static void Execute(Action action, int maxRetries = MaxRetries)
    {
        if (HasAmbientTransaction())
        {
            // A deadlock blows away the containing transaction,
            // so there is no point retrying if we're already in one.
            action();
            return;
        }
        int retries = 0;
        while (retries < maxRetries)
        {
            try
            {
                action();
                return;
            }
            catch (Exception e)
            {
                if (IsSqlDeadlock(e))
                {
                    retries++;
                    // Delay subsequent retries - not sure if this helps or not
                    Thread.Sleep(100 * retries);
                }
                else
                {
                    throw;
                }
            }
        }
        // Final attempt: let any remaining deadlock propagate to the caller.
        action();
    }

    private static bool HasAmbientTransaction()
    {
        return Transaction.Current != null;
    }

    private static bool IsSqlDeadlock(Exception exception)
    {
        if (exception == null)
        {
            return false;
        }
        var sqlException = exception as SqlException;
        if (sqlException != null && sqlException.Number == SqlDeadlock)
        {
            return true;
        }
        if (exception.InnerException != null)
        {
            return IsSqlDeadlock(exception.InnerException);
        }
        return false;
    }
}
CREATE TABLE closet (id int PRIMARY KEY, xmldoc ntext)
CREATE TABLE shoe(id int PRIMARY KEY IDENTITY, color nvarchar(20))
CREATE TABLE closet_shoe_relationship (
closet_id int REFERENCES closet(id),
shoe_id int REFERENCES shoe(id)
)
INSERT INTO closet(id, xmldoc) VALUES (1, '<ROOT><shoe><color>blue</color></shoe></ROOT>')
INSERT INTO closet(id, xmldoc) VALUES (2, '<ROOT><shoe><color>red</color></shoe></ROOT>')
INSERT INTO shoe(color)
SELECT DISTINCT CAST(CAST(xmldoc AS xml).query('//shoe/color/text()') AS nvarchar) AS color
FROM closet

INSERT INTO closet_shoe_relationship(closet_id, shoe_id)
SELECT closet.id, shoe.id
FROM shoe JOIN closet
ON CAST(CAST(closet.xmldoc AS xml).query('//shoe/color/text()') AS nvarchar) = shoe.color
INSERT INTO shoe(color)
SELECT DISTINCT CAST(xmldoc.query('//shoe/color/text()') AS nvarchar)
FROM closet
INSERT INTO closet_shoe_relationship(closet_id, shoe_id)
SELECT closet.id, shoe.id
FROM shoe JOIN closet
ON CAST(xmldoc.query('//shoe/color/text()') AS nvarchar) = shoe.color