C#: Speeding up conversion of RTF to plain text


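I have to convert a large amount of text that is stored in a database in RTF format to plain text. I am using the method below, but I think I have hit a roadblock (I don't think it is in my code, but in the .NET Framework itself).

I have the following function: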

    //convert RTF text to plain text
    public static string RtfTextToPlainText(string FormatObject)
    {
        System.Windows.Forms.RichTextBox rtfBox = new System.Windows.Forms.RichTextBox();
        rtfBox.Rtf = FormatObject;
        FormatObject = rtfBox.Text; //This is line 494 for later reference for the stack traces.
        rtfBox.Dispose();

        return FormatObject;
    }
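It should be completely self-contained and not block on anything. The project I am working on has millions of records to process, so I do the work in batches and use Tasks for parallel processing. It was still fairly slow, so I hit break-all in the debugger and found this code.

Here is the call stack of the waiting task: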

[In a sleep, wait, or join] 
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.CreateHandle(System.Windows.Forms.CreateParams cp) + 0x242 bytes 
System.Windows.Forms.dll!System.Windows.Forms.Control.CreateHandle() + 0x2b2 bytes  
System.Windows.Forms.dll!System.Windows.Forms.TextBoxBase.CreateHandle() + 0x54 bytes   
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.Rtf.set(string value) + 0x68 bytes    
>CvtCore.dll!CvtCore.StandardFunctions.Str.RtfTextToPlainText(object Expression) Line 494   C#
Here is the call stack for thread 816:

[Managed to Native Transition]  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DefWndProc(ref System.Windows.Forms.Message m) + 0x9e bytes  
System.Windows.Forms.dll!System.Windows.Forms.Control.WmWindowPosChanged(ref System.Windows.Forms.Message m) + 0x39 bytes   
System.Windows.Forms.dll!System.Windows.Forms.Control.WndProc(ref System.Windows.Forms.Message m) + 0x51b bytes 
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.WndProc(ref System.Windows.Forms.Message m) + 0x5c bytes  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg, System.IntPtr wparam, System.IntPtr lparam) + 0x15e bytes    
[Native to Managed Transition]  
[Managed to Native Transition]  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DefWndProc(ref System.Windows.Forms.Message m) + 0x9e bytes  
System.Windows.Forms.dll!System.Windows.Forms.Control.WmCreate(ref System.Windows.Forms.Message m) + 0x1c bytes 
System.Windows.Forms.dll!System.Windows.Forms.Control.WndProc(ref System.Windows.Forms.Message m) + 0x50b bytes 
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.WndProc(ref System.Windows.Forms.Message m) + 0x5c bytes  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg, System.IntPtr wparam, System.IntPtr lparam) + 0x15e bytes    
[Native to Managed Transition]  
[Managed to Native Transition]  
System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.CreateHandle(System.Windows.Forms.CreateParams cp) + 0x44c bytes 
System.Windows.Forms.dll!System.Windows.Forms.Control.CreateHandle() + 0x2b2 bytes  
System.Windows.Forms.dll!System.Windows.Forms.TextBoxBase.CreateHandle() + 0x54 bytes   
System.Windows.Forms.dll!System.Windows.Forms.RichTextBox.Rtf.set(string value) + 0x68 bytes    
>CvtCore.dll!CvtCore.StandardFunctions.Str.RtfTextToPlainText(object Expression) Line 494   C#
Why is task 2 blocking task 4 at line 494? Shouldn't they be completely independent of each other?


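Note

I grabbed these stack traces and the screenshot in release mode, and I can't seem to hit pause at the right moment to make the same thing happen in debug mode. Could this also be the cause of my slowness? The profiler says my program spends 83.2% of its time in System.Windows.Forms.RichTextBox.set_Rtf(string) (a sub-function called from line 494).

Any advice on how to speed up the process of stripping RTF formatting would be greatly appreciated.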


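P.S.

I am currently rewriting it so that each thread has its own text box that is never disposed, instead of creating a new one on every call to the function. I expect this to speed things up considerably, and I will update with details when I am done.


Update

I solved my own problem (see the answer below), but here is how I start the tasks: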

//create start consumer threads
for (int i = 0; i < ThreadsPreProducer; i++)
{
    //create worker and task
    WorkerObject NewWorkerObject = new WorkerObject(colSource, FormatObjectEvent, UpdateModule);
    Task WorkerTask = new Task(NewWorkerObject.DoWork);
    WorkerTasks.Add(WorkerTask);
    WorkerTask.Start();
}


//create/start producer thread
ProducerObject NewProducerObject = new ProducerObject(colSource, SourceQuery, ConnectionString, PreProcessor, UpdateModule, RowNameIndex);
Task ProducerTask = new Task(NewProducerObject.DoWork);
WorkerTasks.Add(ProducerTask);
ProducerTask.Start();


//block while producer runs
ProducerTask.Wait();

//create post producer threads
for (int i = 0; i < ThreadsPostProducer; i++)
{
    //create worker and task
    WorkerObject NewWorkerObject = new WorkerObject(colSource, FormatObjectEvent, UpdateModule);
    Task WorkerTask = new Task(NewWorkerObject.DoWork);
    WorkerTasks.Add(WorkerTask);
    WorkerTask.Start();
}

//block until all tasks are done
Task.WaitAll(WorkerTasks.ToArray());

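In my case it uses a producer/consumer model with 1 producer and 4 consumers (2 are started when the producer starts, and 2 more are started after the producer finishes, to speed up the work once the system resources held by the producer are freed).

Changing the function to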
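The producer/consumer arrangement described above can be sketched roughly as follows. This is only an illustration under assumptions: the post does not show what `colSource`, `WorkerObject`, or `ProducerObject` look like, so the sketch uses a `BlockingCollection<string>` as the shared queue, and `PipelineSketch`, `Process`, and `Run` are hypothetical names. It also starts all consumers up front rather than adding extra ones after the producer finishes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class PipelineSketch
{
    // Hypothetical stand-in for the real per-record work (for example, RTF stripping).
    static string Process(string record)
    {
        return record.Trim();
    }

    // Runs one producer and consumerCount consumers; returns the number of records processed.
    public static int Run(IEnumerable<string> source, int consumerCount)
    {
        int processed = 0;

        // Bounded queue so the producer cannot run far ahead of the consumers.
        using (var queue = new BlockingCollection<string>(1000))
        {
            var consumers = new List<Task>();
            for (int i = 0; i < consumerCount; i++)
            {
                consumers.Add(Task.Factory.StartNew(() =>
                {
                    // Blocks while the queue is empty; the loop ends once
                    // CompleteAdding has been called and the queue is drained.
                    foreach (string record in queue.GetConsumingEnumerable())
                    {
                        Process(record);
                        Interlocked.Increment(ref processed);
                    }
                }, TaskCreationOptions.LongRunning));
            }

            Task producer = Task.Factory.StartNew(() =>
            {
                try
                {
                    foreach (string record in source)
                    {
                        queue.Add(record);
                    }
                }
                finally
                {
                    // Always signal completion so consumers do not wait forever.
                    queue.CompleteAdding();
                }
            });

            producer.Wait();
            Task.WaitAll(consumers.ToArray());
        }

        return processed;
    }
}
```

`TaskCreationOptions.LongRunning` is a hint that each consumer may get a dedicated thread instead of a thread-pool thread; that can matter when per-thread state is cached, since a consumer then stays on one thread for its whole run.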

static ThreadLocal<RichTextBox> rtfBox = new ThreadLocal<RichTextBox>(() => new RichTextBox());
//convert RTF text to plain text
public static string RtfTextToPlainText(string FormatObject )
{
     rtfBox.Value.Rtf = FormatObject;
     FormatObject = rtfBox.Value.Text;
     rtfBox.Value.Clear();

     return FormatObject;
}
changed my run time from minutes to seconds.


I do not dispose the objects because they will be used for the whole lifetime of the program.
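For context, the stack traces in the question show each call going through `CreateHandle`, so a plausible reading is that creating a new control (and its native window handle) per record was the expensive, contended step. Keeping one control per thread means that cost is paid once per thread. The sketch below shows how the thread-local version might be driven over a batch; `RtfBatchSketch` and `ConvertAll` are hypothetical names, and it assumes a Windows machine with a reference to System.Windows.Forms.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

public static class RtfBatchSketch
{
    // One RichTextBox per thread, created lazily the first time that thread asks for it.
    static readonly ThreadLocal<RichTextBox> rtfBox =
        new ThreadLocal<RichTextBox>(() => new RichTextBox());

    // Convert RTF text to plain text, reusing the calling thread's control.
    public static string RtfTextToPlainText(string rtf)
    {
        RichTextBox box = rtfBox.Value;
        box.Rtf = rtf;
        string plain = box.Text;
        box.Clear(); // drop the content so large documents are not kept alive
        return plain;
    }

    // Hypothetical batch helper: converts every record in parallel.
    public static string[] ConvertAll(string[] rtfRecords)
    {
        var results = new string[rtfRecords.Length];
        Parallel.For(0, rtfRecords.Length, i =>
        {
            // Each worker thread only ever touches its own RichTextBox.
            results[i] = RtfTextToPlainText(rtfRecords[i]);
        });
        return results;
    }
}
```

Because the controls are never disposed, this pattern fits a batch program that exits when the work is done; a long-lived process with many short-lived threads would probably want to dispose each control when its thread finishes.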

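To get a complete picture of your code, could you post how you create the tasks?

@Ramhound I updated the question to show how I start the tasks, but I solved my problem by making the RTF box thread-local.

Good for you! Glad you got it working. +1 for posting the answer to your own question so others can benefit.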