.NET: What can cause so many unstarted threads?

I have run into a strange bug.

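My application is a WinForms client that connects to a server through WCF. It references several .NET and C++ modules/DLLs. For some reason, I set ThreadPool.SetMaxThreads(150, 200) in my code. After running for several hours, the client disconnects from the server.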

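After debugging with WinDbg, I found that the thread pool had filled up with many strange threads. No new threads could be created in the pool, and I believe WCF could not create a thread to connect to the server either, which caused the disconnect.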
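
One way to watch for pool exhaustion from inside the process, rather than from a dump, is to log the pool's counters periodically. The following is only a minimal sketch, not the asker's code: the names PoolMonitor and Snapshot are made up for illustration, and it relies only on the documented ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads calls. These counters reflect the pool's own accounting and may not line up exactly with the XXXX entries shown by !threads.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the original post): reports how many
// thread pool threads are in use versus the configured maximum.
public static class PoolMonitor
{
    // Returns a one-line summary in the form "worker used/max, IOCP used/max".
    public static string Snapshot()
    {
        int maxWorker, maxIocp, freeWorker, freeIocp;
        ThreadPool.GetMaxThreads(out maxWorker, out maxIocp);
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIocp);

        // "In use" is simply maximum minus available.
        return string.Format("worker {0}/{1}, IOCP {2}/{3}",
            maxWorker - freeWorker, maxWorker,
            maxIocp - freeIocp, maxIocp);
    }

    public static void Main()
    {
        // The limits from the question. SetMaxThreads returns false when the
        // runtime rejects the values (for example, below the processor count).
        bool accepted = ThreadPool.SetMaxThreads(150, 200);
        Console.WriteLine("SetMaxThreads accepted: " + accepted);
        Console.WriteLine(Snapshot());
    }
}
```

Calling Snapshot() from a timer and writing the result to the application log would show whether the in-use worker count creeps toward the maximum in the hours before the disconnect.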

These strange threads look like this:

                                                                         Lock  
      ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt
XXXX   3  cb8 0043afd8      1400 Preemptive  00000000:00000000 003f3248 0     Ukn 
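From what I have read, the most likely cause of these threads is that the CLR created a new thread in the thread pool and that thread was never resumed.

I would like to know why or how one or more threads could fail to resume.

Here are more technical details:

When the CLR creates a new thread in the thread pool, it calls the SetupUnstartedThread method and the CreateNewThread/CreateNewOSThread method.

After SetupUnstartedThread, the CLR has created a thread like this: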

                                                                         Lock  
      ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt
XXXX   3    0 0043afd8      1400 Preemptive  00000000:00000000 003f3248 0     Ukn 
It has state 0x1400 (TS_Unstarted | TS_WeOwn) and has neither an OSID nor a debugger ID (XXXX).
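
To make the State column easier to read, the value can be split into its flag bits. The sketch below is illustrative only: the class name ThreadStateDecoder is invented, and the bit values are assumptions based on the Thread::ThreadState enum in the publicly available CLR sources (threads.h), with only a handful of flags included. It is consistent with the 0x1400 = TS_Unstarted | TS_WeOwn reading given above.

```csharp
using System;
using System.Collections.Generic;

// Illustrative decoder for the "State" column of !threads.
// Bit values are assumed from the CLR's threads.h; only a subset is listed.
public static class ThreadStateDecoder
{
    private static readonly KeyValuePair<uint, string>[] Flags =
    {
        new KeyValuePair<uint, string>(0x00000020, "TS_LegalToJoin"),
        new KeyValuePair<uint, string>(0x00000200, "TS_Background"),
        new KeyValuePair<uint, string>(0x00000400, "TS_Unstarted"),
        new KeyValuePair<uint, string>(0x00000800, "TS_Dead"),
        new KeyValuePair<uint, string>(0x00001000, "TS_WeOwn"),
    };

    public static string Decode(uint state)
    {
        var names = new List<string>();
        uint rest = state;
        foreach (var flag in Flags)
        {
            if ((state & flag.Key) != 0)
            {
                names.Add(flag.Value);
                rest &= ~flag.Key;   // clear the bit we just named
            }
        }
        // Report anything this small table does not cover instead of hiding it.
        if (rest != 0)
        {
            names.Add("0x" + rest.ToString("x") + " (bits not in this table)");
        }
        return string.Join(" | ", names);
    }

    public static void Main()
    {
        Console.WriteLine(Decode(0x1400));   // prints "TS_Unstarted | TS_WeOwn"
        Console.WriteLine(Decode(0x39820));  // a dead thread's state from the answer's output
    }
}
```

For 0x39820 the decoder reports TS_LegalToJoin, TS_Dead and TS_WeOwn plus the remaining bits 0x38000, which shows why XXXX can mean either "unstarted" or "dead" depending on the state value.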

After CreateNewThread/CreateNewOSThread, the thread becomes:

                                                                         Lock  
      ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt
XXXX   3  cb8 0043afd8      1400 Preemptive  00000000:00000000 003f3248 0     Ukn 
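It now has an OSID but still no debugger ID (XXXX).

In addition, the thread's ExposedObject field is null.

If the thread is resumed successfully, however, it gets a debugger ID (2).

Its state also differs from that of the faulty threads (the ones without a debugger ID).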

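Edit, to Thomas W

If the option (c) you mentioned is

(c) special OS threads in the CLR that may run managed code

then, as I understand it, when an OS thread wants to run managed code the CLR calls the SetupThread method, which runs the following code: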

// reset any unstarted bits on the thread object
FastInterlockAnd((ULONG *) &pThread->m_State, ~Thread::TS_Unstarted);
FastInterlockOr((ULONG *) &pThread->m_State, Thread::TS_LegalToJoin);
After that, the state is definitely not 0x1400.

None of the strange threads has a counterpart in the ~ thread list, so you cannot see them in !runaway.

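Edit 2

Sorry for the late update to this post. I have not found the root cause yet, but I found a workaround: replacing .NET Framework 4.0 with .NET Framework 4.5.

The following describes in detail how I found the workaround.

At one point I traced the whole lifetime of one of these strange threads. As we know, the CLR has a gate thread. When my application started to go wrong, the gate thread called clr!ThreadpoolMgr::CreateWorkerThread periodically, which created a new CLR thread object and a new OS thread object: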

0:004> k
ChildEBP RetAddr  
04c8f6f8 6f3ea8ff KERNEL32!CreateThreadStub
04c8f744 6f3ea77b clr!Thread::CreateNewOSThread+0xba
04c8f78c 6f3eabc1 clr!Thread::CreateNewThread+0xa9
04c8f81c 6f4a6aed clr!ThreadpoolMgr::CreateUnimpersonatedThread+0xbb
04c8f83c 6f4a560e clr!ThreadpoolMgr::CreateWorkerThread+0x19
04c8f864 6f4a4457 clr!ThreadpoolMgr::EnsureEnoughWorkersWorking+0x116
04c8f94c 75973c45 clr!ThreadpoolMgr::GateThreadStart+0x431
04c8f958 771a37f5 KERNEL32!BaseThreadInitThunk+0xe
04c8f998 771a37c8 ntdll!__RtlUserThreadStart+0x70
04c8f9b0 00000000 ntdll!_RtlUserThreadStart+0x1b
The new thread looks like this:

                                                                         Lock  
      ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt
XXXX   3    0 0043afd8      1400 Preemptive  00000000:00000000 003f3248 0     Ukn 
                                                                         Lock  
      ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt
XXXX   3  cb8 0043afd8      1400 Preemptive  00000000:00000000 003f3248 0     Ukn 
I had guessed that this thread might never be resumed. It turned out I was wrong. After a while, the thread went through ntdll!LdrInitializeThunk and ntdll!_RtlUserThreadStart:

0:065> k
ChildEBP RetAddr  
1d54f7c0 75973c45 clr!Thread::intermediateThreadProc
1d54f7cc 771a37f5 KERNEL32!BaseThreadInitThunk+0xe
1d54f80c 771a37c8 ntdll!__RtlUserThreadStart+0x70
1d54f824 00000000 ntdll!_RtlUserThreadStart+0x1b
                                                                         Lock  
      ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt
  65   3  cb8 0043afd8      1400 Preemptive  00000000:00000000 003f3248 0     Ukn 
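After checking the parameter of clr!Thread::intermediateThreadProc, I found that this thread would call clr!ThreadpoolMgr::WorkerThreadStart.

Then the surprising part happened.

After clr!ThreadpoolMgr::WorkerThreadStart has finished, clr!ThreadStore::RemoveThread should normally be called by the finalizer thread before the thread dies, but this time it was not.

There was no clr!ThreadStore::RemoveThread, just: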

0:065> k
ChildEBP RetAddr  
1889fb04 7716f73a ntdll!LdrpCallInitRoutine+0x14
1889fba8 7716f63b ntdll!LdrShutdownThread+0xe6
1889fbb8 75973c4c ntdll!RtlExitUserThread+0x2a
1889fbc4 771a37f5 KERNEL32!BaseThreadInitThunk+0x15
1889fc04 771a37c8 ntdll!__RtlUserThreadStart+0x70
1889fc1c 00000000 ntdll!_RtlUserThreadStart+0x1b
So the corresponding OS thread was destroyed, but the CLR thread object still exists:

                                                                         Lock  
      ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt
XXXX   3  cb8 0043afd8      1400 Preemptive  00000000:00000000 003f3248 0     Ukn 
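You may ask why the thread's state did not change. For some reason, I did not step into clr!ThreadpoolMgr::WorkerThreadStart at that time, so I cannot give a definite answer, but I read the source again and made another guess.

clr!ThreadpoolMgr::WorkerThreadStart calls clr!SetupThreadPoolThreadNoThrow. Here is the code of clr!SetupThreadPoolThreadNoThrow: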

EX_TRY
{
    pThread = SetupThreadPoolThread(typeTPThread);
}
EX_CATCH
{
    if (pHR)
    {
        *pHR = GET_EXCEPTION()->GetHR();
    }
}
EX_END_CATCH(SwallowAllExceptions);
Note the SwallowAllExceptions. You can also see that this method calls clr!SetupThreadPoolThread. Here is that code snippet:

if (NULL == (pThread = GetThread()))
{
    pThread = SetupInternalThread();
}
if ((pThread != NULL) && ((pThread->m_State & Thread::TS_ThreadPoolThread) == 0))
{

    if (typeTPThread == WorkerThread)
    {
        FastInterlockOr((ULONG *) &pThread->m_State, Thread::TS_ThreadPoolThread | Thread::TS_TPWorkerThread);
    }
    else if (typeTPThread == CompletionPortThread)
    {
        FastInterlockOr ((ULONG *) &pThread->m_State, Thread::TS_ThreadPoolThread | Thread::TS_CompletionPortThread);
    }
    else
    {
        FastInterlockOr((ULONG *) &pThread->m_State, Thread::TS_ThreadPoolThread);
    }
}
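So my guess is that if an exception occurs while calling clr!SetupInternalThread, the thread's state never gets a chance to change.

This was the first time I suspected that there might be a small defect in the .NET Framework that my application triggers. Meanwhile, a colleague told me that he could not reproduce the bug. After checking his environment, I found that he was using .NET Framework 4.5.

So far, after upgrading the .NET Framework, the bug has not reappeared.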

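SSCCE for analyzing threads

To see how .NET creates managed threads and marks them as XXXX, you can run the following code. Compile the application as a Debug build, start WinDbg and run the application under the debugger. At the initial breakpoint, run the following command: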

sxe -c ".loadby sos clr;g" ld clr.dll;.ocommand OCOMMAND;g
The application then effectively debugs itself, and you will see the threads change:

Step                .NET threads  Unstarted  Dead     Thread objects  Native threads
1 (before started)  2             0          0        1               4
2 (Thread started)  3             1 (XXX)    0        2               5
3 (Thread running)  3             0          0        3               8
4 (Thread ended)    3             0          1 (XXX)  2               7
5 (GC ran)          3             0          1 (XXX)  2               4
SSCCE code:

using System;
using System.Diagnostics;
using System.Threading;

namespace ManagedThreadDebug
{
    class Program
    {
        static void Main()
        {
            InformDebug("Before creating thread object.");

            var t = new Thread(ThreadRun);
            InformDebug("After creating thread object and calling Start().");

            t.Start();
            InformDebug("While thread is running.");

            t.Join();
            InformDebug("After thread was running (GC potentially not run yet).");

            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            Thread.Sleep(10);
            GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
            Thread.Sleep(10);
            InformDebug("After thread was running (GC hopefully ran).");
        }

        private static void ThreadRun()
        {
            Thread.Sleep(1000);
        }

        private static void InformDebug(string message)
        {
            Console.WriteLine(message);
            Trace.WriteLine("OCOMMAND .echo >>> "+message+";!threads;.echo;!dumpheap -stat -type Thread;.echo;~;g");
        }
    }
}
Almost complete output, shortened for brevity:

>>> Before creating thread object.
ThreadCount:      2
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       0
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 1074 00441310     2a020 Preemptive  02796F48:00000000 00408378 1     MTA 
   2    2 1fb8 00411258     2b220 Preemptive  00000000:00000000 00408378 0     MTA (Finalizer) 

Statistics:
      MT    Count    TotalSize Class Name
69f02e64        1           52 System.Threading.Thread

.  0  Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
   1  Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
   2  Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
   3  Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen

>>> After creating thread object and calling Start().
ThreadCount:      3
UnstartedThread:  1
BackgroundThread: 1
PendingThread:    0
DeadThread:       0
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 1074 00441310     2a020 Preemptive  02797334:00000000 00408378 1     MTA 
   2    2 1fb8 00411258     2b220 Preemptive  00000000:00000000 00408378 0     MTA (Finalizer) 
XXXX    3    0 00474900      1400 Preemptive  00000000:00000000 00408378 0     Ukn 

Statistics:
      MT    Count    TotalSize Class Name
69f02e64        2          104 System.Threading.Thread

.  0  Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
   1  Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
   2  Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
   3  Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen
   4  Id: b78.27d8 Suspend: 1 Teb: 7efac000 Unfrozen

>>> While thread is running.
ThreadCount:      3
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       0
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 1074 00441310     2a020 Preemptive  02797550:00000000 00408378 1     MTA 
   2    2 1fb8 00411258     2b220 Preemptive  00000000:00000000 00408378 0     MTA (Finalizer) 
   6    3 1d04 00474900     2b020 Preemptive  00000000:00000000 00408378 1     MTA 

Statistics:
      MT    Count    TotalSize Class Name
69f02e64        2          104 System.Threading.Thread

.  0  Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
   1  Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
   2  Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
   3  Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen
   4  Id: b78.27d8 Suspend: 1 Teb: 7efac000 Unfrozen
   5  Id: b78.2478 Suspend: 1 Teb: 7efa9000 Unfrozen
   6  Id: b78.1d04 Suspend: 1 Teb: 7efa6000 Unfrozen
   7  Id: b78.1fdc Suspend: 1 Teb: 7efa3000 Unfrozen

>>> After thread was running (GC potentially not run yet).
ThreadCount:      3
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       1
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 1074 00441310     2a020 Preemptive  027977FC:00000000 00408378 1     MTA 
   2    2 1fb8 00411258     2b220 Preemptive  00000000:00000000 00408378 0     MTA (Finalizer) 
XXXX    3    0 00474900     39820 Preemptive  00000000:00000000 00408378 0     Ukn 

Statistics:
      MT    Count    TotalSize Class Name
69f02e64        2          104 System.Threading.Thread

.  0  Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
   1  Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
   2  Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
   3  Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen
   4  Id: b78.27d8 Suspend: 1 Teb: 7efac000 Unfrozen
   5  Id: b78.2478 Suspend: 1 Teb: 7efa9000 Unfrozen
   7  Id: b78.1fdc Suspend: 1 Teb: 7efa3000 Unfrozen

>>> After thread was running (GC hopefully ran).
ThreadCount:      3
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       1
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 1074 00441310     2a020 Preemptive  02797380:00000000 00408378 1     MTA 
   2    2 1fb8 00411258     2b220 Preemptive  00000000:00000000 00408378 0     MTA (Finalizer) 
XXXX    3    0 00474900     39820 Preemptive  00000000:00000000 00408378 0     Ukn 

Statistics:
      MT    Count    TotalSize Class Name
69f02e64        2          104 System.Threading.Thread

.  0  Id: b78.1074 Suspend: 1 Teb: 7efdd000 Unfrozen
   1  Id: b78.2194 Suspend: 1 Teb: 7efda000 Unfrozen
   2  Id: b78.1fb8 Suspend: 1 Teb: 7efd7000 Unfrozen
   3  Id: b78.1500 Suspend: 1 Teb: 7efaf000 Unfrozen
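Conclusion

A thread displayed as XXXX can be either an unstarted thread or a dead thread. You probably won't like this answer: unless you show us some code, it is impossible to tell where those threads come from. Potential candidates:

  • new Thread() statements in your code
  • use of Parallel.For and similar
  • use of the thread pool
  • code in third-party libraries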
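
As a small illustration of the first candidate, the managed API reports the same lifecycle that !threads shows from the native side. This is a minimal sketch and not part of the original answer; the class name UnstartedDemo and the method Observe are made up. A Thread object that has been constructed but not started reports ThreadState.Unstarted, and after the thread procedure has returned it reports ThreadState.Stopped.

```csharp
using System;
using System.Threading;

// Minimal illustration (hypothetical names) of the managed view of a
// thread's lifecycle: Unstarted before Start(), Stopped after it finishes.
public static class UnstartedDemo
{
    public static ThreadState[] Observe()
    {
        var t = new Thread(() => Thread.Sleep(50));
        ThreadState before = t.ThreadState;   // created, Start() not called yet
        t.Start();
        t.Join();                             // wait until the thread has ended
        ThreadState after = t.ThreadState;    // thread procedure has returned
        return new[] { before, after };
    }

    public static void Main()
    {
        ThreadState[] states = Observe();
        Console.WriteLine("Before Start(): " + states[0]);  // Unstarted
        Console.WriteLine("After Join():   " + states[1]);  // Stopped
    }
}
```

Code that creates Thread objects and never calls Start(), or that keeps references to finished threads, would therefore be a plausible source of XXXX entries. Whether that is what happens in the application from the question cannot be told from the dump alone.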

Debugging thread start and exit

Run the application in WinDbg and break whenever a thread starts or exits:

sxe ct;sxe et
Then look at where this happens, and in particular check