How to trace a deadlock in Ruby

ruby multithreading

In Ruby 1.9, I use Process#fork to share a data source among several worker processes:

Thread.abort_on_exception = true

fork do
  puts "Initializing data source process... (PID: #{Process.pid})"
  data = DataSource.new(files)

  BrB::Service.start_service(:object => data, :verbose => false, :host => host, :port => port)
  EM.reactor_thread.join
end

The workers are forked as follows:

8.times do |t|  
  fork do
    data = BrB::Tunnel.create(nil, "brb://#{host}:#{port}", :verbose => false)

    puts "Launching #{threads_num} worker threads... (PID: #{Process.pid})"    

    threads = []
    threads_num.times { |i|
      threads << Thread.new {
        while true
          begin
            worker = Worker.new(data, config)

          rescue OutOfTargetsError
            break

          rescue Exception => e
            puts "An unexpected exception was caught: #{e.class} => #{e}"
            sleep 5

          end
        end
      }
    }
    threads.each { |t| t.join }

    data.stop_service
    EM.stop
  end
end

This works perfectly, but after running for about 10 minutes I get the following error:

bootstrap.rb:47:in `join': deadlock detected (fatal)
    from bootstrap.rb:47:in `block in <main>'
    from bootstrap.rb:39:in `fork'
    from bootstrap.rb:39:in `<main>'

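This error doesn't tell me where the deadlock actually happens; it only points me to the join on the EventMachine thread.

How can I trace back to the point where the program locks up?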

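It is locking on the join in the parent thread; that information is accurate. To trace where it locks in the child threads, try wrapping each thread's work in a timeout block. You will need to temporarily remove the catch-all rescue so that the timeout exception is actually raised.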
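To make the timeout suggestion concrete, here is a minimal sketch. It assumes the standard library's Timeout module; WorkerStuck, stuck_work, and trace_if_stuck are hypothetical names standing in for the real worker code, and the 0.2-second limit is arbitrary. Passing a custom exception class lets the exception surface with the backtrace of whatever the thread was doing when the timer fired.

```ruby
require 'timeout'

# Hypothetical exception class, used only to mark a timed-out worker.
class WorkerStuck < StandardError; end

# Hypothetical stand-in for the real worker body: blocks forever.
def stuck_work
  sleep
end

# Runs the block under a time limit. Returns nil if the block finishes,
# or the backtrace captured when the limit expired.
def trace_if_stuck(seconds)
  Timeout.timeout(seconds, WorkerStuck) { yield }
  nil
rescue WorkerStuck => e
  e.backtrace
end

# Run the stuck work in its own thread, as the workers in the question do.
trace = Thread.new { trace_if_stuck(0.2) { stuck_work } }.value
puts "Worker was stuck at:"
puts trace.first(5)
```

In the question's code, the rescue Exception clause inside the loop would swallow an exception like this one, which is why it has to come out while you are debugging.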

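Currently, the parent thread tries to join every thread in sequence, blocking until each one completes. However, each thread only finishes, and so only becomes joinable, when an OutOfTargetsError is raised. You may be able to avoid the deadlock by using short-lived threads and moving the while loop into the parent. No guarantees, but perhaps something like this would work: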

8.times do |t|  
  fork do
    running = true
    Signal.trap("INT") do
      puts "Interrupt signal received, waiting for threads to finish..."
      running = false
    end

    data = BrB::Tunnel.create(nil, "brb://#{host}:#{port}", :verbose => false)

    puts "Launching max #{threads_num} worker threads... (PID: #{Process.pid})"    

    threads = []
    while running
      # Start new threads until we have threads_num running
      until threads.length >= threads_num do
        threads << Thread.new {
          begin
            worker = Worker.new(data, config)
          rescue OutOfTargetsError
          rescue Exception => e
            puts "An unexpected exception was caught: #{e.class} => #{e}"
            sleep 5
          end
        }
      end

      # Make sure the parent process doesn't spin too much
      sleep 1

      # Join finished threads
      finished_threads = threads.reject &:status
      threads -= finished_threads
      finished_threads.each &:join
    end

    data.stop_service
    EM.stop
  end
end

I ran into the same problem and solved it with the following snippet:

# Wait for all threads (other than the current thread and
# main thread) to stop running.
# Assumes that no new threads are started while waiting
def join_all
  main     = Thread.main       # The main thread
  current  = Thread.current    # The current thread
  all      = Thread.list       # All threads still running
  # Now call join on each thread
  all.each{|t| t.join unless t == current or t == main }
end

Source: The Ruby Programming Language, O'Reilly (2008)
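If the goal is to see where a thread is stuck rather than only wait for it, a variant of the same idea may help. This is a sketch, not part of the book's snippet: join_with_report is a hypothetical helper, and the 0.2-second limit is arbitrary. It relies on Thread#join returning nil when its time limit expires, and on Thread#backtrace to show what the unfinished thread is doing.

```ruby
# Joins each thread with a time limit and returns the threads that did
# not finish in time, printing where each of them currently is.
def join_with_report(threads, limit)
  threads.reject do |t|
    finished = t.join(limit) # nil when the limit expires first
    unless finished
      puts "Thread #{t.inspect} is still running at:"
      puts((t.backtrace || []).first(5))
    end
    finished
  end
end

quick = Thread.new { :done }
stuck = Thread.new { sleep } # stand-in for a blocked worker

pending = join_with_report([quick, stuck], 0.2)
puts "#{pending.size} thread(s) did not finish"

# Clean up the demo thread so the script can exit.
pending.each(&:kill)
```

Because the join is bounded, the parent never blocks forever on a single thread, so the interpreter's "deadlock detected" check is not triggered while you inspect the backtraces.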

Hey man, any luck with this approach? Did you try making the threads exit before the end of the block?