RabbitMQ: What would cause "whereis(user)" to return undefined?


We are running RabbitMQ 3.6.5 on a production system, on

Erlang R16B03-1 (erts-5.10.4) [source] [64-bit] [smp:2:2] [async-threads:10] [hipe] [kernel-poll:false]

Running rabbitmqctl rotate_logs fails with the following:

Erlang VM I/O system is damaged, restart needed

Looking at the code for that version, this error is raised only in rabbit_log.erl:

...

%% Execute Fun using the IO system of the local node (i.e. the node on
%% which the code is executing). Since this is invoked for every log
%% message, we try to avoid unnecessarily churning group_leader/1.
with_local_io(Fun) ->
    GL = group_leader(),
    Node = node(),
    case node(GL) of
        Node -> Fun();
        _    -> set_group_leader_to_user_safely(whereis(user)),
                try
                    Fun()
                after
                    group_leader(GL, self())
                end
    end.

set_group_leader_to_user_safely(undefined) ->
    handle_damaged_io_system();
set_group_leader_to_user_safely(User) when is_pid(User) ->
    group_leader(User, self()).

handle_damaged_io_system() ->
    Msg = "Erlang VM I/O system is damaged, restart needed~n",
    io:format(standard_error, Msg, []),
    exit(erlang_vm_restart_needed).

So it appears that whereis(user) is returning undefined.
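
The two facts that with_local_io/1 inspects before choosing a branch can be looked at directly on the affected node. The module below is a minimal sketch and not part of RabbitMQ; the names io_diag and check/0 are hypothetical. The failing branch is reached only when the group leader lives on another node, which is presumably the case when the call arrives from rabbitmqctl, and whereis(user) is undefined.

```erlang
-module(io_diag).
-export([check/0]).

%% Hypothetical diagnostic helper (an assumption, not RabbitMQ code).
%% Returns the pid registered as 'user' (or undefined) and whether the
%% calling process's group leader lives on the local node.
check() ->
    GL = group_leader(),
    [{user_pid, whereis(user)},
     {group_leader_is_local, node(GL) =:= node()}].
```

If user_pid comes back as undefined on the production node but as a pid on the test node, that would confirm the difference between the two systems.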

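This error does not occur on test systems running the same version, so I have not yet been able to reproduce it on a non-production system where I could try out corrective measures. Because the failure is on a production system, a key goal is to correct it with as little disruption as possible.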

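I would like to understand whether this condition would prevent RabbitMQ from starting again, and whether restarting RabbitMQ will correct the problem.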

The log startup_err contains:

Erlang VM I/O system is damaged, restart needed
=SUPERVISOR REPORT==== 28-Nov-2018::18:45:04 ===
     Supervisor: {<0.26938.6080>,rabbit_channel_sup}
     Context:    shutdown_error
     Reason:     noproc
     Offender:   [{pid,<0.26217.6080>},
                  {name,channel},
                  {mfargs,
                      {rabbit_channel,start_link,
                          [1,<0.27525.6080>,<0.25586.6080>,<0.27525.6080>,
                           <<"40.113.233.192:3979 -> 10.0.0.4:5672">>,
                           rabbit_framing_amqp_0_9_1,
                           {user,<<"FDLMessaging">>,[],
                               [{rabbit_auth_backend_internal,none}]},
                           <<"/">>,
                           [{<<"publisher_confirms">>,bool,true},
                            {<<"exchange_exchange_bindings">>,bool,true},
                            {<<"basic.nack">>,bool,true},
                            {<<"consumer_cancel_notify">>,bool,true},
                            {<<"connection.blocked">>,bool,true},
                            {<<"authentication_failure_close">>,bool,true}],
                           <0.26295.6080>,<0.22812.6080>]}},
                  {restart_type,intrinsic},
                  {shutdown,70000},
                  {child_type,worker}]

The log rabbit@fdlquevm-sasl.log-20180710 contains:

Erlang VM I/O system is damaged, restart needed
=SUPERVISOR REPORT==== 28-Nov-2018::18:45:04 ===
     Supervisor: {<0.26938.6080>,rabbit_channel_sup}
     Context:    shutdown_error
     Reason:     noproc
     Offender:   [{pid,<0.26217.6080>},
                  {name,channel},
                  {mfargs,
                      {rabbit_channel,start_link,
                          [1,<0.27525.6080>,<0.25586.6080>,<0.27525.6080>,
                           <<"40.113.233.192:3979 -> 10.0.0.4:5672">>,
                           rabbit_framing_amqp_0_9_1,
                           {user,<<"FDLMessaging">>,[],
                               [{rabbit_auth_backend_internal,none}]},
                           <<"/">>,
                           [{<<"publisher_confirms">>,bool,true},
                            {<<"exchange_exchange_bindings">>,bool,true},
                            {<<"basic.nack">>,bool,true},
                            {<<"consumer_cancel_notify">>,bool,true},
                            {<<"connection.blocked">>,bool,true},
                            {<<"authentication_failure_close">>,bool,true}],
                           <0.26295.6080>,<0.22812.6080>]}},
                  {restart_type,intrinsic},
                  {shutdown,70000},
                  {child_type,worker}]

What would cause whereis(user) to return undefined?

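The user process is a central process that handles most of the I/O related to standard output. By default it always exists, unless the -nouser argument is passed to the underlying Erlang VM.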
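
Whether the flag was passed can be checked from inside the node. This is a rough sketch: init:get_argument/1 returns {ok, _} when a flag is present on the emulator command line and error otherwise. The module and function names are made up for illustration.

```erlang
-module(user_flag).
-export([nouser_given/0]).

%% Hypothetical helper: true if the VM was started with -nouser,
%% in which case no process is ever registered as 'user'.
nouser_given() ->
    case init:get_argument(nouser) of
        {ok, _} -> true;
        error   -> false
    end.
```

If this returns false while whereis(user) is undefined, the process was started and has since died, which points at the second explanation.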

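The other way this process can go missing is if it gets killed: by another rogue process, by the VM doing something like killing a process that uses too much memory, or because it still references code that has been hot-loaded too many times and has to be purged.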


Unless someone or something goes poking at that code, this should essentially never happen.
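
To find out why the process dies on a system where it is still alive, one option is to monitor it and record the exit reason. The following is only a sketch under assumptions: user_watch is a hypothetical module, and it reports through error_logger, which may or may not reach a file depending on which handlers are installed. If nothing is registered as user when the monitor is set up, the 'DOWN' message arrives immediately with reason noproc.

```erlang
-module(user_watch).
-export([start/0]).

%% Hypothetical watcher (an assumption, not RabbitMQ code): spawns a
%% process that monitors whatever is registered as 'user'.
start() ->
    spawn(fun watch/0).

watch() ->
    Ref = erlang:monitor(process, user),
    receive
        {'DOWN', Ref, process, _Object, Reason} ->
            %% Reason is noproc if 'user' was not registered at all,
            %% otherwise the exit reason of the user process.
            error_logger:error_msg("user process is down: ~p~n", [Reason])
    end.
```

The watcher exits after the first report; it would need to be started again to observe a restarted user process.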

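What version of Erlang are you using? Is there anything interesting in the RabbitMQ logs? The linked page explains what the user process is and what it does. Note that RabbitMQ 3.7.X versions do not have this code, because logging switched to the lager library. One more thing: this kind of question is best asked on rabbitmq-users, which the RabbitMQ core engineering team monitors.
@LukeBakken - added the Erlang version to the question. Thanks also for the link to rabbitmq-users.
@LukeBakken - the question now includes the log information as well.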