Linux program terminates but does not return to bash


Recently I asked a question, but it was downvoted because people found it unclear. Since then I have found a hint that is worth digging into.

There is a command-line program named `fluent`. On Rocks, when I run it on the front end and type `exit`, it returns to the command prompt:

        5991 nodes, binary.
        5991 node flags, binary.
    Done.

    > exit
    mahmood@cluster:~$
However, when I run the same command on a compute node via `ssh` (the application resides on an NFS drive), it does not return to the command prompt:

        5991 nodes, binary.
        5991 node flags, binary.
    Done.

    > exit
    ^C^C^Z
    [1]+  Stopped                 /share/apps/fluent/bin/fluent 3d -g -t4 -i elbow.journal
    mahmood@compute-0-3:~$ pkill fluent*
    mahmood@compute-0-3:~$ fg
    /share/apps/fluent/bin/fluent 3d -g -t4 -i elbow.journal
    Terminated
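As suggested, I tried `strace` and attached it to the process several times, because the application runs on multiple cores. In one of those attempts the application did return to the terminal. I noticed that the `futex` results in the last lines of the `strace` output differ between the two cases.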
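Attaching by hand several times can possibly be avoided. The following is a minimal sketch, not a tested recipe for `fluent`: `build_strace_cmd` and the output prefix `fluent-trace` are hypothetical names, and the PIDs are placeholders. It relies only on documented `strace` options: `-p` may be repeated, `-f` follows child processes (and, combined with `-p`, attaches to all threads of a multi-threaded process), `-ff` with `-o` writes one file per traced process, and `-tt` adds timestamps.

```python
import shlex


def build_strace_cmd(pids, out_prefix="fluent-trace"):
    """Return an strace argv that attaches to every PID in `pids`.

    `out_prefix` is a hypothetical file prefix; with -ff, strace writes
    one file per traced process, named <out_prefix>.<pid>.
    """
    if not pids:
        raise ValueError("need at least one PID")
    # -f: follow forks/threads, -ff: one output file per process,
    # -tt: timestamps with microseconds, -o: output file prefix
    cmd = ["strace", "-f", "-ff", "-tt", "-o", out_prefix]
    for pid in pids:
        cmd += ["-p", str(pid)]  # -p can be given more than once
    return cmd


# Placeholder PIDs; in practice they would come from e.g. `pgrep`.
print(shlex.join(build_strace_cmd([1001, 1002])))
# prints: strace -f -ff -tt -o fluent-trace -p 1001 -p 1002
```

The printed command can be pasted into the shell on the compute node, or the list can be passed to `subprocess.run` directly.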

In the correct run, I see:

    socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 12
    setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    setsockopt(12, SOL_TCP, TCP_NODELAY, [1], 4) = 0
    setsockopt(12, SOL_SOCKET, SO_SNDBUF, [65536], 4) = 0
    setsockopt(12, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
    fcntl(12, F_GETFL)                      = 0x2 (flags O_RDWR)
    fcntl(12, F_SETFL, O_RDWR)              = 0
    connect(12, {sa_family=AF_INET, sin_port=htons(45470), sin_addr=inet_addr("10.10.10.251")}, 16) = 0
    write(12, "12345\0", 6)                 = 6
    write(12, "15  NORMAL_EXITING\0", 19)   = 19
    read(12, "\0", 1)                       = 1
    close(12)                               = 0
    futex(0x2b66afe5d9d0, FUTEX_WAIT, 12432, NULL) = 0
    futex(0x2b66afc5c9d0, FUTEX_WAIT, 12427, NULL) = 0
    close(6)                                = 0
    close(7)                                = 0
    close(8)                                = 0
    close(9)                                = 0
    close(10)                               = 0
    shmdt(0x2b66af7d8000)                   = 0
    shmdt(0x2b66b0018000)                   = 0
    shmdt(0x2b66af3a8000)                   = 0
    shmdt(0x2b66af638000)                   = 0
    shmdt(0x2b66af758000)                   = 0
    shmdt(0x2b66aff78000)                   = 0
    shmdt(0x2b66af6d8000)                   = 0
    shmdt(0x2b66afed8000)                   = 0
    close(4)                                = 0
    close(5)                                = 0
    exit_group(0)                           = ?
    Process 12420 detached
In the buggy run, I see:

    socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 9
    setsockopt(9, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    setsockopt(9, SOL_TCP, TCP_NODELAY, [1], 4) = 0
    setsockopt(9, SOL_SOCKET, SO_SNDBUF, [65536], 4) = 0
    setsockopt(9, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
    fcntl(9, F_GETFL)                       = 0x2 (flags O_RDWR)
    fcntl(9, F_SETFL, O_RDWR)               = 0
    connect(9, {sa_family=AF_INET, sin_port=htons(50825), sin_addr=inet_addr("10.10.10.251")}, 16) = 0
    write(9, "12345\0", 6)                  = 6
    write(9, "15  NORMAL_EXITING\0", 19)    = 19
    read(9, "\0", 1)                        = 1
    close(9)                                = 0
    futex(0x2b74f03659d0, FUTEX_WAIT, 13135, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0x2b74f01649d0, FUTEX_WAIT, 13132, NULL) = 0
    close(6)                                = 0
    close(7)                                = 0
    shmdt(0x2b74efce0000)                   = 0
    shmdt(0x2b74f03e0000)                   = 0
    shmdt(0x2b74efbe0000)                   = 0
    shmdt(0x2b74f0480000)                   = 0
    shmdt(0x2b74ef8b0000)                   = 0
    shmdt(0x2b74efb40000)                   = 0
    shmdt(0x2b74efc60000)                   = 0
    shmdt(0x2b74f0520000)                   = 0
    close(4)                                = 0
    close(5)                                = 0
    exit_group(0)                           = ?
    Process 13129 detached
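As you can see, both traces end with `exit_group(0)`, but in the buggy run the first `futex` call returns `EAGAIN` (Resource temporarily unavailable).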
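The two traces can also be compared mechanically rather than by eye. The sketch below is only illustrative: `parse_trace`, `closed_fds`, and `failed_calls` are hypothetical helper names, the regular expression covers just the line shapes that appear in the excerpts above, and the two strings are abridged copies of those excerpts.

```python
import re

# Matches strace lines such as:
#   close(12)                               = 0
#   futex(0x..., FUTEX_WAIT, 13135, NULL) = -1 EAGAIN (Resource ...)
LINE_RE = re.compile(
    r"^\s*(?P<name>\w+)\((?P<args>.*)\)\s*=\s*"
    r"(?P<ret>0x[0-9a-f]+|-?\d+|\?)(?:\s+(?P<errno>E[A-Z]+))?"
)


def parse_trace(text):
    """Return (name, args, ret, errno) tuples for each syscall line."""
    calls = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # lines like "Process 12420 detached" are skipped
            calls.append((m["name"], m["args"], m["ret"], m["errno"]))
    return calls


def closed_fds(calls):
    """File descriptors passed to close() calls that returned 0."""
    return [int(a) for n, a, r, _ in calls if n == "close" and r == "0"]


def failed_calls(calls):
    """(syscall, errno) pairs for calls that reported an error."""
    return [(n, e) for n, _, _, e in calls if e is not None]


# Abridged from the two excerpts in this post.
GOOD = """
    close(12)                               = 0
    futex(0x2b66afe5d9d0, FUTEX_WAIT, 12432, NULL) = 0
    futex(0x2b66afc5c9d0, FUTEX_WAIT, 12427, NULL) = 0
    close(6)                                = 0
    close(7)                                = 0
    close(8)                                = 0
    close(9)                                = 0
    close(10)                               = 0
    close(4)                                = 0
    close(5)                                = 0
    exit_group(0)                           = ?
    Process 12420 detached
"""
BUGGY = """
    close(9)                                = 0
    futex(0x2b74f03659d0, FUTEX_WAIT, 13135, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    futex(0x2b74f01649d0, FUTEX_WAIT, 13132, NULL) = 0
    close(6)                                = 0
    close(7)                                = 0
    close(4)                                = 0
    close(5)                                = 0
    exit_group(0)                           = ?
    Process 13129 detached
"""

good, buggy = parse_trace(GOOD), parse_trace(BUGGY)
print(closed_fds(good))     # [12, 6, 7, 8, 9, 10, 4, 5]
print(closed_fds(buggy))    # [9, 6, 7, 4, 5]
print(failed_calls(good))   # []
print(failed_calls(buggy))  # [('futex', 'EAGAIN')]
```

On these excerpts, the correct run closes descriptors 6 through 10 after the `futex` waits, while the buggy run closes only 6 and 7, and the buggy run's only failing call is the first `futex` wait. According to the futex(2) manual page, `FUTEX_WAIT` returns `EAGAIN` when the value at the futex address no longer equals the expected value at the time of the call, so that return code by itself does not necessarily show where the process gets stuck.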


Any ideas on this?

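The second example does not seem to close all of its sockets and may be locking up the process (note the differences in the `close(n)` calls). Maybe try running the process in a `screen` session…

I found this topic, which is related to mine. The answer there mentions a timeout parameter, but I don't know how to set it. Is it possible to change a kernel parameter for that? I don't have access to the source code of the binary.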