Wrong cstime in the Linux /proc/pid/stat file


The stime or cstime value in the /proc/pid/stat file is so large that it makes no sense. Only some processes show the wrong value, and only occasionally. For example:

# ps -eo pid,ppid,stime,etime,time,%cpu,%mem,command |grep nsc
4815     1 Jan08  1-01:20:02 213503-23:34:33 20226149  0.1 /usr/sbin/nscd
#
# cat /proc/4815/stat
4815 (nscd) S 1 4815 4815 0 -1 4202560 2904 0 0 0 21 1844674407359 0 0 20 0 9 0 4021 241668096 326 18446744073709551615 139782748139520 139782748261700 140737353849984 140737353844496 139782734487251 0 0 3674112 16390 18446744073709551615 0 0 17 1 0 0 0 0 0
As you can see, the stime of process 4815 (nscd) is 1844674407359 clock ticks, which ps reports as 213503-23:34:33 of CPU time, yet the process has only been running for 1-01:20:02.
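The relationship between the raw tick count and the TIME column of ps can be checked by hand. The sketch below assumes 100 clock ticks per second (consistent with 21 clocks being 0.21 s on this machine) and that ps sums utime and stime; the function name `ticks_to_ps_time` is made up for illustration.

```python
# Assumption: 100 clock ticks per second, the usual USER_HZ on x86 Linux.
USER_HZ = 100

def ticks_to_ps_time(ticks, hz=USER_HZ):
    """Format a tick count the way ps prints cumulative CPU time: [DD-]HH:MM:SS."""
    seconds = ticks // hz
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    hms = f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"{days}-{hms}" if days else hms

# utime and stime of process 4815, taken from the stat line above.
utime, stime = 21, 1844674407359
print(ticks_to_ps_time(utime + stime))  # 213503-23:34:33
```

The result matches the TIME column in the ps output, so ps is reporting the kernel's value faithfully; the bogus number originates in /proc/pid/stat itself.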

Another problem process, this one with a wrong cstime, is shown below:

A bash forks an sh, which in turn forks a sleep:

8155 (bash) S 3124 8155 8155 0 -1 4202752 1277 6738 0 0 3 0 4 1844674407368 20 0 1 0 1738175 13258752 451 18446744073709551615 4194304 4757932 140736528897536 140736528896544 47722675403157 0 65536 4100 65538 18446744071562341410 0 0 17 5 0 0 0 0 0

8184 (sh) S 8155 8155 8155 0 -1 4202496 475 0 0 0 0 0 0 0 20 0 1 0 1738185 11698176 357 18446744073709551615 4194304 4757932 140733266239472 140733266238480 47964680542613 0 65536 4100 65538 18446744071562341410 0 0 17 6 0 0 0 0 0

8185 (sleep) S 8184 8155 8155 0 -1 4202496 261 0 0 0 0 0 0 0 20 0 1 0 1738186 8577024 177 18446744073709551615 4194304 4212204 140734101195248 140734101194776 48002231427168 0 0 4096 0 0 0 0 17 7 0 0 0 0 0
So you can see that the cstime of the bash process is 1844674407368, which is far larger than the total CPU time of its children.
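Picking the four CPU-time fields out of a stat line by eye is error-prone, so here is a minimal sketch that extracts them. It assumes the field order documented in proc(5), where utime, stime, cutime and cstime are fields 14 to 17; the helper name `parse_cpu_times` is hypothetical, and the sample line is the bash line from above cut short after the fields of interest.

```python
def parse_cpu_times(stat_line):
    """Return utime, stime, cutime, cstime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is parenthesised and may contain spaces, so split
    # after the last ')' rather than splitting the whole line on whitespace.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N of proc(5) sits at rest[N - 3].
    names = ("utime", "stime", "cutime", "cstime")  # fields 14 to 17
    return {name: int(rest[11 + i]) for i, name in enumerate(names)}

# Beginning of the stat line of process 8155 (bash), truncated for brevity.
bash_line = ("8155 (bash) S 3124 8155 8155 0 -1 4202752 1277 6738 0 0 "
             "3 0 4 1844674407368 20 0 1 0 1738175")
print(parse_cpu_times(bash_line))
```

For this line the sketch reports utime 3, stime 0, cutime 4 and cstime 1844674407368, confirming that the oversized value sits in the cstime position.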

My server has an Intel(R) Xeon(R) CPU E5620 @ 2.40GHz, with 4 cores and 8 threads. The operating system is SUSE Linux Enterprise Server 11 SP1 x86_64, as shown below:

# lsb_release  -a
LSB Version:    core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
Distributor ID: SUSE LINUX
Description:    SUSE Linux Enterprise Server 11 (x86_64)
Release:    11
Codename:   n/a
#
# uname -a
Linux node2 2.6.32.12-0.7-xen #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux

So is this a kernel problem? Can anyone help fix it?

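I suspect you are simply seeing a kernel bug. Update to the latest SLES update kernel available (something like 2.6.32.42 or later) and see whether the problem persists. By the way, it is stime, not cstime, that is abnormally high. In fact, if you look closely, the value looks like a string-truncated 18446744073709551615 (2^64-1), plus or minus a few clock ticks of offset.

The fields of the stat line for process 4815 decode as follows: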

pid_nr: 4815
tcomm: (nscd)
state: S
ppid: 1
pgid: 4815
sid: 4815
tty_nr: 0
tty_pgrp: -1
task_flags: 4202560 / 0x402040
min_flt: 2904
cmin_flt: 0
max_flt: 0
cmax_flt: 0
utime: 21 clocks (= 21 clocks) (= 0.210000 s)
stime: 1844674407359 clocks (= 1844674407359 clocks) (= 18446744073.590000 s)
cutime: 0 clocks (= 0 clocks) (= 0.000000 s)
cstime: 0 clocks (= 0 clocks) (= 0.000000 s)
priority: 20
nice: 0
num_threads: 9
always-zero: 0
start_time: 4021
vsize: 241668096
get_mm_rss: 326
rsslim: 18446744073709551615 / 0xffffffffffffffff
mm_start_code: 139782748139520 / 0x7f21b50c7000
mm_end_code: 139782748261700 / 0x7f21b50e4d44
mm_start_stack: 140737353849984 / 0x7ffff7fb9c80
esp: 140737353844496 / 0x7ffff7fb8710
eip: 139782734487251 / 0x7f21b43c1ed3
obsolete-pending-signals: 0 / 0x0
obsolete-blocked-signals: 0 / 0x0
obsolete-sigign: 3674112 / 0x381000
obsolete-sigcatch: 16390 / 0x4006
wchan: 18446744073709551615 / 0xffffffffffffffff
always-zero: 0
always-zero: 0
task_exit_signal: 17
task_cpu: 1
task_rt_priority: 0
task_policy: 0
delayacct_blkio_ticks: 0
gtime: 0 clocks (= 0 clocks) (= 0.000000 s)
cgtime: 0 clocks (= 0 clocks) (= 0.000000 s)
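
The resemblance to 2^64-1 mentioned in this answer can be made concrete with a little arithmetic. The sketch below compares each suspicious value with the leading digits of 18446744073709551615; the helper name `compare_with_u64_max` is made up for illustration.

```python
U64_MAX = 2**64 - 1  # 18446744073709551615

def compare_with_u64_max(ticks):
    """Return (prefix, offset): the leading digits of 2^64-1, and ticks minus that prefix."""
    # Keep as many leading digits of 2^64-1 as the tick value has digits.
    prefix = int(str(U64_MAX)[:len(str(ticks))])
    return prefix, ticks - prefix

for label, value in (("stime of 4815", 1844674407359),
                     ("cstime of 8155", 1844674407368)):
    prefix, offset = compare_with_u64_max(value)
    print(f"{label}: {value} = {prefix} {offset:+d}")
```

Both values land just below 1844674407370, by 11 and 2 ticks respectively. Dropping the last seven digits is the same as integer division by 10^7, so one possible reading is that a slightly negative counter wrapped around as an unsigned 64-bit number before being scaled down; that is an assumption, and nothing in this thread confirms how the kernel arrives at the value.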

Thanks @jørgensen. Yes, the 1844674407359 for the first process, 4815, is stime, but I have also noticed an abnormally high cstime.