Linux: if the device is ext4, how do I map block numbers from the dmesg output of vm.block_dump=1 to file names?


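tl;dr:
When vm.block_dump=1, I want to see/get the file names that the block numbers reported in dmesg belong to. Sample dmesg line:
bash(13515): READ block 5434824 on xvda3 (32 sectors)

When you run, for example,
sudo sysctl -w vm.block_dump=1
or
echo '1' | sudo tee /proc/sys/vm/block_dump
then "Linux reports all disk read and write operations that take place, and all block dirtyings done to files. [...] The output of block_dump is written to the kernel output, and it can be retrieved using "dmesg". When you use block_dump and your kernel logging level also includes kernel debugging messages, you probably want to turn off klogd, otherwise the output of block_dump will be logged, causing disk activity that would not normally be there." (quoted from the documentation)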

The "block dirtyings" are not a problem. For example, for this line:

[ 3140.559675] systemd-journal(291): dirtied inode 399135 (system.journal) on xvda3
I can see its file name like this:

$ echo -e 'open /dev/xvda3\n ncheck 399135' | sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs:  ncheck 399135
Inode   Pathname
399135  /var/log/journal/12c5e521101c444594b96b53751551a8/system.journal
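The problem is with "Linux reports all disk read and write operations that take place" (quoted above), because those are reported as block numbers, e.g.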

[ 3140.376827] kworker/u24:3(21616): WRITE block 11037768 on xvda3 (8 sectors)
[ 3140.724725] bash(13515): READ block 5434824 on xvda3 (32 sectors)
[ 3140.725483] date(13515): READ block 5434896 on xvda3 (160 sectors)
[ 3140.728946] sed(13519): READ block 5143680 on xvda3 (32 sectors)
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
[ 3140.804803] qubes-gui(522): READ block 5179952 on xvda3 (16 sectors)
[ 3140.806519] Xorg(599): READ block 7420192 on xvda3 (176 sectors)
[ 3140.810348] InputThread(613): READ block 7418560 on xvda3 (112 sectors)
[ 3140.815866] at-spi2-registr(812): READ block 5654512 on xvda3 (8 sectors)
[ 3140.816860] xdg-desktop-por(888): READ block 5795168 on xvda3 (8 sectors)
[ 3140.818716] gnome-terminal-(865): READ block 5804672 on xvda3 (16 sectors)
[ 3141.064524] sed(13531): READ block 3446048 on xvda3 (16 sectors)
[ 3141.130808] systemd(571): READ block 4744136 on xvda3 (184 sectors)
The kernel code responsible for printing such messages can be seen here:

None of those blocks yields an inode number when looked up like this:

$ echo -e 'open /dev/xvda3\n icheck 11037768' |sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs:  icheck 11037768
Block   Inode number
11037768    <block not found>
Another way is to list the blocks of a known file:

$ filefrag -s -v /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py
Filesystem type is: ef53
File size of /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py is 22179 (6 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       5:    2172716..   2172721:      6:             last,eof
/usr/lib/python2.7/site-packages/salt/modules/zonecfg.py: 1 extent found
The question now remains: how do the device block numbers (the ones reported in dmesg for xvda3) relate to these physical_offset values?

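EDIT2: I have just confirmed that these physical_offset numbers are block numbers on the block device (just, evidently, not the same numbers as the ones reported in dmesg). The following shows the last block of the file above, and I can confirm it matches what I see when viewing the file with vim: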

$ sudo dd bs=4096 skip=2172721 count=1 if=/dev/xvda3 | hexdump -C

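I've tested this under kernel 4.18.5 in a Qubes OS R4.0 Fedora 28 AppVM. (If needed, I can recompile a custom kernel with a custom .config and patches; suggestions are welcome.)

The block numbers reported in dmesg when vm.block_dump=1 are sector numbers, that is, 512 bytes per block, whereas the block numbers that debugfs expects for its icheck command are usually 4096 bytes per block on ext4, so all that is needed is a division by 8.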

Although the block size is usually 4096 bytes, you have to make sure, so get the block size like this:

$ sudo blockdev --getbsz /dev/xvda3
4096
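Although this value is usually 4096, it can differ in some cases: for example, mkfs.ext4 with -b can create an ext4 filesystem with a different block size, and vfat has a block size of 512 bytes.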
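
One caveat worth checking: `blockdev --getbsz` reports the block size the kernel currently uses for the block device, which need not equal the ext4 block size chosen at mkfs time. The filesystem's own value is stored in the superblock, and `dumpe2fs -h` prints it on a `Block size:` line. A minimal sketch for pulling that number out follows; the function name `fs_block_size` is made up for illustration.

```bash
# Extract the filesystem block size from `dumpe2fs -h` output read on stdin.
# Prints the number of bytes per block; returns non-zero if no such line exists.
fs_block_size() {
  awk -F': *' '
    $1 == "Block size" { print $2; found = 1; exit }
    END { exit !found }
  '
}
```

Typical use, assuming the same device as above: `sudo dumpe2fs -h /dev/xvda3 2>/dev/null | fs_block_size`. If this prints something other than what `blockdev --getbsz` reports, the divider should be derived from this value.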

The divider can be computed from the block size in one go in bash: it is the block size divided by the 512-byte sector size.
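
As a rough sketch of that computation, the helper below (a hypothetical name, `sectors_per_block`) divides a block size by 512 and refuses values that are not a positive multiple of 512. The 512-byte sector size is the assumption this whole answer rests on.

```bash
# Print how many 512-byte sectors make up one block of the given size.
# Returns 1 for empty, non-numeric, zero, or non-multiple-of-512 input.
sectors_per_block() {
  local blocksize="$1"
  local sector_bytes=512   # assumption: dmesg block numbers count 512-byte sectors
  case "$blocksize" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$blocksize" -gt 0 ] && [ "$(( blocksize % sector_bytes ))" -eq 0 ] || return 1
  echo "$(( blocksize / sector_bytes ))"
}
```

For example, `sectors_per_block "$(sudo blockdev --getbsz /dev/xvda3)"` would give the divider for the device used in this answer.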

Let's use this line as input:
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
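
To avoid copying numbers by hand, a line like this can be split into its fields first. The following is only a sketch: the function name `parse_block_dump_line` and the output order are my own choices, and the pattern is based solely on the sample lines shown in the question.

```bash
# Parse one vm.block_dump line from dmesg.
# Prints: <op> <device> <sector> <count> <comm> <pid>
# Returns 1 when the line is not a READ/WRITE block line (e.g. "dirtied inode").
parse_block_dump_line() {
  local fields
  fields=$(printf '%s\n' "$1" | sed -nE \
    's/^(.* )?([^ ]+)\(([0-9]+)\): (READ|WRITE) block ([0-9]+) on ([[:alnum:]]+) \(([0-9]+) sectors\).*$/\4 \6 \5 \7 \2 \3/p')
  [ -n "$fields" ] || return 1
  printf '%s\n' "$fields"
}
```

The third printed field is the sector number that the next step divides by 8.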

Get the block number for icheck by dividing the above value by 8:

$ bc -l
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'. 
5379184/8
672398.00000000000000000000
I used bc -l to make sure I hadn't mistyped the number (if the result ends in .00000000000000, I most likely hadn't).
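
The same sanity check can be done without bc by testing the remainder in shell arithmetic. This is a sketch under the same 512-bytes-per-sector assumption; `sector_to_block` is a hypothetical helper name, and its divider argument defaults to 8.

```bash
# Convert a dmesg sector number to a filesystem block number.
# Fails (status 1) when the sector is not a multiple of the divider,
# which usually points to a mistyped number or a wrong divider.
sector_to_block() {
  local sector="$1" divider="${2:-8}" n
  for n in "$sector" "$divider"; do
    case "$n" in
      ''|*[!0-9]*) return 1 ;;
    esac
  done
  [ "$divider" -gt 0 ] || return 1
  [ "$(( sector % divider ))" -eq 0 ] || return 1
  echo "$(( sector / divider ))"
}
```

`sector_to_block 5379184 8` prints 672398, matching the bc result above.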

Get the path name using debugfs:

$ sudo debugfs
debugfs 1.44.2 (14-May-2018)
debugfs:  open /dev/xvda3
debugfs:  icheck 672398
Block   Inode number
672398  134571
debugfs:  ncheck 134571
Inode   Pathname
134571  /usr/bin/sleep
debugfs:  close
debugfs:  quit
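
As a cross-check in the other direction, a physical_offset reported by filefrag can be turned into the sector range that dmesg would be expected to mention. The helper below is a sketch with a made-up name (`blocks_to_sectors`) and assumes a divider of 8 unless told otherwise.

```bash
# Print the first sector and the sector count covered by a run of filesystem blocks.
# Arguments: <first_block> <number_of_blocks> [divider, default 8]
blocks_to_sectors() {
  local first_block="$1" nblocks="$2" divider="${3:-8}"
  echo "$(( first_block * divider )) $(( nblocks * divider ))"
}
```

For the zonecfg.py extent from the question (blocks 2172716..2172721, 6 blocks), `blocks_to_sectors 2172716 6` prints `17381728 48`, so a read of that file should fall within sectors 17381728 to 17381775. An actual read may cover only part of that range or extend beyond it due to readahead.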
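I'm not sure whether the block numbers reported in dmesg can ever be in units other than 512-byte sectors. For example, are they still 512 bytes per block if the underlying disk uses a different sector size? If I had to guess, I'd say it is safe to assume they are always 512 bytes per block.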

Here is a script version of the above:

#!/bin/bash

#./showblock rev.02 rewritten for question/answer from: https://stackoverflow.com/q/52058914/10239615

#----
bytes_per_sector=512 #assumed that dmesg block numbers are 512 bytes each (ie. 512 bytes per sector; aka block size is 512)!
#----

#use `sudo` only when not already root
if test "`id -u`" != "0"; then
    sudo='sudo'
else
    sudo=''
fi

if ! test "$#" -ge "2"; then
  echo "Usage: $0 <device> <dmesgblocknumber> [dmesgblocknumber ...]"
  echo "Examples:"
  echo "$0 /dev/xvda3 5379184"
  echo "$0 /dev/xvda3 5379184 5129952 7420192"
  exit 1
fi

within_exit() {
  echo -e "\nSkipped current instruction within on_exit()"
}
on_exit() {
  #trap - EXIT SIGINT SIGQUIT SIGHUP  #will exit by skipping the rest of all instructions from on_exit() eg. if C-c
  trap within_exit EXIT SIGINT SIGQUIT SIGHUP #skip only current instruction from on_exit() eg. when C-c is pressed
  #echo "first sleep"
  #sleep 10
  #echo "second sleep"
  #sleep 10
  if test "${#remaining_args[@]}" -gt 0; then
    echo -n "WARNING: There are '${#remaining_args[@]}' remaining args not processed, they are: " >&2
    for i in `seq 0 1 "$(( "${#remaining_args[@]}" - 1 ))"`; do  #seq is part of coreutils package
      echo -n "'${remaining_args[${i}]}' " >&2
    done
    echo >&2
  fi
}

trap on_exit EXIT SIGINT SIGQUIT SIGHUP

dev="$1"
shift 1

if test -z "$dev" -o ! -b "$dev"; then
  echo "Bad device name or not a device: '$dev'" >&2
  exit 1
fi

blocksize="`$sudo blockdev --getbsz "$dev"`"
if test "${blocksize:-0}" -le "0"; then #handles empty arg too
  echo "Failed getting block size for '$dev', got '$blocksize'" >&2
  exit 1
fi
#TODO: check and fail if not a multiplier
divider="$(( $blocksize / $bytes_per_sector ))"
if ! test "${divider:-0}" -gt "0"; then
  echo "Failed computing divider from: '$blocksize' / '$bytes_per_sector', got '$divider'" >&2
  exit 1
fi

# for each passed-in dmesg block number do
while test "$#" -gt "0"; do
dmesgblock="$1"
shift
remaining_args=("$@") #for on_exit() above
echo '--------'
echo "Passed-in dmesg block($bytes_per_sector bytes per block) number: '$dmesgblock'"
#have to handle the case when $dmesgblock is empty and when it's negative eg. "-1" so using a default value(of 0) when unset in the below 'if' block will help not break the 'test' expecting an integer while also allowing negative numbers ("0$dmesgblock" would otherwise yield "0-1" a non-integer):
if test "${dmesgblock:-0}" -le "0"; then
  echo "Bad passed-in dmesg block number: '$dmesgblock'" >&2
  exit 1
fi

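#fail if the dmesg block number is not a multiple of the divider (sectors per filesystem block)
if test "$(( dmesgblock % divider ))" -ne "0"; then
  echo "dmesg block number '$dmesgblock' is not a multiple of '$divider'" >&2
  exit 1
fi
#use the computed divider rather than a hard-coded 8, so non-4096 block sizes work too
block=$(( dmesgblock / divider ))
if ! test "${block:--1}" -ge "0"; then
  echo "Badly computed device block number: '$block'" >&2
  exit 1
fi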

echo "Actual block number(of $blocksize bytes per block): $block"
inode="$(echo "open ${dev}"$'\n'"icheck ${block}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2 -d$'\t')"
if test "<block not found>" == "$inode"; then
  echo "No inode was found for the provided dmesg block number '$dmesgblock' which mapped to dev block number '$block'" >&2
  exit 1
else
    #assuming number TODO: check for this!
    echo "Found inode: $inode"
    fpath="$(echo "open ${dev}"$'\n'"ncheck ${inode}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2- -d$'\t')"
  #fpath always begins with '/' right?
    if test "$fpath" != "Pathname"; then
        echo "Found path : $fpath"
    else
        echo "not found"
    fi
fi
done
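
The script above picks the inode out of the debugfs output by line position (tail, head, cut). A slightly more defensive variant, sketched here, matches the requested block number instead. The function name `inode_from_icheck` is hypothetical, and the expected input format is taken from the debugfs sessions shown earlier.

```bash
# Read `debugfs icheck` output on stdin and print the inode owning the given block.
# Returns non-zero when the block is absent or reported as "<block not found>".
inode_from_icheck() {
  local block="$1"
  awk -v b="$block" '
    $1 == b && $2 ~ /^[0-9]+$/ { print $2; found = 1; exit }
    END { exit !found }
  '
}
```

It could be fed with debugfs's `-R` option, which runs a single request: `sudo debugfs -R "icheck 672398" /dev/xvda3 2>/dev/null | inode_from_icheck 672398`.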
Here is a script that processes an entire dmesg log:

#!/bin/bash

#./showallblocks rev.01 rewritten for question/answer from: https://stackoverflow.com/q/52058914/10239615

if test "`id -u`" != "0"; then
    sudo='sudo'
else
    sudo=''
fi

dmesglog="$1"
if test -z "$dmesglog"; then
  echo "Usage: '$0' <dmesglogfile>"
  echo "Examples:"
  echo "sudo dmesg > dmesg1.log && '$0' dmesg1.log"
  echo "'$0' <(sudo dmesg)"
  #Note: '$0' used for the case when $0 has spaces or other things in its path names, and user wants to copy paste, for whatever reason, the output of the above into the command line.
  exit 1
fi

#(optional) Stop logging if already in progress:
$sudo sysctl -w vm.block_dump=0

#Using the answer from here(thanks to glenn jackman): https://unix.stackexchange.com/a/467377/306023
#grep --color=never -E -- 'READ block [0-9]+ on xvda3' "$dmesglog" |
#cat "$dmesglog" |
$sudo perl -pe '
if (! /READ block [0-9]+ on [A-Za-z0-9]+ .*$/) {
  s{.*}{}s
}

s{(READ block) (\d+) (on) ([A-Za-z0-9]+) ([^\$]*)\n$}
{join " ",$1, $2, $3, $4, $5, qx(./showblock "/dev/$4" "$2" | grep -F -- "Found path :" | cut -f4- -d" ")}es
' -- "$dmesglog"
#Note: the output of qx(...) above is purposely allowed to have trailing newline! (I did wonder if purposely is correct here or it should be purposefully, https://www.merriam-webster.com/words-at-play/purposely-purposefully-usage )
#To find out what "}es"(above) is, see perlre modifiers: https://perldoc.perl.org/perlre.html#Modifiers
#FIXME: noobish try to exclude lines not matching the lines that need to be replaced, from output! used 'if' above
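
Since the script above runs ./showblock (and therefore debugfs twice) for every matching line, a long log with many repeated reads can be slow. One possible mitigation, sketched below under the assumption that only the distinct device/sector pairs matter, is to deduplicate them first. `unique_reads` is a made-up helper name.

```bash
# Read a dmesg log on stdin and print each distinct "<device> <sector>" pair
# that appears in a READ line, sorted, one pair per line.
unique_reads() {
  sed -nE 's/.*: READ block ([0-9]+) on ([[:alnum:]]+) \(.*/\2 \1/p' | sort -u
}
```

A possible use: `sudo dmesg | unique_reads | while read -r dev sector; do ./showblock "/dev/$dev" "$sector"; done`.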

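I've just realized that the block size on the device (sudo blockdev --getbsz /dev/xvda3 == 4096 bytes/block) differs from the block size of the ext4 filesystem (in bytes/block): for the rustc process (the Rust compiler), while building Firefox, I get some inodes that map to files that make no sense for rustc to access:
/usr/lib64/guile/2.0/ccache/ice-9/history.go
/usr/lib/python2.7/site-packages/salt/modules/win_lgpo.py
/usr/lib/python2.7/site-packages/salt/modules/zonecfg.py

In the future, you should post your solution as an answer instead of editing it into the question, to follow the site's general style.

@Mikko Thank you, by the way! I found that they already link to it (the documentation) nearby.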