Kubernetes Rook v1.2 does not add labels to the CRUSH map


I am currently creating a Ceph cluster with Rook v1.2.2 on a Kubernetes cluster (v1.16.3), but I cannot get a rack level added to the CRUSH map.

I would like to go from this:

ID CLASS WEIGHT  TYPE NAME
-1       0.02737 root default
-3       0.01369     host test-w1
 0   hdd 0.01369         osd.0
-5       0.01369     host test-w2
 1   hdd 0.01369         osd.1
To something like this:

ID CLASS WEIGHT  TYPE NAME                 STATUS REWEIGHT PRI-AFF
-1       0.01358 root default
-5       0.01358     zone zone1
-4       0.01358         rack rack1
-3       0.01358             host mynode
 0   hdd 0.00679                 osd.0         up  1.00000 1.00000
 1   hdd 0.00679                 osd.1         up  1.00000 1.00000
as explained in the official Rook documentation ().

Here are the steps I took:

I have a Kubernetes cluster v1.16.3 with one master (test-m1) and two workers (test-w1 and test-w2). I installed this cluster with Kubespray () using the default configuration.

I labeled the nodes with:

kubectl label node test-w1 topology.rook.io/rack=rack1
kubectl label node test-w2 topology.rook.io/rack=rack2
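
To double-check that the labels really landed on the nodes, something like the following can be used (the -L flag simply prints the label value as an extra column):

kubectl get nodes -L topology.rook.io/rack
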
I added the label role=storage-node and the taint storage-node=true:NoSchedule to force Rook to run only on dedicated storage nodes. Here is the full set of labels and taints of one of the storage nodes (a sketch of the matching CephCluster placement section follows the node description below):

Name:               test-w1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=test-w1
                    kubernetes.io/os=linux
                    role=storage-node
                    topology.rook.io/rack=rack1
Annotations:        csi.volume.kubernetes.io/nodeid: {"rook-ceph.cephfs.csi.ceph.com":"test-w1","rook-ceph.rbd.csi.ceph.com":"test-w1"}
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 29 Jan 2020 03:38:52 +0100
Taints:             storage-node=true:NoSchedule
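
For context, here is a rough sketch (not my exact cluster.yaml) of the placement section of the CephCluster resource that matches this label and taint; it relies on the placement.all field of the CephCluster CRD:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  placement:
    all:
      # Only schedule Rook pods on nodes labeled role=storage-node ...
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - storage-node
      # ... and tolerate the storage-node=true:NoSchedule taint shown above.
      tolerations:
      - key: storage-node
        operator: Equal
        value: "true"
        effect: NoSchedule
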
With this configuration, Rook does not apply the labels to the CRUSH map. If I deploy the toolbox.yml (), exec into it, and run:

ceph osd tree
ceph osd crush tree
I get the following output:

ID CLASS WEIGHT  TYPE NAME
-1       0.02737 root default
-3       0.01369     host test-w1
 0   hdd 0.01369         osd.0
-5       0.01369     host test-w2
 1   hdd 0.01369         osd.1
As you can see, no racks are defined, even though I labeled the nodes correctly.
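
From the toolbox, what CRUSH actually recorded for a given OSD can also be checked directly; a quick check, assuming osd.0:

ceph osd find 0                # shows the host / crush location osd.0 was registered under
ceph osd tree -f json-pretty   # same hierarchy as above in JSON, easy to grep for "rack"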

Surprisingly, the OSD prepare pods do retrieve the rack information, as shown in the first line of the logs below:

$ kubectl logs rook-ceph-osd-prepare-test-w1-7cp4f -n rook-ceph

2020-01-29 09:59:07.272649 I | cephcmd: crush location of osd: root=default host=test-w1 rack=rack1
[couppayy@test-m1 test_local]$ cat preposd.txt
2020-01-29 09:59:07.155656 I | cephcmd: desired devices to configure osds: [{Name: OSDsPerDevice:1 MetadataDevice: DatabaseSizeMB:0 DeviceClass: IsFilter:false IsDevicePathFilter:false}]
2020-01-29 09:59:07.185024 I | rookcmd: starting Rook v1.2.2 with arguments '/rook/rook ceph osd provision'
2020-01-29 09:59:07.185069 I | rookcmd: flag values: --cluster-id=c9ee638a-1d02-4ad9-95c9-cb796f61623a, --data-device-filter=, --data-device-path-filter=, --data-devices=, --data-directories=/var/lib/rook, --encrypted-device=false, --force-format=false, --help=false, --location=, --log-flush-frequency=5s, --log-level=INFO, --metadata-device=, --node-name=test-w1, --operator-image=, --osd-database-size=0, --osd-journal-size=5120, --osd-store=, --osd-wal-size=576, --osds-per-device=1, --pvc-backed-osd=false, --service-account=
2020-01-29 09:59:07.185108 I | op-mon: parsing mon endpoints: a=10.233.35.212:6789
2020-01-29 09:59:07.272603 I | op-osd: CRUSH location=root=default host=test-w1 rack=rack1
2020-01-29 09:59:07.272649 I | cephcmd: crush location of osd: root=default host=test-w1 rack=rack1
2020-01-29 09:59:07.313099 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-01-29 09:59:07.313397 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2020-01-29 09:59:07.322175 I | cephosd: discovering hardware
2020-01-29 09:59:07.322228 I | exec: Running command: lsblk --all --noheadings --list --output KNAME
2020-01-29 09:59:07.365036 I | exec: Running command: lsblk /dev/sda --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.416812 W | inventory: skipping device sda: Failed to complete 'lsblk /dev/sda': exit status 1. lsblk: /dev/sda: not a block device
2020-01-29 09:59:07.416873 I | exec: Running command: lsblk /dev/sda1 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.450851 W | inventory: skipping device sda1: Failed to complete 'lsblk /dev/sda1': exit status 1. lsblk: /dev/sda1: not a block device
2020-01-29 09:59:07.450892 I | exec: Running command: lsblk /dev/sda2 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.457890 W | inventory: skipping device sda2: Failed to complete 'lsblk /dev/sda2': exit status 1. lsblk: /dev/sda2: not a block device
2020-01-29 09:59:07.457934 I | exec: Running command: lsblk /dev/sr0 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.503758 W | inventory: skipping device sr0: Failed to complete 'lsblk /dev/sr0': exit status 1. lsblk: /dev/sr0: not a block device
2020-01-29 09:59:07.503793 I | cephosd: creating and starting the osds
2020-01-29 09:59:07.543504 I | cephosd: configuring osd devices: {"Entries":{}}
2020-01-29 09:59:07.543554 I | exec: Running command: ceph-volume lvm batch --prepare
2020-01-29 09:59:08.906271 I | cephosd: no more devices to configure
2020-01-29 09:59:08.906311 I | exec: Running command: ceph-volume lvm list  --format json
2020-01-29 09:59:10.841568 I | cephosd: 0 ceph-volume osd devices configured on this node
2020-01-29 09:59:10.841595 I | cephosd: devices = []
2020-01-29 09:59:10.847396 I | cephosd: configuring osd dirs: map[/var/lib/rook:-1]
2020-01-29 09:59:10.848011 I | exec: Running command: ceph osd create 652071c9-2cdb-4df9-a20e-813738c4e3f6 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/851021116
2020-01-29 09:59:14.213679 I | cephosd: successfully created OSD 652071c9-2cdb-4df9-a20e-813738c4e3f6 with ID 0
2020-01-29 09:59:14.213744 I | cephosd: osd.0 appears to be new, cleaning the root dir at /var/lib/rook/osd0
2020-01-29 09:59:14.214417 I | cephconfig: writing config file /var/lib/rook/osd0/rook-ceph.config
2020-01-29 09:59:14.214653 I | exec: Running command: ceph auth get-or-create osd.0 -o /var/lib/rook/osd0/keyring osd allow * mon allow profile osd --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format plain
2020-01-29 09:59:17.189996 I | cephosd: Initializing OSD 0 file system at /var/lib/rook/osd0...
2020-01-29 09:59:17.194681 I | exec: Running command: ceph mon getmap --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/298283883
2020-01-29 09:59:20.936868 I | exec: got monmap epoch 1
2020-01-29 09:59:20.937380 I | exec: Running command: ceph-osd --mkfs --id=0 --cluster=rook-ceph --conf=/var/lib/rook/osd0/rook-ceph.config --osd-data=/var/lib/rook/osd0 --osd-uuid=652071c9-2cdb-4df9-a20e-813738c4e3f6 --monmap=/var/lib/rook/osd0/tmp/activate.monmap --keyring=/var/lib/rook/osd0/keyring --osd-journal=/var/lib/rook/osd0/journal
2020-01-29 09:59:21.324912 I | mkfs-osd0: 2020-01-29 09:59:21.323 7fc7e2a8ea80 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to
force use of aio anyway
2020-01-29 09:59:21.386136 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to
force use of aio anyway
2020-01-29 09:59:21.387553 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal do_read_entry(4096): bad header magic
2020-01-29 09:59:21.387585 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal do_read_entry(4096): bad header magic
2020-01-29 09:59:21.450639 I | cephosd: Config file /var/lib/rook/osd0/rook-ceph.config:
[global]
fsid                         = a19423a1-f135-446f-b4d9-f52da10a935f
mon initial members          = a
mon host                     = v1:10.233.35.212:6789
public addr                  = 10.233.95.101
cluster addr                 = 10.233.95.101
mon keyvaluedb               = rocksdb
mon_allow_pool_delete        = true
mon_max_pg_per_osd           = 1000
debug default                = 0
debug rados                  = 0
debug mon                    = 0
debug osd                    = 0
debug bluestore              = 0
debug filestore              = 0
debug journal                = 0
debug leveldb                = 0
filestore_omap_backend       = rocksdb
osd pg bits                  = 11
osd pgp bits                 = 11
osd pool default size        = 1
osd pool default pg num      = 100
osd pool default pgp num     = 100
osd max object name len      = 256
osd max object namespace len = 64
osd objectstore              = filestore
rbd_default_features         = 3
fatal signal handlers        = false

[osd.0]
keyring          = /var/lib/rook/osd0/keyring
osd journal size = 5120

2020-01-29 09:59:21.450723 I | cephosd: completed preparing osd &{ID:0 DataPath:/var/lib/rook/osd0 Config:/var/lib/rook/osd0/rook-ceph.config Cluster:rook-ceph KeyringPath:/var/lib/rook/osd0/keyring UUID:652071c9-2cdb-4df9-a20e-813738c4e3f6 Journal:/var/lib/rook/osd0/journal IsFileStore:true IsDirectory:true DevicePartUUID: CephVolumeInitiated:false LVPath: SkipLVRelease:false Location: LVBackedPV:false}
2020-01-29 09:59:21.450743 I | cephosd: 1/1 osd dirs succeeded on this node
2020-01-29 09:59:21.450755 I | cephosd: saving osd dir map
2020-01-29 09:59:21.479301 I | cephosd: device osds:[]
dir osds: [{ID:0 DataPath:/var/lib/rook/osd0 Config:/var/lib/rook/osd0/rook-ceph.config Cluster:rook-ceph KeyringPath:/var/lib/rook/osd0/keyring UUID:652071c9-2cdb-4df9-a20e-813738c4e3f6 Journal:/var/lib/rook/osd0/journal IsFileStore:true IsDirectory:true DevicePartUUID: CephVolumeInitiated:false LVPath: SkipLVRelease:false Location: LVBackedPV:false}]

Do you have any idea where the problem comes from and how I could fix it?

I discussed this issue with a Rook developer in this post:

He was able to reproduce the problem:

Yohan, I was also able to reproduce the issue where the labels are not picked up by the OSDs, even though the labels are detected in the OSD prepare pod as you saw. Could you open a GitHub issue for this? I am looking into it.

But the problem only seems to affect OSDs that use directories; it does not occur when you use devices (such as raw devices):

Yohan, I found that this only affects OSDs created on directories. I suggest you test creating the OSDs on raw devices so the CRUSH map gets populated correctly. Also note that support for directories on OSDs is being removed in the v1.3 release. Starting with that release, OSDs are expected to be created on raw devices or partitions. See this issue for more details:

Since support for OSDs on directories is being removed in the next release, I don't anticipate this being fixed.

As you can see, this issue will not be fixed, since the use of directories is about to be deprecated.

I redid my tests with raw devices instead of directories, and it worked perfectly.
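
For anyone making the same switch, here is a rough sketch of what the storage section of the CephCluster resource looks like when pointing Rook at raw devices instead of directories (the device name sdb is only an illustration, not necessarily the disks I used):

spec:
  storage:
    useAllNodes: false
    useAllDevices: false
    nodes:
    - name: test-w1
      devices:
      - name: "sdb"   # raw, unpartitioned disk attached to the node
    - name: test-w2
      devices:
      - name: "sdb"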

I would like to thank Travis for his help and his quick answers.
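
One last note that does not come from the thread itself: if you are stuck on directory-based OSDs for a while, the rack buckets can in principle be created by hand from the toolbox with standard CRUSH commands (a hypothetical stopgap, reusing the rack names from this post):

ceph osd crush add-bucket rack1 rack
ceph osd crush add-bucket rack2 rack
ceph osd crush move rack1 root=default
ceph osd crush move rack2 root=default
ceph osd crush move test-w1 rack=rack1   # moves the whole host bucket (and its OSD) under rack1
ceph osd crush move test-w2 rack=rack2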
