Android repo file corruption: underscores randomly replaced with a capital O
I have a rather peculiar corruption problem. Let me first lay out the situation.

Host hardware:
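- Supermicro SuperServer A+ 2022G-URF
- Only 1 CPU installed: AMD Opteron 6348
- RAM: 4x Samsung 8GB DDR3 PC3-12800 CL11 (M393B1K70DH0-CK0) = 32GB
- HDDs: 4x Seagate Constellation ES.3 SAS 2.0 1TB (ST1000NM0023) as system disks (health: good)
- LSI MegaRaid 9260-8i RAID controller with BBU (health: good)
- OS: Ubuntu 14.04 Server

Guest VM (KVM, libvirt domain definition):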
<domain type='kvm'>
<name>buildbot</name>
<uuid>long-uuid-that-is-of-no-interest-here</uuid>
<description>buildbot</description>
<memory unit='KiB'>14729216</memory>
<currentMemory unit='KiB'>14729216</currentMemory>
<vcpu placement='static' cpuset='2-11'>10</vcpu>
<os>
<type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
<boot dev='hd'/>
<bootmenu enable='yes'/>
</os>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<cpu mode='custom' match='exact'>
<model fallback='allow'>Opteron_G5</model>
<vendor>AMD</vendor>
<feature policy='require' name='perfctr_core'/>
<feature policy='require' name='skinit'/>
<feature policy='require' name='perfctr_nb'/>
<feature policy='require' name='mmxext'/>
<feature policy='require' name='osxsave'/>
<feature policy='require' name='vme'/>
<feature policy='require' name='topoext'/>
<feature policy='require' name='fxsr_opt'/>
<feature policy='require' name='bmi1'/>
<feature policy='require' name='ht'/>
<feature policy='require' name='cr8legacy'/>
<feature policy='require' name='ibs'/>
<feature policy='require' name='wdt'/>
<feature policy='require' name='extapic'/>
<feature policy='require' name='osvw'/>
<feature policy='require' name='nodeid_msr'/>
<feature policy='require' name='tce'/>
<feature policy='require' name='cmp_legacy'/>
<feature policy='require' name='lwp'/>
<feature policy='require' name='monitor'/>
</cpu>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>restart</on_crash>
<devices>
<emulator>/usr/bin/kvm-spice</emulator>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<target dev='hdc' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
<disk type='file' device='disk'>
<driver name='qemu' type='raw' cache='writethrough' io='native'/>
<source file='/var/lib/libvirt/images/disk.img'/>
<target dev='vda' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
<controller type='usb' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pci-root'/>
<controller type='ide' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<controller type='scsi' index='0' model='virtio-scsi'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
<interface type='direct'>
<mac address='52:54:00:XX:XX:XX'/>
<source dev='eth0' mode='bridge'/>
<model type='e1000'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='pty'>
<target port='0'/>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes'/>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</memballoon>
</devices>
</domain>
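Example of the corruption: getOrate (capital O, not zero) should be get_rate.
Git does not consider this a change or corruption (git status, git diff and git fsck --full are all clean) unless getOrate is changed to get_rate and then back to getOrate. Git also does not notice or correct it with git reset --hard, git reset --hard m/branch, or git fetch remote && git reset --hard remote/branch. This is not the only file where this happens; it is just an example. It occurs in random files at random places, sometimes only once in a file, regardless of other occurrences of get_rate. A fresh checkout on a pristine system has also fallen victim to this problem.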
However, if the git repository is cloned separately, without repo, on the same or another machine, the problem does not occur.
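Since git status and git diff rely on cached stat information in the index where possible, one way to cross-check a suspect working-tree file is to recompute its blob ID by hand and compare it with the ID recorded in the index. The following is only a minimal sketch: the helper names are made up for illustration, and it assumes the file is stored without any end-of-line conversion or content filters.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content.

    Git hashes the header b"blob <size>\\0" followed by the raw bytes.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def file_blob_sha1(path: str) -> str:
    """Hash a file on disk the way Git would hash it as a blob.

    Hypothetical helper; assumes no autocrlf or clean/smudge filters apply.
    """
    with open(path, "rb") as fh:
        return git_blob_sha1(fh.read())
```

The result of file_blob_sha1 for a file can then be compared with the object name that git ls-files -s prints for the same path. A mismatch would mean the bytes currently read from disk differ from what the index recorded.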
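Various HDD emulations/storage formats and IO/caching strategies have been tried. The RAM of both guest and host, as well as the HDDs, were checked multiple times and no errors were found. Both the server and the VM have been set up from scratch several times, and sooner or later the problem reappears.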
repo version:
VERSION=(1,21)
git version:
git version 1.9.1
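Google turned up no solution (it is hard to search for an s/_/O/ corruption problem), and after almost a week of bug hunting, checking hardware, configs, cables, module seating, etc., I am at my wits' end.
Any pointers would be greatly appreciated.

Thanks to @torek, I found a single-bit corruption issue in the changelog of LSI's .120 firmware: earlier firmware failed to detect disk cache corruption in some cases.
Since I was updating anyway, I went to the latest available version, .130. (This had to be done step by step, current -> .120 -> .130, because my controller's old firmware did not recognize .130, resulting in a "firmware corrupt" error.)
After the firmware update, I finally saw some errors being detected: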
05/25/14 12:07:46: EVT#00420-05/25/14 12:07:46: 113=Unexpected sense: PD 0c(e0xfc/s2) Path 4433221101000000, CDB: 1a 08 c8 00 ff 00, Sense: 5/24/00
05/25/14 12:07:46: Raw Sense for PD c: 70 00 05 00 00 00 00 0a 00 00 00 00 24 00 00 00 00 00
05/25/14 12:07:46: PdWriteCacheSet: Error=2 issuing MODE SENSE to pd=0c
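Finally, after shutting down the affected drives (3! new drives were affected), the corruption problem magically disappeared :)
So my guess is that the disk cache on these drives was corrupt or faulty, which caused the single-bit errors.
I swapped them out yesterday, and my 3 spares have been working fine since (so far; trying not to jinx it).
As a precaution, once the RMA replacements arrive, I will also swap out the fourth drive, since it is from the same batch as the ones that showed this problem.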
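Edit:
Hint:
More detailed technical logs of the LSI controller can be accessed via MegaCli64 -fwtermlog -dsply -aX (X = adapter id).

For what it's worth, _ is ASCII 5F and O is ASCII 4F, so this is a single-bit change (bit 0x10 cleared in one particular byte). It is odd that it would do that, though.

@torek Thank you! This was actually very valuable :) In particular the keywords "single bit" + corruption helped me find the solution; see the answer I just posted and accepted.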
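The single-bit observation from the comment above can be verified with a few lines of Python. This is just an illustrative sketch; the function name single_bit_flips is made up here, and it assumes both inputs have the same length.

```python
def single_bit_flips(good: bytes, bad: bytes) -> list:
    """List (offset, bit mask) pairs where two equal-length byte strings
    differ by exactly one bit in a byte."""
    flips = []
    for offset, (a, b) in enumerate(zip(good, bad)):
        diff = a ^ b
        # A nonzero value with a single set bit satisfies diff & (diff - 1) == 0.
        if diff and diff & (diff - 1) == 0:
            flips.append((offset, diff))
    return flips


# '_' is 0x5F and 'O' is 0x4F; XOR shows which bits differ.
print(hex(ord("_") ^ ord("O")))                   # 0x10
print(single_bit_flips(b"get_rate", b"getOrate"))  # [(3, 16)]
```

Comparing a known-good copy of a file against a suspect copy this way would show whether every difference is the same cleared bit, which points towards a hardware fault rather than a software bug.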