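Dell's server hardware monitoring software, OpenManage, can monitor the battery, motherboard, temperatures, hard disks, and more. For installation and usage, see "Installing OpenManage (OMSA) on Dell Servers" (《Dell服务器安装OpenManage(OMSA)》).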
While monitoring, you may run into the following error message:
Info:Memory module 6 [DIMM7, 2048 MB] needs attention: Single-bit warning error rate exceeded, Single-bit failure error rate exceeded
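This means the memory check has detected a problem, possibly a loosely seated module, although the system still recognizes that DIMM. The fix is to power off the server and reseat the module, but that requires downtime and affects the services running on it. Until then the alert keeps firing, and the repeated notifications get annoying. You can exclude memory from monitoring as follows: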
check_openmanage --check storage -b dimm=all
The output shows that memory and Voltage are not checked.
Without dimm=all, memory is checked again.
Checks for other hardware, such as temperature, can be blacklisted in the same way. For example:
/usr/local/nagios/libexec/check_openmanage --check storage -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all/bat_charge=all/encl=all/ps=all/fan=all/temp=all/volt=all
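Long blacklist strings like the one above are easy to mistype, so one option is to assemble them from a list of keywords. The following is only a minimal sketch: the helper name build_blacklist is hypothetical (it is not part of check_openmanage), the plugin path is the one used in the example above, and the keyword=all pairs joined by "/" simply mirror the syntax shown in this article. The script prints the resulting command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_openmanage): join component
# keywords into a blacklist string of the form "kw1=all/kw2=all/...".
build_blacklist() {
    result=""
    for keyword in "$@"; do
        if [ -z "$result" ]; then
            result="${keyword}=all"
        else
            result="${result}/${keyword}=all"
        fi
    done
    printf '%s\n' "$result"
}

# Assumed plugin path, taken from the example above; adjust to your install.
PLUGIN=/usr/local/nagios/libexec/check_openmanage

# Print the command that would be run, so it can be reviewed first.
echo "$PLUGIN --check storage -b $(build_blacklist dimm temp volt)"
```

Keeping the keywords in one place makes it easier to see which checks are disabled and to drop dimm from the list once the module has been reseated.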
When reposting, please credit 运维生存时间: https://www.ttlsa.com/html/3880.html
Article source: 运维生存时间, https://www.ttlsa.com/linux/infomemory-module-dimm-needs-attention-single-bit-warning-error-rate-exceeded-single-bit-failure-err/