I cannot say how true it is, but customers often complain to me that Dell servers are far more sensitive to memory Single-bit Errors (SBE) than HP and IBM servers. Dell servers running as file or database servers with heavy I/O are the most likely to show this error/warning on the system status LCD panel, as below:
"E2111 SBE log disabled on DIMM ##. Reseat DIMM."
[Image: LCD Panel]
"..the memory error logging is as per design, and it wouldn't clear off automatically even if the SBE has been handled by the ECC Memory, unless we clear the hardware error log (can clear through OpenManage Server Administrator (OMSA) or Dell System E-Support Tool (DSET)) or till the system reset. This is because many customer/user would like to use those error for future reference..."
However, most customers/users do not appreciate the purpose of this design and always claim that the quality of Dell servers cannot compete with other brands, just because the motherboard sensor is able to log and display an error that happened earlier. Sounds funny, doesn't it?
Troubleshooting steps:
1. For Microsoft Windows-based and Red Hat Linux OS, you may clear the hardware log from OMSA as below:
i) Log in to OMSA (with administrative rights)
ii) System -> Log (tab) -> Clear ESM Log
2. You may also use the Dell System E-Support Tool (DSET) to clear the ESM log. Simply execute the file and select "Create & Clear ESM Log".
Please refer to the previous post for more info:
http://alanitsolutions.blogspot.com/2011/02/introduction-to-dell-system-e-support.html
3. If the server runs something other than those operating systems, such as VMware ESX, you may choose to clear the log with a bootable Dell utility named Dell OpenManage LiveCD.
Please refer to the previous post for more info:
http://alaninpenang.blogspot.com/2010/04/dell-server-how-to-capture-hardware-log.html
After the hardware log has been cleared using one of the steps above, all you need to do next is monitor the server for 1-2 hours. If the SBE is genuine, it will definitely return within this period. If it does not, the error was already handled by the ECC system memory.
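If you would rather not watch the log by hand, the check can be sketched in a few lines of Python. This is only a rough illustration: it assumes you have already exported the hardware log entries yourself as (timestamp, description) pairs, and the function name `sbe_returned`, the keyword list, and the 2-hour default window are my own choices, not anything provided by OMSA or DSET.

```python
from datetime import datetime, timedelta

# Assumption: keywords that mark an SBE-related entry. Adjust them to match
# the wording that actually appears in your own log export.
SBE_MARKERS = ("SBE", "single-bit", "E2111")


def sbe_returned(entries, cleared_at, window=timedelta(hours=2)):
    """Return the SBE-like entries logged within `window` after `cleared_at`.

    `entries` is an iterable of (datetime, description) pairs; this record
    shape is hypothetical and is not an OMSA or DSET export format.
    """
    deadline = cleared_at + window
    hits = []
    for when, text in entries:
        lowered = text.lower()
        is_sbe = any(marker.lower() in lowered for marker in SBE_MARKERS)
        # Only count entries logged after the clear and inside the window.
        if is_sbe and cleared_at < when <= deadline:
            hits.append((when, text))
    return hits
```

An empty result after the monitoring period suggests the earlier error was a one-off handled by ECC, while any hit means the SBE has come back and the DIMM deserves a closer look.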