NMI watchdog: BUG: soft lockup errors after upgrading to OES2018 SP2

  • 7024996
  • 29-Jan-2021
  • 01-Feb-2021

Environment

Open Enterprise Server 2018 SP2 (OES 2018 SP2) Linux

Situation

After an OES 2018 SP1 server is upgraded to OES 2018 SP2, the server may start logging errors of the following form:
NMI watchdog: BUG: soft lockup - CPU#xx stuck for yys! [process-name:pid]
For example:
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [novell-named:10700]

NOTE: The CPU number and process name in the message will vary. When the condition described here is the cause, neither of them is actually related to the problem.
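
To see whether a server is logging these messages, the kernel log can be searched for the message text shown above. The following is a minimal sketch, not an official diagnostic: the function name filter_soft_lockups is hypothetical, and the pattern is derived only from the example message in this document.

```shell
# Print only the soft lockup lines from kernel log text supplied on stdin.
# The pattern is based on the example message above; the CPU number,
# duration and process name are deliberately left variable.
filter_soft_lockups() {
    grep -E 'NMI watchdog: BUG: soft lockup - CPU#[0-9]+ stuck for [0-9]+s!'
}

# Typical use on a live server (requires permission to read the kernel log):
#   journalctl -k --no-pager | filter_soft_lockups
#   dmesg | filter_soft_lockups
```

If different CPU numbers and process names show up across the matching lines, that is consistent with the note above that neither is related to the underlying problem.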

If this condition is left unaddressed, the server may eventually hang and require a hard reboot.



Resolution

An issue with the Trend Micro agent has been identified as causing this on OES 2018 SP2. To test whether this is the cause, and as a temporary workaround, run the following two commands:
systemctl stop ds_agent.service
systemctl disable ds_agent.service

A reboot may be required before the problem goes away, since the service named in the error message may not recover properly from the error.
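
After running the two commands (and rebooting, if needed), it can be useful to confirm that the agent is really stopped and will not start again at boot. The sketch below assumes the unit name ds_agent.service from the commands above; the helper name check_ds_agent is hypothetical and is not part of any Trend Micro or OES tooling.

```shell
# Report the current and boot-time state of the Trend Micro agent unit.
# systemctl exits non-zero for inactive or disabled units, so the textual
# state is captured and the exit status of those calls is ignored.
check_ds_agent() {
    active=$(systemctl is-active ds_agent.service 2>/dev/null || true)
    enabled=$(systemctl is-enabled ds_agent.service 2>/dev/null || true)
    echo "ds_agent.service: active=${active:-unknown} enabled=${enabled:-unknown}"
    # Succeed only if the agent is neither running nor enabled at boot.
    [ "$active" != "active" ] && [ "$enabled" != "enabled" ]
}

# Usage:
#   check_ds_agent && echo "workaround is in place"
```

If the function reports that the agent is still active or enabled, the workaround has not taken effect and the soft lockup messages may continue.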


Cause

Trend Micro is currently working on this issue. Please contact them if the resolution above eliminated the error messages.