Kernel OOPS in NSS cacheAllocBufferForUserData.

  • Document ID: 7015357
  • Creation Date: 14-Jul-2014
  • Modified Date: 20-Oct-2014

Environment

Novell Open Enterprise Server 11 (OES 11) Linux Support Pack 1
Novell Open Enterprise Server 11 (OES 11) Linux Support Pack 2

Situation

Shortly after a nightly reboot, a cluster node crashed with a kernel OOPS in cacheAllocBufferForUserData.

The backtrace from the kernel core dump shows a page fault in cacheAlloc, called from cacheAllocBufferForUserData on an NSS file compression code path:

crash> bt
PID: 14230  TASK: ffff883f2009a600  CPU: 5   COMMAND: "nss 62"
 #0 [ffff883f2009d430] machine_kexec at ffffffff8102bf7e
 #1 [ffff883f2009d480] crash_kexec at ffffffff810abe8a
 #2 [ffff883f2009d550] oops_end at ffffffff81462638
 #3 [ffff883f2009d570] __bad_area_nosemaphore at ffffffff81038745
 #4 [ffff883f2009d630] do_page_fault at ffffffff81464b8e
 #5 [ffff883f2009d730] page_fault at ffffffff81461665
    [exception RIP: cacheAlloc+1207]
    RIP: ffffffffa07668c0  RSP: ffff883f2009d7e0  RFLAGS: 00010206
    RAX: ffffc9007bf6ded8  RBX: ffffc9007bf6ded8  RCX: ffffffffa07a8758
    RDX: ffffc9007bf6de58  RSI: ffffffffa07a87c8  RDI: ffffffffa0674ee8
    RBP: ffffc9007bf6ded8   R8: 0000000000000000   R9: 0000000000000000
    R10: ffff883f7f19f410  R11: ffff883f6abcc960  R12: 0000000000000000
    R13: ffff881880efc4b0  R14: ffffc9007bf6dfa8  R15: ffffc9007bf6dfa8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffff883f2009d988] cacheAllocBufferForUserData at ffffffffa0766c79 [nss]
 #7 [ffff883f2009d998] ZFSVOL_VOL_getFileBlk at ffffffffa09fcfdd [nsszlss]
 #8 [ffff883f2009da38] ROOT_BST_GetFileBlk at ffffffffa08662fc [nsscomn]
 #9 [ffff883f2009da48] COMN_GetFileBlkOrHole at ffffffffa080446b [nsscomn]
#10 [ffff883f2009daa8] CM_fetchFileBlk at ffffffffa089584b [nsscomn]
#11 [ffff883f2009dae8] ALGOMGR_fetchStreamBuf at ffffffffa088dfb1 [nsscomn]
#12 [ffff883f2009db78] NSSCCDGetWriteCacheBlock at ffffffffa089b19b [nsscomn]
#13 [ffff883f2009dbb8] CCDGetNewTempBlock at ffffffffa0884f9c [nsscomn]
#14 [ffff883f2009dbc8] CCDAnalyzeFile at ffffffffa088775a [nsscomn]
#15 [ffff883f2009dc68] CCDCompressFile at ffffffffa0888189 [nsscomn]
#16 [ffff883f2009dd08] NWALGO_compressStream at ffffffffa089ac93 [nsscomn]
#17 [ffff883f2009dea8] ALGOMGR_invokeCompAlgo at ffffffffa088e1ff [nsscomn]
#18 [ffff883f2009dee8] CM_activityWorkToDoRun at ffffffffa088d822 [nsscomn]
#19 [ffff883f2009def8] do_work at ffffffffa076a1aa [nss]
#20 [ffff883f2009df28] startThread at ffffffffa063e25f [nsslibrary]
#21 [ffff883f2009df48] kernel_thread_helper at ffffffff81469fe4
crash>
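
When comparing a core dump from another server against this trace, it can help to pull out the faulting symbol and the first frame that belongs to a loaded module. The following is a minimal sketch only, not a Novell tool: the names parse_bt, FRAME_RE, EXC_RE and SAMPLE are hypothetical, and the patterns assume the "bt" line layout shown above.

```python
import re

# Frame lines, as printed by "bt" above, e.g.:
#   #6 [ffff883f2009d988] cacheAllocBufferForUserData at ffffffffa0766c79 [nss]
# The trailing "[module]" is absent for frames in the core kernel.
FRAME_RE = re.compile(
    r"^\s*#(?P<num>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<func>\S+)\s+at\s+"
    r"(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?\s*$"
)

# Faulting-instruction line, e.g.: [exception RIP: cacheAlloc+1207]
EXC_RE = re.compile(r"\[exception RIP:\s*(?P<sym>[^\]+]+)(?:\+(?P<off>\d+))?\]")


def parse_bt(text):
    """Return (exception, frames) parsed from "bt" output.

    exception is (symbol, offset_string) or None.
    frames is a list of (frame_number, function, module_or_None).
    """
    exception = None
    frames = []
    for line in text.splitlines():
        m = EXC_RE.search(line)
        if m:
            exception = (m.group("sym"), m.group("off"))
            continue
        m = FRAME_RE.match(line)
        if m:
            frames.append((int(m.group("num")), m.group("func"), m.group("module")))
    return exception, frames


# A few lines copied from the trace above, used as sample input.
SAMPLE = """\
 #5 [ffff883f2009d730] page_fault at ffffffff81461665
    [exception RIP: cacheAlloc+1207]
    RIP: ffffffffa07668c0  RSP: ffff883f2009d7e0  RFLAGS: 00010206
 #6 [ffff883f2009d988] cacheAllocBufferForUserData at ffffffffa0766c79 [nss]
 #7 [ffff883f2009d998] ZFSVOL_VOL_getFileBlk at ffffffffa09fcfdd [nsszlss]
#15 [ffff883f2009dc68] CCDCompressFile at ffffffffa0888189 [nsscomn]
#21 [ffff883f2009df48] kernel_thread_helper at ffffffff81469fe4
"""

if __name__ == "__main__":
    exception, frames = parse_bt(SAMPLE)
    in_modules = [f for f in frames if f[2] is not None]
    print("exception RIP:", exception)
    print("first module frame:", in_modules[0] if in_modules else None)
```

If the faulting symbol is cacheAlloc and the first module frame is cacheAllocBufferForUserData in nss, the trace is a candidate match for this document. A matching symbol alone does not prove the same root cause.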


Resolution

The fix for this issue was released with the September 2014 Scheduled Maintenance patches.

Cause

The kernel panic was caused by buffer corruption.