Linux Access Gateway crashing in removeFromConnectionList ()

  • 7005807
  • 21-Apr-2010
  • 26-Apr-2012

Environment

Novell Access Manager 3.1 Linux Access Gateway
Novell Access Manager 3.1 Support Pack 1 Interim Release 2 applied

Situation

A Linux Access Gateway (LAG) was set up to accelerate multiple protected resources. After upgrading from Access Manager 3.1 Support Pack 1 Interim Release 1 (SP1 IR1) to SP1 IR2, the LAG restarted every few days. Because only the default logging settings were applied in the /etc/laglogs.conf file, the ics_dyn.log file contained no useful information about what the LAG was processing at the time of each restart.

After core dumps were forced using /tmp/.dumpcore, multiple core dumps were produced showing the following backtrace:

#0  nkEnterDebugger () at nksutil.c:693
#1  0x456938bf in cbType::removeFromConnectionList (
    this=<value optimized out>, listHead=<value optimized out>)
    at /home/ajkumar/311ir2/legacy/s_proxy/connect.c:5676
#2  0x45692c68 in TCPPacketReceived (cb=0x94a1f020) at nconnect.h:1960
#3  0x4568f28e in cbType::dataArrived (this=0x94a1f020, seg=0x9f19b340)
    at nconnect.h:1113
#4  0x4170a873 in LegacySlanConnection::dataReceived (this=0xabe742e8,
    bufSeg=0x9f19b340, amountToRead=29664)
    at /home/ajkumar/311ir2/vcp/s_cnmgr/LegacySlanConnection.c:329
#5  0x417144c8 in ConnectionSocket::recvData (this=0xaf56dac4,
    amountToRead=32776)
    at /home/ajkumar/311ir2/vcp/s_cnmgr/ConnectionSocket.cpp:710
#6  0x41719266 in LegacyConnectionSocket::processConnectionEvent (
    this=0xaf56dac4, event=0xaf56db38)
    at /home/ajkumar/311ir2/vcp/s_cnmgr/LegacyConnectionSocket.cpp:97
#7  0x41712315 in EpollEventQueue::connectionCallback (param=0xffffffe0)
    at /home/ajkumar/311ir2/vcp/s_cnmgr/PollEventListener.cpp:536
#8  0x40026a72 in _ExecuteWork (thread=0x4015e240, work=0xaf56dae4)
    at sysapi.c:573
#9  0x40026ba7 in _WorkThreadMain (param=0x40030684) at sysapi.c:737
#10 0x40023a19 in threadMain (args=0x4015e240) at nksthread.c:156
#11 0x402bdcb7 in start_thread () from /lib/tls/libpthread.so.0
#12 0x4025821e in clone () from /lib/tls/libc.so.6
#13 0x41619bb0 in ?? ()

Resolution

Fixed with Access Manager 3.1 SP1 IR3a.

Additional Information

To view the stack dump ('bt') from a coredump on the LAG, do the following:

# cd /chroot/lag
# chroot /chroot/lag
# gdb opt/novell/bin/ics_dyn <core.pid>
(gdb) bt

where <core.pid> is the name of the core file in that directory.
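
When several core files have accumulated, the same backtrace can be collected without an interactive gdb session. The following is a minimal sketch, not a supported Novell tool. It assumes that it runs inside the chroot from the root directory, that the core files are named core.<pid>, and that the installed gdb supports the -batch and -ex options. The function name dump_backtraces and the GDB and BINARY variables are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: print the backtrace of every core file in a directory.
# Assumptions (not from the original article): core files match core.*,
# and the installed gdb accepts -batch and -ex.

GDB="${GDB:-gdb}"                          # debugger to run (overridable)
BINARY="${BINARY:-opt/novell/bin/ics_dyn}" # path relative to the chroot root

dump_backtraces() {
    # $1: directory holding the core files (defaults to the current directory)
    dir="${1:-.}"
    for core in "$dir"/core.*; do
        # Skip the unexpanded pattern when no core file matches.
        [ -f "$core" ] || continue
        echo "=== $core ==="
        # -batch exits after the commands run; -ex bt prints the backtrace.
        "$GDB" -batch -ex bt "$BINARY" "$core" 2>&1
    done
}
```

After loading the function in the chroot shell, a call such as dump_backtraces . > backtraces.txt writes one labelled backtrace per core file, which makes it easier to confirm that every dump ends in the same function.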