How and when is a data point stored in the AppManager agent's local repository? (NETIQKB39756)

  • 7739756
  • 02-Feb-2007
  • 14-Mar-2011

Environment

NetIQ AppManager 6.x
NetIQ AppManager 7.0.x

Situation

How and when is a data point stored in the AppManager agent's local repository?

Resolution

When a running job on an agent collects a data point, the data point is first placed into the MAPQUE, a cache file shared between the NetIQmc and NetIQccm processes. This operation appears in the MCTRACE.LOG as follows:

The job collects the data value:

1084306862 [908] job-logDynaData: log dyna data, stream=<0> legend=<ASP-Requests/sec-^^#> ret=1
1084306862 [908] myMakeDataLog: sending for job <11371_NETIQ1046975503_1046975503>, streamid=0 stream=<>
1084306862 [908] commcenter-cSend: send msg <DATALOG> to <10.1.1.100> via ccm

The MC opens/locks the MAPQUE and writes the record:

*-1084306862 [908] lockMap: request to lock map <NETIQ1046975503_1046975503>
*-1084306862 [908] lockMap: got the lock for <NETIQ1046975503_1046975503>
1084306862 [908] myWriteMap: callback invoked for DATALOG msg
*-1084306862 [908] getNextMapPos: rpos=2c to=38 curpos=2c size=12
*-1084306862 [908] getNextMapPos: rpos=2c to=40 curpos=38 size=8
*-1084306862 [908] getNextMapPos: rpos=2c to=70 curpos=40 size=48
*-1084306862 [908] getNextMapPos: rpos=2c to=71 curpos=70 size=1
*-1084306862 [908] putMapNextRec: write a record at 44, nextpos=113
*-1084306862 [908] writeMap: set writep=71
1084306862 [908] myWriteMap: write record <DATALOG> done with 1

The MC unlocks the MAPQUE and signals completion:

*-1084306862 [908] unlockMap: request to unlock map <NETIQ1046975503_1046975503>
*-1084306862 [908] unlockMap: release the lock for <NETIQ1046975503_1046975503>
1084306862 [908] quque-qWriteMap: write record of size 49 to map with rc 1
1084306862 [908] mcextcore-DynaCollectData: done with ret 1
1084306862 [908] DynaCollectData: done with rc 1
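
When reviewing a long MCTRACE.LOG or ccmtrace.log, it can help to split each line into its fields. The following Python sketch is not part of AppManager. It assumes, based only on the excerpts in this article, that each line consists of an optional `*-` prefix, a ten-digit number that is a Unix epoch timestamp in seconds, a thread ID in square brackets, and a `source: message` pair. The names `TraceLine` and `parse_trace_line` are hypothetical, and the meaning of the `*-` prefix is not stated in this article, so it is only kept as a flag.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Assumed layout (inferred from the excerpts, not from product documentation):
#   [*-]<epoch seconds> [<thread id>] <source>: <message>
TRACE_RE = re.compile(
    r"^(?P<star>\*-)?(?P<epoch>\d{10})\s*\[(?P<tid>\d+)\]\s*"
    r"(?P<source>[^:\s]+):\s*(?P<message>.*)$"
)


class TraceLine(NamedTuple):
    when: datetime      # timestamp, assuming Unix epoch seconds (UTC)
    thread_id: int      # number shown in square brackets
    source: str         # e.g. "myWriteMap" or "cmnet-nSendQue"
    message: str        # remainder of the line
    starred: bool       # True when the line starts with "*-"


def parse_trace_line(line: str) -> Optional[TraceLine]:
    """Split one trace line into fields; return None if it does not match."""
    m = TRACE_RE.match(line.strip())
    if m is None:
        return None  # continuation lines or unknown layouts are skipped
    return TraceLine(
        when=datetime.fromtimestamp(int(m["epoch"]), tz=timezone.utc),
        thread_id=int(m["tid"]),
        source=m["source"],
        message=m["message"],
        starred=m["star"] is not None,
    )
```

For example, `parse_trace_line("1084306862 [908] myWriteMap: callback invoked for DATALOG msg")` yields a record with thread ID 908 and source `myWriteMap`. Continuation lines that carry no `source:` part return `None`.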


The NetIQccm process then polls the MAPQUE for new data points. This can be seen in the ccmtrace.log as follows:

The CCM recognizes the new point, and places it into a buffer:

1084306867 [1004] cmque-qread: 1 records read from que <NETIQ1046975503_1046975503>
1084306867 [1004] cmque-qread: 1 records copied to buf list
1084306867 [1004] comm-srvThreadMain: read 1 records from site <NETIQ1046975503_1046975503> que
1084306867 [1004] comm-cChkUpload: upload not enabled
1084306867 [1004] comm-cChkUpload: pause_evt=0 pause_data=0
1084306867 [1004] comm-cChkUpload: upload not enabled
1084306867 [1004] comm-cChkUpload: pause_evt=0 pause_data=0
1084306867 [1004] cmnet-nSortMsg: 1 out of 1 msgs moved

The CCM starts to contact the MS to determine if the data point can be uploaded:

1084306867 [1004] cmnet-send: calling nSendBatch()
1084306867 [1004] cmnet-sendBatch: max=0, _excpq=0, _jobcq=0, _cesysq=0, _evtq=0, _dhq=0, _ceq=0, _dataq=3226917
1084306867 [1004] cmnet-nSendQue: entering
1084306867 [1004] cmnet-nSendQue: maxbytes=0, total_byte=3226917, total_msg=1 #msg=1 from que <Data>
1084306867 [1004] cmnet-nSendQue: 1/1 msgs read from que <Data>, 0 skip

A failure is encountered in the communication link between the agent and the MS:

1084306867 [1004]cmrpcv2-netSendData: 1 fail for ms=<10.1.1.100> site=<NETIQ1046975503_1046975503> err=1721
1084306867 [1004] cmnet-nSendQue: 0/1 Data msgs sent to ms 10.1.1.100, nfail=1
1084306867 [1004] cmnet-dumpMsg: 0 EVENT msgs transferred
1084306867 [1004] cmnet-dumpMsg: 0 CTRLEVT msgs transferred
1084306867 [1004] cmnet-dumpMsg: 0 CTRLEVT msgs transferred
1084306867 [1004] cmnet-dumpMsg: 0 DATAH msgs transferred
1084306867 [1004] cmnet-dumpMsg: 1 DATA msgs transferred
1084306867 [1004] cmnet-dumpMsg: 0 EXCEP msgs transferred
1084306867 [1004] cmnet-dumpMsg: 0 JOBC msgs transferred
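
To find upload failures like the one above in a larger ccmtrace.log, the relevant lines can be filtered with regular expressions. The Python sketch below is an illustration only. The patterns are inferred from the `netSendData` and `nSendQue` lines shown in this article and may not cover every variant the CCM writes. The function name `summarize_send_failures` is hypothetical, and the sketch does not interpret the `err` value.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Patterns inferred from the excerpts in this article (assumptions).
ERR_RE = re.compile(
    r"netSendData:\s*\d+ fail for ms=<(?P<ms>[^>]+)>.*?\berr=(?P<err>\d+)"
)
BATCH_RE = re.compile(
    r"nSendQue:\s*(?P<sent>\d+)/(?P<total>\d+) (?P<kind>\w+) msgs sent to ms "
    r"(?P<ms>[^,\s]+), nfail=(?P<nfail>\d+)"
)


def summarize_send_failures(lines: Iterable[str]) -> Dict[str, List[Tuple]]:
    """Collect error codes and failed batches from ccmtrace.log lines."""
    errors: List[Tuple[str, int]] = []               # (ms address, err code)
    failed: List[Tuple[str, str, int, int, int]] = []  # (ms, kind, sent, total, nfail)
    for line in lines:
        m = ERR_RE.search(line)
        if m:
            errors.append((m["ms"], int(m["err"])))
            continue
        m = BATCH_RE.search(line)
        if m and int(m["nfail"]) > 0:  # keep only batches with failures
            failed.append(
                (m["ms"], m["kind"], int(m["sent"]), int(m["total"]), int(m["nfail"]))
            )
    return {"errors": errors, "failed_batches": failed}
```

Passing the lines of a ccmtrace.log to `summarize_send_failures` returns the MS addresses with their error codes, plus every batch whose `nfail` count is greater than zero.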

The CCM compares the buffer contents to the contents of the local repository. If none of the records in the buffer came from the local repository, the CCM inserts them:

1084306867 [1004] comm-cDelReposit: entering...
1084306867 [1004] comm-cDelReposit: none in msglist is from local repository.
1084306867 [1004] comm-cDelReposit: leaving...
1084306867 [1004] comm-cReposit: entering...
1084306867 [1004] cmdb-connect: connecting to database <Local-Repository>
1084306867 [1004] cmdb-connect: database <Local-Repository> connected
1084306867 [1004] comm-cReposit: start inserting 1 msgs into local repository
1084306867 [1004]rpdata-add: insert for jobid 11371 site <NETIQ1046975503_1046975503>, rc=1
1084306867 [1004] cmbuf-getNext: reach the end of list
1084306867 [1004] rpevent-locate: 0 records located for site <NETIQ1046975503_1046975503>
1084306867 [1004] rpevent-locate: 0 records located for site <NETIQ1046975503_1046975503>
1084306867 [1004] rpce-locate: 0 records located for site <NETIQ1046975503_1046975503>
1084306867 [1004] rpdatah-locate: 0 records located for site <NETIQ1046975503_1046975503>
1084306867 [1004] rpdata-locate: 1 records located for site <NETIQ1046975503_1046975503>
1084306867 [1004] rpexcep-locate: 0 records located for site <NETIQ1046975503_1046975503>
1084306867 [1004] rpjobc-locate: 0 records located for site <NETIQ1046975503_1046975503>
1084306867 [1004]comm-cReposit: 1/1 msgs have been inserted to local repository, dur=16
1084306867 [1004] cmbuf-empty: 1 buffers freed

Following successful insertion to the local repository, the buffer is cleared:

1084306867 [1004] comm-cReposit: 0 msgs in msglist after insertion.
1084306867 [1004] comm-cReposit: leaving...
1084306867 [1004] cmbuf-reset: 0 buffers freed

The cause of the upload failure can be seen in a subsequent message in the ccmtrace.log:

1084306870 [852] dispatch-ctrlThreadMain: checking 1 ms comm
1084306870 [852]cmrpcv2-netPing: ms <10.1.1.100> persisent ioc mapfile is full
1084306870 [852] dispatch-ctrlThreadMain: check done


Once the problem subsides, the CCM retrieves the record(s) from the local repository and uploads them, as in the following example:

1084303403 [1004] cmdb-connect: connecting to database <Local-Repository>
1084303403 [1004] cmdb-connect: database <Local-Repository> connected
1084303403 [1004]myBatchLoadTbl: loading records from DataTbl, site=<NETIQ1046975503_1046975503> id=1
1084303403 [1004] rpdata-batchload: 1 records loaded for site <NETIQ1046975503_1046975503>
1084303403 [1004] myBatchLoadTbl: 1 records loaded
1084303403 [1004] mySaveRpKeyInfo: saving 1 rp keys from msgs
1084303403 [1004] cmbuf-getNext: reach the end of list
1084303403 [1004] mySaveRpKeyInfo: 1/1 rpkeys have been saved,
1084303403 [1004]                  0/1 are not from local repository.
1084303403 [1004]                  1 rpkeys are from CMData.
1084303403 [1004]                 index range [0, 0]
1084303403 [1004]comm-cUpload: total 1 msgs have been read from repository for site <NETIQ1046975503_1046975503>, needtosend=1
1084303403 [1004] comm-cUpload: in repository event=0,datah=0,data=1,jobc=0,excep=0,ctrlsysevent=0
1084303403 [1004] cmnet-nSortMsg: 1 out of 1 msgs moved
1084303403 [1004] cmnet-send: calling nSendBatch()
1084303403 [1004] cmnet-sendBatch: max=0, _excpq=0, _jobcq=0, _cesysq=0, _evtq=0, _dhq=0, _ceq=0, _dataq=3202959
1084303403 [1004] cmnet-nSendQue: entering
1084303403 [1004] cmnet-nSendQue: maxbytes=0, total_byte=3202959, total_msg=1 #msg=1 from que <Data>
1084303403 [1004] cmnet-nSendQue: 1/1 msgs read from que <Data>, 0 skip
1084303404 [1004]cmrpcv2-netSendData: 1 sent for ms=<10.1.1.100> site=<NETIQ1046975503_1046975503>
1084303404 [1004] cmnet-nSendQue: 1/1 Data msgs sent to ms 10.1.1.100, nfail=0
1084303404 [1004] cmbuf-freeN: 1 of requested 1 buffers freed

The CCM then deletes the point from the local repository:

1084303404 [1004] comm-cDelReposit: entering...
1084303404 [1004] cmdb-connect: connecting to database <Local-Repository>
1084303404 [1004] cmdb-connect: database <Local-Repository> connected
1084303404 [1004] comm-cDelReposit: removing records from local repository.
1084303404 [1004] comm-cDelReposit: deleting records from CMData
1084303404 [1004] comm-cDelReposit:             index range [0, 0]
1084303404 [1004] rpdata-del: del for site <NETIQ1046975503_1046975503>,rc=1
1084303404 [1004] rpdata-del:              index range [0, 0]
1084303404 [1004] rpdata-locate: 0 records located for site <NETIQ1046975503_1046975503>
1084303404 [1004] comm-cDelReposit: 0 records are currently in CMData.
1084303404 [1004]comm-cDelReposit: 1 records have been removed from local repository.
1084303404 [1004] cmbuf-getNext: end of list
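
The sequence traced above (buffer the data point, attempt the upload, store it in the local repository on failure, then upload and delete it later) follows a general store-and-forward pattern. The Python sketch below illustrates that pattern only and is not the CCM's actual implementation. `LocalRepository`, `process_batch`, `drain_repository`, and the `send` callback are hypothetical names, the repository is a plain in-memory list, and the check for records that already came from the local repository is omitted.

```python
from typing import Callable, List

Send = Callable[[List[str]], bool]  # returns True when the MS accepted the batch


class LocalRepository:
    """In-memory stand-in for the agent's local repository (hypothetical)."""

    def __init__(self) -> None:
        self._rows: List[str] = []

    def insert(self, msgs: List[str]) -> None:
        self._rows.extend(msgs)

    def load(self) -> List[str]:
        return list(self._rows)  # copy, so callers cannot mutate storage

    def delete(self, msgs: List[str]) -> None:
        for msg in msgs:
            self._rows.remove(msg)

    def __len__(self) -> int:
        return len(self._rows)


def process_batch(buffer: List[str], send: Send, repo: LocalRepository) -> bool:
    """Try to upload buffered messages; store them locally if the send fails."""
    if not buffer:
        return True
    sent = send(buffer)
    if not sent:
        repo.insert(buffer)   # keep the data until the MS is reachable again
    buffer.clear()            # the buffer is emptied in both cases
    return sent


def drain_repository(send: Send, repo: LocalRepository) -> int:
    """Upload stored messages; delete them only after a successful send."""
    pending = repo.load()
    if not pending or not send(pending):
        return 0
    repo.delete(pending)
    return len(pending)
```

In this sketch, a record is deleted from the repository only after the send succeeds, which mirrors the order of the `cUpload` and `cDelReposit` entries in the trace.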

Cause

Typically, data is stored in the local repository when the agent cannot communicate with the Management Server (MS) in a timely fashion. The MS service may be stopped or backed up with data, or communication between the agent and the MS may be severed.

Additional Information

Formerly known as NETIQKB39756