Unexpectedly large amount of tablespace consumed by TRUTH.COMPLIANCE_TEST_RSLT_RULESETS.

  • KM03259937
  • 28-Sep-2018
  • 28-Apr-2021

Summary

A DBA reports that an unexpectedly large amount of tablespace is consumed by the TRUTH.COMPLIANCE_TEST_RSLT_RULESETS table that exists in the Server Automation database.

Question

The data stored in the TRUTH.COMPLIANCE_TEST_RSLT_RULESETS table is an XML representation of the set of rules that make up an audit at the time the audit was executed.  In this topic we will discuss the lifecycle of this data and what to do in the event that this table (of the Server Automation (SA) database) appears to be growing larger than expected.
 

Answer

SA has an audit result garbage collector in the primary infra's spin.  It executes every 60 minutes by default (determined by sysconfig parameter <spin.cronbot.delete_audits.rate>) and it removes all audits older than 180 days (determined by sysconfig parameter <spin.cronbot.delete_audits.cleanup_days>).  Also, the audit result GC will always leave it least one audit result for each audit.  So if you execute an audit 10 times today, then 180 days from now, the 9 oldest audit results will be removed leaving only the most recent audit result.
 
The following query will report the number of audit results that are in need of garbage collection:
 
----
[root@sa1023 ~]# /opt/opsware/support/bin/sql "select count(*) as \"#Audit Results Needing GC\" from compliance_test_results s, (select s2.compliance_test_id, max(compliance_test_result_dt) dt from compliance_test_results s2,compliance_tests st where s2.compliance_test_id = st.compliance_test_id group by s2.compliance_test_id) mx where  s.compliance_test_id = mx.compliance_test_id AND s.compliance_test_result_dt != dt AND (s.compliance_test_id is not null) AND s.compliance_test_result_dt < (sysdate-(select to_number(config_value) from config_params cp, config_keys ck, config_values cv where cp.config_param_id=ck.config_param_id and ck.config_key_id=cv.config_key_id and cp.key_name='spin.cronbot.delete_audits.cleanup_days'))"
----
 
The following query will identify each individual audit result that is in need of garbage collection:
 
----
[root@sa1023 ~]# /opt/opsware/support/bin/sql "select s.compliance_test_result_id, compliance_test_result_name, compliance_test_result_dt from compliance_test_results s, (select s2.compliance_test_id, max(compliance_test_result_dt) dt from compliance_test_results s2,compliance_tests st where s2.compliance_test_id = st.compliance_test_id group by s2.compliance_test_id) mx where  s.compliance_test_id = mx.compliance_test_id AND s.compliance_test_result_dt != dt AND (s.compliance_test_id is not null) AND s.compliance_test_result_dt < (sysdate-(select to_number(config_value) from config_params cp, config_keys ck, config_values cv where cp.config_param_id=ck.config_param_id and ck.config_key_id=cv.config_key_id and cp.key_name='spin.cronbot.delete_audits.cleanup_days')) order by compliance_test_result_dt"
----
 
If there are a significant number of audit results that need to be garbage collected, then there might be a problem with the audit result GC.  The first thing to look for is the string "Delete Old Audit: unable to delete audit result  id:" inside of the "/var/log/opsware/spin/spin.err*" files on the primary core's infra server.
 
The second thing to look for is to make sure that the "Delete old audits" cronjob is running inside of the spin on the primary core's infra.  You can list all the cronjobs of a local spin with the following command:
 
  # curl -s -k -E /var/opt/opsware/crypto/httpsProxy/spin.srv 'https://127.0.0.1:1004/sys/cronbot.py'
 
Below is an example record for the "Delete old audits" cronjob:
 
----
        ID: 8
      Name: Delete old audits
      Func: <function deleteOldAudits at 0x7f5a2eee5f50>
      Args: (None, 'Delete old audits')
     State: run
    Active: 0
 MailAlert: 0
MaxRunTime: 3590
   NumRuns: 0
  Schedule: Every 3600 seconds
  Next Run: 1516968650 (Fri Jan 26 12:10:50 2018)
Start time: 1516965050.15 (Fri Jan 26 11:10:50 2018)
  End time: 1516965050.26 (Fri Jan 26 11:10:50 2018)
  Duration: 0.103950977325
----
 
If there is an error occuring with this "Delete old audits" cron job, it should be displayed here.
 
The following query will report roughly the amount of data taken up by each audit by name:
 
----
[root@sa1023 ~]# /opt/opsware/support/bin/sql "select count(distinct compliance_test_id) \"#AUDIT_RESULTS\", count(*) num_rules, sum(length(text_value)) bytes_used, compliance_test_result_name from compliance_test_results ctr, compliance_test_rslt_rulesets ctrr where ctr.compliance_test_result_id=ctrr.compliance_test_result_id group by compliance_test_result_name order by 3 desc"
 
Query #1 on Facility_id 1 (SA1022):
#AUDIT_RESULTS | NUM_RULES | BYTES_USED | COMPLIANCE_TEST_RESULT_NAME
---------------------------------------------------------------------
1              | 181       | 717665     | ssh compliance check
1              | 110       | 437742     | oracle server
1              | 86        | 341476     | pat test
1              | 83        | 331096     | test
----
 
(Note that system configuration parameters should only be modified under the guidance of Micro Focus support personnel.)