Summary
Question
When you lose power in your datacenter or when file system issues occur, the Vertica catalog may reference a data file that is either missing or corrupted. In the situations, the nodes computed recovery epoch may be behind ancient history mark (AHM). When two of the buddy nodes (nodes that share a projection segment) have the above problem, the database expected recovery epoch is less than the AHM epoch. When the expected recovery epoch of a database is less than the AHM epoch, the suggested ASR recovery epoch will be 0 and recovery timestamp will be in the year 1999.
Is advisable to contact Vertica Technical Support before proceeding with any of above steps.
Answer
Step 1: Restore from backup
Ask the customer if they have a recent database backup. Restoring from a backup is the safest option.
If the customer does not have a backup to restore from, when the database comes back up and all the steps below are completed, take a couple of minutes to suggest a backup method, such as using AWS s3.
Step 2: Start database in unsafe mode
Use the following command to start the database in unsafe mode to identify impacted projections.
$ admintools -t start_db -d <db1> -U
This mode should only be used under supervision of a senior technical support engineer to help recover a database that is facing the above issue.
Step 3: Identify AHM and nodes required for data safety
Run the following API’s to identify AHM and the nodes required to meet data safety:
SELECT get_ahm_epoch();
SELECT get_expected_recovery_epoch();
The following is the output of get_ahm_epoch and get_expected_recovery_epoch from my 5 node cluster. The AHM epoch is 150 and the expected recovery epoch is 149. You can see node0001, node0002, and node0003 have a node LGE of 170 and we have to fill in node004 to satisfy node dependency. Node0004 has a node LGE of 149 and node0005 has a node LGE of 100. We need to identify projections that are pulling the node LGE of node0004 and node0005 behind AHM epoch.
dbadmin=> select get_ahm_epoch();
get_ahm_epoch
---------------
150
(1 row)
dbadmin=> select get_expected_recovery_epoch();
INFO 4544: Recovery Epoch Computation:
Node Dependencies:
00011 - cnt: 10
00110 - cnt: 10
01100 - cnt: 10
11000 - cnt: 10
10001 - cnt: 10
11111 - cnt: 10
00001 - name: v_vmart_node0001
00010 - name: v_vmart_node0002
00100 - name: v_vmart_node0003
01000 - name: v_vmart_node0004
10000 - name: v_vmart_node0005
Nodes certainly in the cluster:
Node 0(v_vmart_node0001), epoch 170
Node 1(v_vmart_node0002), epoch 170
Node 2(v_vmart_node0003), epoch 170
Filling more nodes to satisfy node dependencies:
Node 3(v_vmart_node0004), epoch 149
Data dependencies fulfilled, remaining nodes LGEs don't matter:
Node 4(v_vmart_node0005), epoch 100
--
get_expected_recovery_epoch
-----------------------------
149
(1 row)
The nodes identified in the outlined section above are the nodes that are filled in to satisfy node data dependency. We need to be concerned with only these nodes, the other node(s) can be ignored. In our example above we had to only fill in v_vmart_node0004.
Step 4: Identify impacted projections
Run the following query to identify projections that have checkpoint epochs (CPE) lower than the AHM epoch. The following query requires a list of nodes that are identified in step 3. In our example above, this is v_vmart_node0004.
Node LGE is computed as a minimum value of projection CPE on a node.
SELECT e.node_name, t.table_schema, t.table_name, e.projection_schema, e.projection_name, checkpoint_epoch FROM projection_checkpoint_epochs e, projections p, tables t WHERE e.projection_id = p.projection_id and p.anchor_table_id = t.table_id and not (is_temp_table) and is_behind_ahm and e. is_up_to_date and e.node_name in (<list of nodes identified in step 3>);
Step 5: Run abortrecovery
You need to run the following command for each table identified in step 4. When you run abortrecovery, you are updating CPE projection to value of “-1”, which means no recovery will take place when you restart database.
SELECT do_tm_task('abortrecovery','<schema.tbl_name>');
Step 6: Restart database
Stop and restart database from admintools. At this point database will be up, but tables that were marked with abortrecovery will have data inconsistencies between projections. Please follow the next few steps to clean data inconsistencies once the database is started and before opening it to general usage.
Step 7: Clean up
When the database is started, the tables that were marked with abortrecovery will need to be fixed. The DBA will need to determine if the table from the list can be rebuilt (data fully truncated) or if some of the data needs to be salvaged due to critical business needs.
Tables that can be truncated should be truncated using the TRUNCATE TABLE command.
For critical tables that can’t be truncated, please follow step 8.
Step 8: Critical table data salvage
For each business critical table that was marked with abortrecovery, use one of the following options to fix data integrity issues:
Partitioned table:
1. Identify partitions that have a count mismatch between two buddy projections by running the following query.
SELECT partition_key, diff FROM (SELECT a.partition_key,sum(a.ros_row_count - a.deleted_row_count) - sum(b.ros_row_count - b.deleted_row_count) as diff FROM partitions a JOIN partitions b ON a.partition_key = b.partition_key WHERE a.projection_name in (<projection name and buddy projection name>) group by 1 ) as sub WHERE diff <> 0;
2. For partitions with a count mismatch between buddy projections, you can drop the partition and reload, or move the impacted partition to a new table using move_partitions_to_table. Then insert select data from the new table to the original table. Impacted partitions will have some data loss here, but end result is no count mismatch between two buddy projections of a table.
3. Once you have fixed all impacted partitions in a table, the above query should return an empty result set.
4. Run the following query to leave a comment for future reference:
COMMENT ON TABLE <schema_name.table name> is 'abortrecovery was run by <TSE> on <date>';
Non-partitioned table:
- Create a new table and insert select data from the impacted fact table into a new table. The new table may have missing data, but there is no count mismatch between two buddy projections of a table.
- Drop the impacted table and rename the new table to the original name.