Environment
ZENworks Configuration Management 2020
Vertica
Situation
When Vertica is started, the server console shows:
host (10.10.10.10) report: Startup Failed, ASR Required
Database zenworks did not start successfully: Something Failed
Using the 'force' option (-F, --force) does not resolve the issue:
/opt/vertica/bin/admintools -t start_db -d zenworks -F
Starting nodes:
v_zenworks_node0001 (10.10.10.10)
Starting Vertica on all nodes. Please wait, database with a large catalog may take a while to initialize.
Node Status: v_zenworks_node0001: (DOWN)
...
Found these errors in the startup.logs on hosts:
host (10.10.10.10) report: Startup Failed, ASR Required
Database zenworks did not start successfully: Something Failed
Error in: /var/opt/novell/log/zenworks/startup.log
"node" : "v_zenworks_node0001",
"stage" : "Startup Failed, ASR Required",
"text" : "Node Dependencies:\n1 - cnt: 95\n\n1 - name: v_zenworks_node0001\nNodes certainly in the cluster:\n\tNode 0(v_zenworks_node0001), epoch 408254\nFilling more nodes to satisfy node dependencies:\nData dependencies fulfilled, remaining nodes LGEs don't matter:\n--",
"timestamp" : "2020-08-20 10:40:28.214"
Messages in: /var/opt/novell/log/zenworks/vertica.log
2020-08-20 10:38:37.006 Spread Service InOrder Queue:0x7f2846599700 [VMPI] <INFO> Cluster is being formed; we are invited
...
2020-08-20 10:38:37.481 Init Session:0x7f2844d96700-a00000008d8501 [Txn] <INFO> Begin Txn: a00000008d8501 'runLoadBalancePolicy'
2020-08-20 10:38:37.481 Init Session:0x7f2845d98700 [Session] <INFO> Load balance request from client address 10.10.10.10 had decision: Classic load balancing considered, but either the policy was NONE or no target was available. Details: [NONE or invalid]
2020-08-20 10:38:37.481 Init Session:0x7f2845d98700 <LOG> @v_zenworks_node0001: 00000/5789: Connection load balance request refused by server
2020-08-20 10:38:37.481 Init Session:0x7f2840d92700 <LOG> @v_zenworks_node0001: 00000/2705: Connection received: host=10.10.10.10 port=56678 (connCnt 8)
2020-08-20 10:38:37.481 Init Session:0x7f2840d92700 <LOG> @v_zenworks_node0001: 00000/5998: Received connection load balance request
2020-08-20 10:38:37.482 Init Session:0x7f2845597700-a00000008d8502 [Txn] <INFO> Begin Txn: a00000008d8502 'runLoadBalancePolicy'
2020-08-20 10:38:37.845 Init Session:0x7f2840d92700 <FATAL> @v_zenworks_node0001: {SessionRun} 57V03/4149: Node startup/recovery in progress. Not yet ready to accept connections
LOCATION: initSession, /data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_2_1-x_grader/build/vertica/Session/ClientSession.cpp:556
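To see which stage the node last reported without reading the whole file, the "stage" entries can be pulled out of startup.log. The following is a minimal sketch that assumes the entry layout shown above; the helper name last_startup_stage is hypothetical and not part of Vertica or ZENworks.

```shell
#!/bin/sh
# Print the most recent "stage" value recorded in a Vertica startup.log.
# Assumes entries look like:  "stage" : "Startup Failed, ASR Required",
# last_startup_stage is an illustrative helper name, not a Vertica tool.
last_startup_stage() {
  sed -n 's/.*"stage"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | tail -n 1
}

# Usage (path taken from this article):
#   last_startup_stage /var/opt/novell/log/zenworks/startup.log
```

If the printed stage is "Startup Failed, ASR Required", the node is in the state this article describes.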
Resolution
An abnormal shutdown may have prevented the LGE (Last Good Epoch) from being written.
To confirm the node(s) in the cluster:
vsql -c "SELECT node_name FROM nodes;"
To check the epoch:
vsql -c "SELECT current_epoch, ahm_epoch, last_good_epoch, refresh_epoch FROM system;"
Example output from checking epoch:
 current_epoch | ahm_epoch | last_good_epoch | refresh_epoch
---------------+-----------+-----------------+---------------
           148 |       146 |             146 |            -1
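The distance between current_epoch and last_good_epoch can be read off this table by hand, or computed with a small filter. This is a sketch only: it assumes the four-column layout produced by the query above, and the function name epoch_gap is hypothetical.

```shell
#!/bin/sh
# Read the vsql table for the epoch query on stdin and print
# current_epoch - last_good_epoch for the first data row.
# epoch_gap is an illustrative name; the column order is assumed to be
# current_epoch | ahm_epoch | last_good_epoch | refresh_epoch.
epoch_gap() {
  awk -F'|' '
    # A data row has four fields and starts with an integer;
    # the header and the dashed separator line do not match.
    NF == 4 && $1 ~ /^[ \t]*-?[0-9]+[ \t]*$/ {
      print ($1 + 0) - ($3 + 0)
      found = 1
      exit
    }
    END { if (!found) exit 1 }
  '
}

# Usage:
#   vsql -c "SELECT current_epoch, ahm_epoch, last_good_epoch, refresh_epoch FROM system;" | epoch_gap
```

For the example output above this prints 2. The number gives only a rough idea of how far the last good epoch trails the current one.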
Another way to check the epoch:
/opt/vertica/bin/admintools -t return_epoch -d zenworks
Resolve the issue by restarting the database from the last good epoch:
/opt/vertica/bin/admintools -t restart_db -d zenworks -p <vertica_passwd_zman_srvgc> -e last
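Where the restart is scripted, it may be worth guarding it so that it only runs when startup.log actually shows the ASR stage. The wrapper below is a sketch under assumptions, not part of the documented procedure: the function name restart_from_lge and the environment variables DB_NAME, STARTUP_LOG, ADMINTOOLS and DB_PASSWORD are hypothetical, and the defaults are the paths and database name used in this article.

```shell
#!/bin/sh
# Restart the database from the last good epoch, but only if the startup
# log reports "ASR Required". All variable names here are illustrative.
restart_from_lge() {
  db_name="${DB_NAME:-zenworks}"
  startup_log="${STARTUP_LOG:-/var/opt/novell/log/zenworks/startup.log}"
  admintools_bin="${ADMINTOOLS:-/opt/vertica/bin/admintools}"

  # Refuse to run without a password rather than pass an empty -p value.
  if [ -z "${DB_PASSWORD:-}" ]; then
    echo "DB_PASSWORD is not set; not restarting." >&2
    return 1
  fi

  # Guard: do nothing unless the log shows the failure this article covers.
  if ! grep -q 'ASR Required' "$startup_log" 2>/dev/null; then
    echo "No 'ASR Required' entry in $startup_log; not restarting." >&2
    return 1
  fi

  # Same command as in the article, with the values filled in.
  "$admintools_bin" -t restart_db -d "$db_name" -p "$DB_PASSWORD" -e last
}

# Usage:
#   DB_PASSWORD='...' restart_from_lge
```

Take the backup described under Additional Information before running anything like this.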
Additional Information
Before attempting any operation, first back up the Vertica database:
https://www.novell.com/documentation/zenworks-2020/zen_vertica/data/zen_vertica.html#t4aasjx343xn
Make a copy of the file system and/or take a snapshot (if virtual).
admintools command line options:
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/AdminTools/WritingAdministrationToolsScripts.htm
vsql tool:
Reported to Engineering.