Environment
Novell Open Enterprise Server 11 (OES 11) Linux Support Pack 2
Situation
gmetad error messages flood the /var/log/messages file.
Sample error message:
Mar 16 15:53:18 nts153 /opt/novell/ganglia/monitor/sbin/gmetad[21667]: RRD_update (/var/opt/novell/ganglia/rrds/Grid-Node/nts153.lab.novell.com/cpu_idle.rrd): /var/opt/novell/ganglia/rrds/Grid-Node/nts153.lab.novell.com/cpu_idle.rrd: illegal attempt to update using time 1426546398 when last update time is 1426546398 (minimum one second step)
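To gauge how many RRD files are affected, the messages can be counted per file. The following is a minimal sketch, not part of the original procedure; count_rrd_update_errors is a hypothetical helper name, and it assumes the messages have the same layout as the sample above.

```shell
# Count gmetad "illegal attempt to update" messages per RRD file.
# count_rrd_update_errors is a hypothetical helper; the log file defaults
# to /var/log/messages and can be overridden with the first argument.
count_rrd_update_errors() {
    logfile="${1:-/var/log/messages}"
    grep 'gmetad\[' "$logfile" \
        | grep 'illegal attempt to update' \
        | sed -n 's/.*RRD_update (\([^)]*\)).*/\1/p' \
        | sort | uniq -c | sort -rn
}
```

Calling count_rrd_update_errors with no argument reads /var/log/messages and lists the most frequently reported RRD files first.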
Resolution
Steps:
1. Correct any wrong entries in the /etc/hosts file and in the DNS records.
2. a. Stop the gmetad and gmond services on every affected node:
rcnovell-gmond stop
rcnovell-gmetad stop
b. Delete the directory structures under /var/opt/novell/ganglia/rrds/:
rm -r /var/opt/novell/ganglia/rrds/Grid-Node
rm -r /var/opt/novell/ganglia/rrds/__SummaryInfo
c. Start the services again. They repopulate these directories.
rcnovell-gmond start
rcnovell-gmetad start
3. Check /var/log/messages to confirm that the gmetad messages do not return.
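Step 2 can be sketched as a pair of shell functions. This is an illustration only; clear_rrd_dirs and reset_ganglia_rrds are hypothetical names, and the sketch assumes that every directory directly under the RRD root (not only Grid-Node and __SummaryInfo) should be removed.

```shell
# Remove every directory directly under the RRD root (for example
# Grid-Node and __SummaryInfo). The root is a parameter so the function
# can be tried on a scratch directory first.
clear_rrd_dirs() {
    rrd_root="${1:-/var/opt/novell/ganglia/rrds}"
    [ -d "$rrd_root" ] || { echo "no such directory: $rrd_root" >&2; return 1; }
    # -mindepth 1 keeps the root itself; -maxdepth 1 limits to direct children.
    find "$rrd_root" -mindepth 1 -maxdepth 1 -type d -exec rm -r {} +
}

# Step 2 as one sequence: stop the services, clear the RRDs, start again.
reset_ganglia_rrds() {
    rcnovell-gmond stop
    rcnovell-gmetad stop
    clear_rrd_dirs /var/opt/novell/ganglia/rrds || return 1
    rcnovell-gmond start
    rcnovell-gmetad start
}
```

Run reset_ganglia_rrds as root on each affected node, after the /etc/hosts and DNS entries have been corrected.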
Cause
The reverse DNS entry for a particular node was wrong, or
the /etc/hosts file contained the wrong IP address for a node's host name.
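One way to look for such entries in a hosts file is to list every name that maps to more than one IP address. The following is a rough sketch; find_conflicting_hosts is a hypothetical helper name, and the check covers only the hosts file, not DNS.

```shell
# List host names that appear with more than one IP address in a hosts file.
# find_conflicting_hosts is a hypothetical helper; the file defaults to
# /etc/hosts and can be overridden with the first argument.
find_conflicting_hosts() {
    hosts_file="${1:-/etc/hosts}"
    # Strip comments, then emit one "name ip" pair per name on each line.
    awk '{ sub(/#.*/, "") }
         NF >= 2 { for (i = 2; i <= NF; i++) print tolower($i), $1 }' "$hosts_file" \
        | sort -u \
        | awk '{ ips[$1] = ips[$1] " " $2; n[$1]++ }
               END { for (h in n) if (n[h] > 1) print h ":" ips[h] }' \
        | sort
}
```

Names such as localhost are commonly listed for both 127.0.0.1 and ::1, so they can show up in the output without indicating a problem.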
Additional Information
Configuring Ganglia on OES Documentation
Troubleshooting:
Use the "date -d @<timevalue>" command to convert the time value from the message into a readable time stamp and check whether it lies in the future.
Example: date -d @1426546398
Mon Mar 16 16:53:18 MDT 2015
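Both time values can also be pulled out of the messages and converted in one pass. This is a sketch under assumptions: rrd_error_times is a hypothetical helper name, it expects messages in the layout of the sample error message, and it relies on GNU date for the -d option. Times are printed in UTC so the result does not depend on the local time zone.

```shell
# Read gmetad error messages on standard input and print the attempted
# and last update times in readable form (UTC).
# rrd_error_times is a hypothetical helper; it requires GNU date.
rrd_error_times() {
    sed -n 's/.*update using time \([0-9][0-9]*\) when last update time is \([0-9][0-9]*\).*/\1 \2/p' \
        | while read -r attempted last; do
            printf 'attempted: %s  last: %s\n' \
                "$(date -u -d "@$attempted" '+%Y-%m-%d %H:%M:%S UTC')" \
                "$(date -u -d "@$last" '+%Y-%m-%d %H:%M:%S UTC')"
        done
}
```

For example: grep 'illegal attempt to update' /var/log/messages | rrd_error_times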
If the time stamp is identical to the last update time, there are most likely multiple entries for this node, or multiple hosts with the same IP address. gmetad receives UDP packets from other hosts and does a reverse DNS lookup on them. Use the netcat and nslookup commands on the local host to check whether a host appears twice in the output. You can redirect the output to a file, or pipe it to less, to read through it.
Example: netcat localhost 8651 | less
Example: nslookup <ip address>
Make sure that multiple IP addresses do not resolve to the same host name.
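Instead of reading the netcat output by eye, the repeated host names can be filtered out. The following is a minimal sketch; duplicate_ganglia_hosts is a hypothetical helper name, and it assumes that the XML output on port 8651 describes each host with an element that begins with <HOST NAME="...">.

```shell
# Read gmetad XML on standard input and print each host name that
# occurs more than once.
# duplicate_ganglia_hosts is a hypothetical helper; the <HOST NAME="...">
# element layout is an assumption about the XML output.
duplicate_ganglia_hosts() {
    grep -o '<HOST NAME="[^"]*"' \
        | sed 's/^<HOST NAME="//; s/"$//' \
        | sort | uniq -d
}
```

For example: netcat localhost 8651 | duplicate_ganglia_hosts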