How to Rebuild Indexes

  • 7021034
  • 28-Jun-2017
  • 08-Jul-2019

Environment

Retain 1.0 - 4.0.3.1.
Retain 4.1 and newer has a built-in option to rebuild the indexes.  See this link for the latest documentation on how to use it.
You will find the option under the Server Configuration section.

Situation

You cannot find a specific message in the Retain archive in the search tab(s).  Searches don't return the full results that are expected.  Additional problems may include:
  • Cannot search contents of .doc or .pdf files (need to enable indexing of these prior to completing this article)
  • New indexes need to be created for attachment types that were not previously selected for indexing, and the customer wants this to apply to all archived mail from the beginning.
  • New installation from corrupt configuration.
  • Moved installation without moving the indexes.
  • Missing or corrupt index files.

Resolution

DISCLAIMER:
This knowledgebase (KB) article is provided for informational purposes only and as a courtesy service to you, our customer. GWAVA Technical Support does not have any database administration (DBA) expertise, nor does it provide DBA services or support. GWAVA is not responsible for the results of implementing any of the concepts contained in this KB article. Implementation of any of the concepts suggested in this KB article shall be done entirely at your own and sole risk, and GWAVA does not provide any kind of warranties whatsoever resulting from your decision of implementing any of the KB article’s concepts. It is up to you to do any research and to ensure yourself that any implementation and setup of any of the KB article’s concepts in your database system is correctly and properly executed. It is imperative that you have backups of your database system and storage directory before making any implementation. If you don’t have any DBA expertise, you should consult with a DBA expert before any implementation of the KB article’s concepts.  Under no circumstances, shall GWAVA, or any of its employees, be liable, in contract, tort, delict or otherwise, whether negligence is provable or not, for any direct, indirect, incidental, special, punitive, consequential or other damages, loss, cost or liability whatsoever that would result from or are related to the implementation of any of the concepts suggested in the KB article.

To the extent permitted by applicable law, GWAVA shall not be liable to you for any special, consequential, direct, indirect or similar damages, including any loss of data, arising out from migrating any type of messages, attachments, database, metadata in your Retain system to another server and/or location.

Resolution:
To alleviate the aforementioned symptoms, the simplest answer is to rebuild the index files. Because of the amount of time a rebuild can take, use it only as a last resort, after verifying that the indexes are truly malformed.  The indexing process will only index a MAXIMUM of 500,000 items per day in Retain 2.x.  This means that in large systems with several million messages, it could take several days, weeks, or months before the indexing process is up to date. This limitation does not apply to Retain 3 and later; nevertheless, it could still take days, weeks, or months depending on the number of messages that need to be indexed.
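
As a rough illustration of the Retain 2.x limit described above, the minimum rebuild time is the number of items divided by the daily cap, rounded up. The sketch below is only an estimate under that single assumption; `estimate_days` is a hypothetical helper name (not a Retain tool), and real duration also depends on hardware and message size.

```shell
#!/bin/sh
# Minimal sketch: lower bound on re-index duration under a per-day item cap.
# estimate_days is a hypothetical helper, not part of Retain.
estimate_days() {
    total_items=$1   # number of messages that must be indexed
    per_day=$2       # daily cap (500,000 in Retain 2.x, per this article)
    # Integer ceiling division: round any partial day up to a full day
    echo $(( (total_items + per_day - 1) / per_day ))
}
```

For example, `estimate_days 12000000 500000` prints 24, so a hypothetical archive of 12 million messages would need at least 24 days on Retain 2.x.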

Things to look for or try before starting this process:

  • Make sure the message(s) are within the date range that you've specified.
  • Reset your search so as to clear out anything unexpected. (Primarily an issue with Retain 3 and earlier.  Retain 4 no longer features the reset button.)
  • Verify that Retain was able to archive the messages that you're looking for (they show up in the Browse tab).
  • Verify that the message was indexed.  There is currently no KB article on the subject, but support can help you find out whether it has been indexed.

The impact to your users is that, until all items have been indexed, their searches will cover only what the index process has finished.  In other words, you cannot be certain that your results are complete until the process is finished.


Rebuilding the Index files:

In Retain 4.1 and newer check out the Server Configuration section of the Online Documentation.  For previous versions continue on.

1.  Note the Retain storage path.  This is specified in the RetainServer web interface under Server Configuration | Storage.

2.  If 4.0.2 or earlier is installed, go into Server Configuration | Index and make sure that Stream Size and File Size are set to -1, so the indexer can index as much of each message as it can.  More details as to what this means can be found in the documentation here and another article here.

3.  Shut down Retain Tomcat. For Retain 3.x and earlier, skip to the Legacy Instructions section for the next step.

4.  Make a backup of the ASConfig.cfg file, then edit it (default file locations can be found in this link):

A) Inside the ASConfig.cfg file, delete the <activeIndexEngines> and </activeIndexEngines> lines and everything between them (this block spans many lines).

B) Delete the line beginning with <frontEndSearchEngine

C) If you are running Retain 4.7 or newer, change these 2 fields from true to false:

<initDone>true</initDone>

<initIndex>true</initIndex>

D) Save your changes.
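
The edits in step 4 can also be sketched as a small shell function built on sed. This is an illustration, not a GWAVA-supplied tool: `reset_index_config` is a hypothetical name, and the sketch assumes each tag named above sits on its own line, as the wording of the steps implies. It writes a backup to ASConfig.cfg.bak first. Compare the result against the backup before starting Tomcat, because a missing closing </activeIndexEngines> tag would cause sed to delete through the end of the file.

```shell
#!/bin/sh
# Sketch only: applies steps A-C of this article to an ASConfig.cfg file.
# reset_index_config is a hypothetical helper name, not a Retain utility.
reset_index_config() {
    cfg=$1
    flip_init=${2:-no}   # pass "yes" only on Retain 4.7 or newer (step C)

    # Keep a backup next to the original before changing anything
    cp -p "$cfg" "$cfg.bak" || return 1

    # A) drop the <activeIndexEngines> ... </activeIndexEngines> block
    # B) drop the line beginning with <frontEndSearchEngine
    sed -e '/<activeIndexEngines>/,/<\/activeIndexEngines>/d' \
        -e '/^[[:space:]]*<frontEndSearchEngine/d' \
        "$cfg.bak" > "$cfg.tmp" || return 1

    if [ "$flip_init" = "yes" ]; then
        # C) flip initDone and initIndex from true to false
        sed -e 's|<initDone>true</initDone>|<initDone>false</initDone>|' \
            -e 's|<initIndex>true</initIndex>|<initIndex>false</initIndex>|' \
            "$cfg.tmp" > "$cfg" || return 1
    else
        cat "$cfg.tmp" > "$cfg" || return 1
    fi
    rm -f "$cfg.tmp"
}
```

It would be called with the path to the file, for example `reset_index_config /path/to/ASConfig.cfg yes` on Retain 4.7 or newer (the path here is a placeholder), followed by `diff /path/to/ASConfig.cfg.bak /path/to/ASConfig.cfg` to review what changed.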

5.  Rename the HPIRemoteConfig.cfg and hpi.keystore files:

Linux: /opt/beginfinite/retain/RetainServer/WEB-INF/solrweb/WEB-INF/cfg/
Windows: C:\Program Files\Beginfinite\Retain\RetainServer\WEB-INF\solrweb\WEB-INF\cfg\

Command Line Example (Linux):
cd /opt/beginfinite/retain/RetainServer/WEB-INF/solrweb/WEB-INF/cfg/
mv HPIRemoteConfig.cfg HPIRemoteConfig.cfg.old
mv hpi.keystore hpi.keystore.old

6.  Move everything from the solrhome directory under the storage path to a temporary location. Moving is safer than deleting: you can remove the old files once you know that the reindexing is working properly. If disk space is a concern, deleting the contents of the solrhome directory also works.

Command Line Example (Linux):

cd /retain_data/index
mkdir solrhome.temp   (skip this if the directory already exists)
mv solrhome/* solrhome.temp/
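
When deciding between keeping and deleting the old index files, a quick space check can help. The sketch below assumes, as a rule of thumb only, that the rebuilt index will be roughly as large as the old one; `has_room_to_keep_old_index` is a hypothetical helper name, and it uses only the standard du, df, and awk utilities.

```shell
#!/bin/sh
# Sketch only: succeeds if the filesystem holding the index has at least as
# much free space as the old index currently uses.
# Assumption: the rebuilt index will be about the same size as the old one.
has_room_to_keep_old_index() {
    dir=$1
    # Size of the existing index in KB
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    # Free space in KB on the filesystem containing it (POSIX df output)
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$used_kb" ]
}
```

With the example storage path used above, `has_room_to_keep_old_index /retain_data/index/solrhome && echo "enough room to keep the old index"` prints the message only when the check passes.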

7.  Reset the indexed items in the database

To accomplish this, open your database editor (MySQL, MS SQL, or Oracle). You will need to modify the t_message table by changing the f_indexed field value to 1.
Here is the MySQL command, which can be adjusted for MS SQL and Oracle accordingly.  Don't be alarmed if the query doesn't come back immediately.  There could be several million rows to change even in small environments, and more in large environments, so the update can take a long time.

Example query (MySQL/MSSQL):

update [database name].t_message set f_indexed = 1;
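
To see how many rows carry each f_indexed value before and after the update, a grouped count can be run against the same table. The sketch below only prints the SQL; `index_status_sql` is a hypothetical helper name, and the database name "retain" and the MySQL login in the commented example are assumptions to be replaced with your own.

```shell
#!/bin/sh
# Sketch only: print a query that counts t_message rows per f_indexed value.
# index_status_sql is a hypothetical helper, not part of Retain.
index_status_sql() {
    db=$1   # database name (site-specific)
    printf 'SELECT f_indexed, COUNT(*) FROM %s.t_message GROUP BY f_indexed;\n' "$db"
}

# Example with the MySQL command-line client (database name and user are
# assumptions; -p prompts for the password):
#   index_status_sql retain | mysql -u root -p
```

Running it before the update gives a baseline; running it afterward should show every row under the single value set by the update.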

8.  Start Retain Tomcat.

9.  Log into the Retain Web admin interface and start the index migration.

  • Enter Server Configuration
  • Click on the Index tab
  • Click on "Migrate to 4.0 Indexer"

  • Save your changes.
  • Enter Admin login credentials (the original admin is ideal)

  • Save changes again.

The reindexing process has now been started. Its progress can be monitored from this page by pressing the "Refresh Index Configuration" button.


Legacy Instructions (3.x and earlier):

The instructions are the same as steps 1 through 3 above. Then continue with step 4 below:

4.  Rename the index folder in the storage path.

Retain 2.x/3.x:  /[storage directory]/index

5.  Reset the indexed items in the database

To accomplish this, open your database editor (MySQL, MS SQL, or Oracle). You will need to modify the t_message table by changing the f_indexed field value to 0.

Here are the MySQL commands, which can be adjusted for MS SQL and Oracle accordingly.  Don't be alarmed if the query doesn't come back immediately.  There could be several million rows to change even in small environments, and more in large environments, so the update can take a long time.

Retain 2.x or earlier:

update [*database name].Email set indexed = 0;

Retain 3.x:

update [*database name].t_message set f_indexed = 0;

6.  Start Retain Tomcat.

Indexing should begin after starting Tomcat. You can see whether the indexer is running by using the Indexer Status utility under the About menu, or you can watch the Indexer log file until it stops recording new items.

Additional Information

This article was originally published in the GWAVA knowledgebase article ID 1142.