Filr Content Indexing search not finding text at end of large documents

  • 7024863
  • 14-Oct-2020
  • 19-Oct-2020

Environment

Micro Focus Filr

Situation

When Content Indexing is enabled and very large files are indexed, text in the latter part of those files is not searchable.

Resolution

WARNING:
Enabling Content Indexing requires additional resources (CPU, RAM, and disk space). Micro Focus advises caution when enabling this feature. The resolution steps below require even more resources than enabling Content Indexing alone.

Remove the cache file store:
Issue this command in a terminal session (for example, PuTTY):
rm /vashare/cachefilestore
Adjust the Lucene values:
Add the following lines to the /opt/novell/filr/apache-tomcat/webapps/ssf/WEB-INF/classes/config/ssf-ext.properties file:
lucene.max.fieldlength=1000000
lucene.max.booleans=1000000
doc.max.text.extraction.size.threshold=10485760
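
The property changes above can also be applied from the command line. The following is a minimal sketch, not an official Micro Focus tool: set_prop is a hypothetical helper that replaces an existing key=value line or appends one if the key is absent, so running it twice does not create duplicate entries. The backup step and the idea of scripting the edit are assumptions; editing the file by hand works equally well.

```shell
#!/bin/sh
# Hypothetical helper (not part of Filr): set key=value in a properties file.
# Replaces an existing line for the key, or appends a new line if none exists.
set_prop() {
    sp_file=$1
    sp_key=$2
    sp_value=$3
    [ -f "$sp_file" ] || { echo "no such file: $sp_file" >&2; return 1; }
    # Escape dots so the key is matched literally by grep and sed.
    sp_pattern=$(printf '%s' "$sp_key" | sed 's/\./\\./g')
    if grep -q "^${sp_pattern}=" "$sp_file"; then
        sp_tmp=$(mktemp)
        sed "s/^${sp_pattern}=.*/${sp_key}=${sp_value}/" "$sp_file" > "$sp_tmp"
        # Write back with cat so the original ownership and mode are kept.
        cat "$sp_tmp" > "$sp_file"
        rm -f "$sp_tmp"
    else
        # Add a newline first if the file does not already end with one.
        if [ -n "$(tail -c 1 "$sp_file")" ]; then
            printf '\n' >> "$sp_file"
        fi
        printf '%s=%s\n' "$sp_key" "$sp_value" >> "$sp_file"
    fi
}

# Example usage on the appliance (commented out; back up the file first):
# PROPS=/opt/novell/filr/apache-tomcat/webapps/ssf/WEB-INF/classes/config/ssf-ext.properties
# cp -p "$PROPS" "$PROPS.bak"
# set_prop "$PROPS" lucene.max.fieldlength 1000000
# set_prop "$PROPS" lucene.max.booleans 1000000
# set_prop "$PROPS" doc.max.text.extraction.size.threshold 10485760
```

Replacing rather than blindly appending avoids leaving two conflicting entries for the same key in the file.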

Additional Information

The default values for the above three parameters are:
lucene.max.fieldlength=100000
lucene.max.booleans=100000
doc.max.text.extraction.size.threshold=1048576
Note that each value in the suggested resolution is the default with one '0' appended, that is, ten times the default. If text at the end of files still cannot be found after following the Resolution steps above, increase the values further by appending another '0' to each.
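
As an illustration of the scaling rule, the sketch below prints the three property lines for a given number of appended zeros, starting from the defaults listed above. scaled_props is a hypothetical helper name introduced here only for illustration. An argument of 1 reproduces the values in the Resolution section, and 2 gives the next step up.

```shell
#!/bin/sh
# Hypothetical helper: print the three properties with n zeros appended
# to the default values (that is, defaults multiplied by 10^n).
scaled_props() {
    n=$1
    factor=1
    i=0
    while [ "$i" -lt "$n" ]; do
        factor=$((factor * 10))
        i=$((i + 1))
    done
    printf 'lucene.max.fieldlength=%s\n' "$((100000 * factor))"
    printf 'lucene.max.booleans=%s\n' "$((100000 * factor))"
    printf 'doc.max.text.extraction.size.threshold=%s\n' "$((1048576 * factor))"
}

# Example: scaled_props 2 prints the values for a second increase.
```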