Performing Searches in Retain 3.x

  • 7020607
  • 08-Jan-2015
  • 30-Apr-2018

Environment

Situation

How do I use the Search feature in Retain?

Resolution


This article describes each facet of the Search screen and how it works in Retain 3.x.  We strongly recommend upgrading to the latest release of Retain 3.x to get the most up-to-date fixes to the Search feature.  The Search feature will undergo a major overhaul in Retain 4.0, which is scheduled for release in the first half of 2015.  It will have a Google-like searching mechanism that is powerful and very easy to use; however, the current searching mechanism will remain available in 4.0 as a "Legacy Search" option for those who prefer it.

IMPORTANT NOTE:  There is a minor bug in Retain 3.x searches that affects the date scope of your search.  If an item lands near the beginning or end of a day, it does not show up for that day even though your search range clearly includes that day.  For example, you can have an item dated 3/5/2015 with a delivered time of 5:07 AM.  A search using 3/5/2015 as the beginning of the date range does not return this item; however, if you move the beginning of the range back a day to 3/4/2015, the item appears in the search results and you will see that it was delivered on 3/5/2015 at 5:07 AM.  This issue will not be addressed in 3.x; however, it does not occur in our upcoming 4.0 version.  The workaround is to expand the beginning and ending dates of your search range by ONE DAY each way.  If you want to find all items for 3/5/2015, then your beginning date should be 3/4/2015 and your ending date 3/6/2015.
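The date workaround can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical and nothing here calls Retain itself. It widens the range by one day each way, then filters the returned items back down to the days that were actually wanted.

```python
from datetime import date, timedelta


def widen_date_range(start, end, pad_days=1):
    """Return (start, end) widened by pad_days on each side.

    Hypothetical helper: it only computes the dates to type into the
    Retain 3.x search screen; it does not talk to Retain.
    """
    if end < start:
        raise ValueError("end date must not be before start date")
    pad = timedelta(days=pad_days)
    return start - pad, end + pad


def within_wanted_days(delivered, start, end):
    """True if a returned item's delivered date is in the original range."""
    return start <= delivered <= end


# To find all items for 3/5/2015, search from 3/4/2015 to 3/6/2015.
search_start, search_end = widen_date_range(date(2015, 3, 5), date(2015, 3, 5))
```

Because the widened search also returns items from the neighboring days, check the delivered date of each result (as `within_wanted_days` does) before relying on it.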

Performing Searches

One of the most important features of Retain is the ability to search for messages that have been archived and stored within the system.

Searching as a User
The administrator has full access to all mailboxes and can search across the entire post office or domain. Users, by default, are restricted to searching or browsing their own mailbox. They can also forward and print messages; however, they cannot export, restore, or search in mailboxes other than their own. The administrator can grant users rights to export, search other users' mailboxes, and so forth as needed. This is done by going to the "Users or Groups" menu in the admin UI and adjusting the settings.

Basic Questions for Searching Messages

  • How do you search for messages within Retain?
  • What specific categories are there to be able to search messages? 
  • What is the difference between searching as an admin and searching as a regular user?

General Searching Explanation
When a message is archived into Retain, the message goes through a process called indexing. Retain’s main indexer is Lucene, a powerful engine that indexes message content, message metadata, and message attachments so that they can later be searched for and found.  If a message does not get indexed, it cannot be found by performing a search.  This does not mean the message has not been archived; in fact, the message can still be viewed in the Browse section of the Retain mailbox. See "Where Data Is Stored in Retain".

There are multiple methods of performing a search. The first method is called the Quick Find.

Quick Find

The Quick Find is not actually found in the Search tab within the Retain interface. Instead it is found in the Browse section for a user but - unlike everything else on the Browse tab - the Quick Find uses the indexes just like a search does.  This search is limited and can only be done on an individual mailbox. The Quick Find lets a user perform a search based on subject, content within a message or attachment, sender, recipient, etc.  Simply input text in the field and press Enter or click the magnifying glass to search.

Search Tab
Within the Search interface, an administrator - or user - can enter a term that will return results. Remember that indexing is important for accurate results when searching. The Search tab displays the type of e-mail, the subject of the e-mail, who the message was from, and the date, which is the delivered date generated by the e-mail system (not Retain).  Several things can be done with the messages returned in the results, such as forwarding, litigation hold, and exporting, but those features are not covered in this article.

One tip in viewing search messages is to make sure the view is set to a date range to display the messages correctly. See "Viewing Results" later in this article.

So how do you perform a search?

  • Select the term on which you wish to search (subject, sender(email), sender(display), etc.).
  • Select how you are going to search for it ("contains(exact)", "contains(fuzzy)", etc.).
  • Enter the keywords you'll search on.
  • Select whether to search all mailboxes or a selected one.
  • Adjust the scope: the item type, item source, and attachment size.
  • Sort the search results if desired (defaults to creation date).
  • Indicate any tags you want to include.
  • Filter search results by any miscellaneous criteria (appointment/task start and end/complete date, item status, litigation hold status, and confidential status).

Then click on

We'll go over each of these sections in more detail, starting with "Core".

Core

Basic Search

The core section of searching consists of the terms for which you are searching and the mailbox on which you'll perform the search.  By default, the basic search is displayed. The difference between basic and advanced searches is that basic does not include search parameters - it acts primarily as a Quick Find. Type the subject, text, sender, etc. in the field and that is it. The advanced search allows you to search specifically within an area of the message or attachment. For example, searching for a subject that contains exactly the term you enter will only display messages that contain that specific term in the subject field. A basic search could look in other areas as well and is not limited to - as in this example - the subject field.

Advanced Search

You can use up to 6 search terms when running an advanced search. Clicking on the expand button (the plus "+" sign) allows you to search for more than one entry at a time. This can help make the search more specific. When there is more than one search term, the search uses the "AND" boolean expression; thus, the message must contain all of the terms you indicate or nothing will be found. For example, if you search for a subject with "contains(exact)" and the term is "GWAVA", it will return all results with GWAVA in the subject. If you add another search term, "RETAIN" for instance, then the subject must contain GWAVA and RETAIN. If an item contains only one of those search terms but not the other, it will not be included in the search results. Each additional "AND" search will be more specific and limited to the terms you are searching for.
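The narrowing effect of stacked "AND" terms can be modeled with a small Python sketch. This is a simplified model, not Retain's code: the function name is hypothetical, and case-insensitive substring matching is an assumption made only to keep the example short.

```python
MAX_ADVANCED_TERMS = 6  # the article's limit on advanced search terms


def subject_matches_all(subject, terms):
    """True only if the subject contains every term (logical AND).

    Assumption: matching is modeled as case-insensitive substring
    matching, which may differ from how Retain's index behaves.
    """
    if not terms:
        raise ValueError("at least one search term is required")
    if len(terms) > MAX_ADVANCED_TERMS:
        raise ValueError("the advanced search allows at most 6 terms")
    subject_lower = subject.lower()
    return all(term.lower() in subject_lower for term in terms)


subjects = ["GWAVA release notes", "Retain upgrade", "GWAVA Retain training"]

# One term: every subject containing GWAVA.
one_term = [s for s in subjects if subject_matches_all(s, ["GWAVA"])]
# Two terms: only subjects containing both GWAVA and RETAIN.
two_terms = [s for s in subjects if subject_matches_all(s, ["GWAVA", "RETAIN"])]
```

Here `one_term` keeps two of the three subjects, while `two_terms` keeps only the one that contains both words.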

Tokenized Search Phrase

The Lucene indexing engine follows Unicode Standard Annex #29. This standard uses many common characters as phrase ending or beginning characters, including the following:  '  "  .  :  @  +  -  *  /  ,

These characters will cause the system to read any character-separated terms as individually entered items when the search is performed. Individually entered terms are treated as OR searches. For example, the search term 10/20 will be processed as 10 OR 20. Substituting a space for the character will provide a logical AND search; thus, searching for 10 20 will be processed as 10 AND 20.
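The following Python sketch mimics the behavior described here so you can predict how a typed term will be read. It is a rough model and does not use Lucene's tokenizer: the function name is hypothetical, and the handling of a term that mixes spaces and separator characters is an assumption, since the article does not cover that case.

```python
import re

# Separator characters listed in this article: ' " . : @ + - * / ,
SEPARATORS = "'\".:@+-*/,"
_SEPARATOR_PATTERN = re.compile("[" + re.escape(SEPARATORS) + "]")


def describe_query(term):
    """Describe how a typed search term would be read, per this article.

    Pieces split by a separator character are joined with OR; pieces
    split by a space are joined with AND. Parenthesizing mixed input
    is this sketch's assumption.
    """
    groups = []
    for chunk in term.split():
        pieces = [p for p in _SEPARATOR_PATTERN.split(chunk) if p]
        if pieces:
            groups.append(" OR ".join(pieces))
    if len(groups) > 1:
        groups = ["(" + g + ")" if " OR " in g else g for g in groups]
    return " AND ".join(groups)


print(describe_query("10/20"))                # 10 OR 20
print(describe_query("10 20"))                # 10 AND 20
print(describe_query("jdoe@xyzcompany.com"))  # jdoe OR xyzcompany OR com
```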

If you need to search for a special character (^  \  /  *  . ) literally, just type it in as part of your search criteria.  Special characters do not need to be "escaped" as they would be in regular expressions.

Question: If I need to search for jdoe@xyzcompany.com, Lucene would interpret that as jdoe OR xyzcompany OR com. But if I typed it as a regular expression (jdoe\@xyzcompany\.com), would Lucene interpret that as a proper e-mail address?

Answer:  No, there is no mechanism currently in Retain for escaping characters; however, the good news is that some limited regular expression usage will be available in Retain 4.0.

Boolean Search Using "OR"

Often users will want to do an OR search: for example, find messages whose subject contains gwava or contains retain.  Can this be done?

The answer is YES. Instead of adding more search terms, you can do this all in a single search term.  When searching using OR, simply put "||" (2 pipes) between the terms you are searching for.  For example, if you wanted to search for gwava OR retain, you would enter the search terms like this: gwava || retain. Any subject that contains gwava or retain will be displayed. The message needs to contain only one of the values.

Question:  Can boolean searches be grouped?

Answer:  No, unfortunately, not in Retain 3.x.  You can use the parenthesis character " ( " or " ) ", but the search engine will take those characters as literal search terms, looking for anything with a parenthesis in it.

Special Characters

Special characters are optional when doing a search in Retain, but they can still be used. One special character has already been covered, the double pipe: "||". This is an OR expression to search for content that can contain any of the terms you are searching for. Here are other special characters that can be used in Retain (remember, they are optional):

  • asterisk " * ":  Also known as the wildcard. Although you can leave the search field blank and get the same result as when using the asterisk alone, Retain still recognizes this as a wildcard.  It can be helpful for searching on everything within a particular value, for example *.pdf or *.domain.com.
  • &&:  This is another boolean expression and it means the same thing as AND. You can simply use this or expand to another search box and it will do the same thing.  If you want to search on more than 6 items, you can use the && in a single search term.
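If you assemble long OR or AND lists often, a small helper can build the text to paste into a single search field. This is a convenience sketch only: the function name is hypothetical and it produces nothing more than a string. It accepts a single operator per call because, as noted earlier, boolean searches cannot be grouped in Retain 3.x.

```python
def join_search_terms(terms, operator="||"):
    """Join keywords with "||" (OR) or "&&" (AND) for one search field."""
    if operator not in ("||", "&&"):
        raise ValueError('operator must be "||" or "&&"')
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty term is required")
    return (" " + operator + " ").join(cleaned)


# OR search in a single search term.
or_query = join_search_terms(["gwava", "retain"])

# AND search on more than 6 items in a single search term.
and_query = join_search_terms(
    ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf"], "&&"
)
```

With these inputs, `or_query` is the string `gwava || retain`, the same form used in the OR example earlier in this article.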


Search Terms

There is a drop down menu that provides the conditions for which you can search.



The search interface will remember the last search criteria used until the search is cleared by hitting the Reset button. By default, subject is the first thing that will be shown. Here is a list of the available search conditions with their descriptions:

  • Subject: This searches for words or characters that are only in the subject.
  • Sender (email): This searches for an e-mail address of the person who sent the e-mail (the "From" or "Sender").
  • Sender (display): This will search on the name that is displayed in the viewing pane for the message. This could be the e-mail address, but usually is the actual name of the sender.
  • Sender Domain: This searches only for the domain (@domain.com). The @ symbol is implied and is not needed to perform the search.
  • Attachment Name: This will search for the attachment name.  You can also include the type if you wish and it will bring up a list of all the types of the particular attachment (.pdf, .docx etc.).
  • Recipient: This searches for the person who has received the message ("To" or "Recipient").  This could be the name of the recipient or his/her e-mail address.
  • Recip. Domain: This searches only for the recipient's domain (@domain.com).
  • Mail Server: This is the name of the server (the e-mail system) from which the message was archived. It is found in the properties of the message under Additional Properties.
  • Messaging Domain: This is the domain through which the message was transferred before reaching the post office level.  The domain can be found in the properties of the message under Additional Properties.
  • Location: This is the post office from which the message was archived.
  • Internet Header: If the message was received from the internet, it will contain a mime.822 attachment. This search feature allows you to search within the internet header for any content such as addresses, IP addresses, servers, domains, or any other content.
  • Category:  This is the type of message.  If the message has a category, this can be searched. The category can only be searched when archived with the routing properties (see Profile | Miscellaneous within the Retain administration tool to enable that feature).
  • Message Contents: Anything within the message and attachment can be searched via this term.

The next drop-down for the search terms indicates how you want the search performed. The selections are as follows:

  • Contains (exact):  This allows you to enter partial words to find messages; however, the content must contain the terms you are searching for or nothing will be found. This is the most accurate way to perform a search in Retain.
  • Contains (fuzzy):  Similar to exact, it will find a message that contains any of the terms you are searching for. The difference is that you don’t have to be as exact in your search. Anything that resembles your keyword will be displayed. For example, a fuzzy search for "GWAVA" will return anything that looks like "GWAVA"; even if the text in a message reads "GWAVARETAIN" (all run together), the fuzzy search will find it. "Contains(exact)" will not find "GWAVARETAIN", as it is not exact, but the fuzzy search will, as it is more forgiving in its querying.  However, there are some issues with the fuzzy search, and its behavior in the current Lucene implementation was never clearly defined.  As a result, it can sometimes run indefinitely without returning a result.  Some customers prefer not to give users the "fuzzy contains" search option (see our KB, "Removing Search Terms and Conditions From the Retain Web Interface", on how to do this); or, if a search just seems to hang, they change it to Contains(exact).
    • NOTE: One customer is having great results with fuzzy searches and does not have the performance issues.  They have an unusually powerful Retain server, with 24 CPUs/cores and 32 GB of RAM.
  • Word starts with:  Words or phrases that begin with certain characters can be searched. This can be a quick way to type a few letters and find multiple entries that start with them.  This differs from Contains (fuzzy) and Contains (exact) in that it only shows messages with words that begin with those letters, while the others might show messages that contain those letters anywhere within the word.
  • Ends with:  Similar to Word starts with, except that it matches at the end of the word.
  • Does not contain:  This is for any item that does not contain the phrase or term you are looking for.
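The difference between "Word starts with" and "Ends with" can be illustrated with a simplified Python model. The function names are hypothetical, and splitting on whitespace with case-insensitive comparison is an assumption; Retain's index may split and compare words differently.

```python
def _words(text):
    """Split text into lowercase words (whitespace splitting is assumed)."""
    return text.lower().split()


def word_starts_with(text, prefix):
    """True if any word in the text begins with the given letters."""
    return any(word.startswith(prefix.lower()) for word in _words(text))


def ends_with(text, suffix):
    """True if any word in the text ends with the given letters."""
    return any(word.endswith(suffix.lower()) for word in _words(text))


subject = "Quarterly report attached"

starts_rep = word_starts_with(subject, "rep")    # "report" begins with "rep"
starts_port = word_starts_with(subject, "port")  # no word begins with "port"
ends_port = ends_with(subject, "port")           # "report" ends with "port"
```

In this model the letters "port" match under "Ends with" but not under "Word starts with", even though both conditions look at the same word.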

If you have trouble searching in one category or location, try another.

Mailboxes

The second part of the Core section is selecting the mailboxes in which you wish to perform your search. There are only two options:

  • All Mailboxes
  • Currently Selected 

All Mailboxes will obviously search the content in all mailboxes in the system. To select individual mailboxes, click the Select button and select the mailboxes in which you wish to search. Make sure to select the appropriate radio button on the left of the mailbox name for what you need.


Scope
The scope is as its label suggests - it is the scope of the item search.  Do you want only mail items?  Only those that were sent?  You get the idea.

Check the boxes for exactly what you want to search on. For Attachment Size, select in the drop down the size of attachment you are looking for. By default, everything is unchecked, which means that Retain will search on everything.


Sort

The sort will display the results based on the criteria you specify here. You can sort by the creation date, delivered date, subject, domain, sender, etc.

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 2436.