As many of you know, Fishbowl is a Mindbreeze Certified Partner and search appliance reseller. A core part of our company culture is using the same tools and technologies we implement for our customers. For that reason, and to give readers like you a chance to see Mindbreeze in action, we have implemented Mindbreeze search here on fishbowlsolutions.com. Read on to learn more about the benefits and details of this integration.
Indexing Our Site
The first step in our Mindbreeze integration project was to configure Mindbreeze to crawl our website using the out-of-the-box web crawler. We decided to split the content into two groups, blog posts and everything else, so that we could configure how blog post content is indexed separately from the rest of the site. Mindbreeze allows the configuration of one or more crawler instances, so we created two crawlers with separate follow and do-not-follow patterns to index each content group.
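To make the two-crawler split concrete, here is a minimal Python sketch of how follow and do-not-follow patterns can route a URL to one crawler or the other. This is not Mindbreeze configuration syntax. The Crawler class, the crawler_for function, and the example.com URL patterns are all hypothetical stand-ins for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Crawler:
    """A crawler instance with its own follow / do-not-follow regex patterns."""
    name: str
    follow: List[str]
    do_not_follow: List[str] = field(default_factory=list)

    def accepts(self, url: str) -> bool:
        followed = any(re.search(p, url) for p in self.follow)
        excluded = any(re.search(p, url) for p in self.do_not_follow)
        return followed and not excluded


# Hypothetical patterns: one crawler for blog posts, one for everything else.
CRAWLERS = [
    Crawler("blog", follow=[r"^https://www\.example\.com/blog/"]),
    Crawler(
        "site",
        follow=[r"^https://www\.example\.com/"],
        do_not_follow=[r"^https://www\.example\.com/blog/"],
    ),
]


def crawler_for(url: str) -> Optional[str]:
    """Return the name of the first crawler that would index the URL, if any."""
    for crawler in CRAWLERS:
        if crawler.accepts(url):
            return crawler.name
    return None
```

Because the "site" crawler excludes the blog path, each page falls into exactly one content group, which is what lets the two groups be configured differently.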
Next, we configured the extraction of content from the site. By default the crawler indexes the entire contents of a page, but Mindbreeze can optionally restrict content indexing to a specific DIV or section. That way, words contained in your navigation or footer won’t be indexed for every page. For example, Fishbowl’s footer currently includes the word “Mindbreeze”, but when site users search for “Mindbreeze” we don’t want to return every page on the site, only those actually related to Mindbreeze. For customers already using googleon/googleoff tags for this purpose (a feature from the Google Search Appliance), Mindbreeze can interpret those tags. We have a few spots on our blog where these tags were used to keep blog sidebars and other non-content elements within a page template out of the index.
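The effect of googleoff/googleon tags can be sketched in a few lines of Python. This illustrates the idea only and is not how Mindbreeze implements it: the strip_unindexed function is a hypothetical name, and the sketch handles just the "index" variant of the comment tags.

```python
import re

# Matches everything between <!--googleoff: index--> and <!--googleon: index-->.
GOOGLE_OFF_ON = re.compile(
    r"<!--\s*googleoff:\s*index\s*-->.*?<!--\s*googleon:\s*index\s*-->",
    re.DOTALL | re.IGNORECASE,
)


def strip_unindexed(html: str) -> str:
    """Remove regions the page author marked as not to be indexed."""
    return GOOGLE_OFF_ON.sub("", html)


page = (
    "<p>Search tips for your intranet</p>"
    "<!--googleoff: index--><footer>Mindbreeze partner</footer><!--googleon: index-->"
)
indexable = strip_unindexed(page)
```

Here the footer text is dropped before indexing, so a search for “Mindbreeze” would not match this page on the strength of its footer alone.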
We also configured metadata extraction from within the blog posts themselves. This was done by telling Mindbreeze (via XPath selectors) where in the DOM the blog post author, category, and feature image data could be located. Again, this was all accomplished without altering the structure of the site itself or requiring additional work on the part of our site’s contributors. If you have standard HTML meta tags within your pages, Mindbreeze will index these automatically.
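As a rough illustration of selector-based metadata extraction, the Python sketch below pulls author, category, and feature image values out of a small page fragment. The markup, class names, and the extract_metadata function are hypothetical and do not reflect our actual templates or the Mindbreeze configuration. It also uses the standard library's ElementTree, which supports only a subset of XPath and requires well-formed markup, so real-world HTML would need a more tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed blog post fragment.
PAGE = """<html><body>
<article>
  <h1>Search Tips</h1>
  <span class="author">Jane Doe</span>
  <a class="category">Search</a>
  <img class="feature-image" src="/img/search-tips.png" />
  <p>Body text.</p>
</article>
</body></html>"""

# Metadata field name -> XPath-style selector for an element's text.
TEXT_SELECTORS = {
    "author": ".//span[@class='author']",
    "category": ".//a[@class='category']",
}


def extract_metadata(xhtml: str) -> dict:
    """Return the metadata found at the configured locations in the page."""
    root = ET.fromstring(xhtml)
    metadata = {}
    for field_name, path in TEXT_SELECTORS.items():
        node = root.find(path)
        if node is not None and node.text:
            metadata[field_name] = node.text.strip()
    # The feature image lives in an attribute rather than in element text.
    image = root.find(".//img[@class='feature-image']")
    if image is not None:
        metadata["feature_image"] = image.get("src")
    return metadata
```

The selectors live in configuration rather than in the page, which is why no template changes are needed.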
As part of the index setup, we configured entity recognition to parse our pages (both blog and non-blog) for the names of the five key technologies Fishbowl works with. This was done using the Mindbreeze entity extraction feature. Each of the five possible values was mapped to a metadata field called Technology. As with the metadata extraction, the entities were extracted without having to change anything about the structure of our site or templates.
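Conceptually, this kind of entity recognition amounts to scanning page text for a fixed list of names and recording the matches in a metadata field. The Python sketch below shows the idea under stated assumptions: the TECHNOLOGIES list is a placeholder (not our actual five values), and extract_technologies is a hypothetical helper rather than a Mindbreeze API.

```python
import re
from typing import List

# Placeholder values for illustration only.
TECHNOLOGIES = ["Mindbreeze", "WebCenter", "SharePoint"]

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in TECHNOLOGIES) + r")\b",
    re.IGNORECASE,
)
# Map any casing found in the text back to the canonical spelling.
_CANONICAL = {name.lower(): name for name in TECHNOLOGIES}


def extract_technologies(text: str) -> List[str]:
    """Return the sorted, de-duplicated values for the Technology field."""
    found = {_CANONICAL[m.group(1).lower()] for m in _PATTERN.finditer(text)}
    return sorted(found)
```

For example, a page mentioning both “mindbreeze” and “SharePoint” would get a Technology field containing those two canonical values, which a filter widget could then use.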
Between the time a user enters a query and the time the search engine computes relevant results, there is a critical step in the search process often referred to as query expansion. Query expansion describes the various ways in which the words the user types can be expanded upon or “understood” by the search engine in order to better represent the original intent and locate the right content. One way queries can be expanded is through the use of synonyms. Synonyms can be used to set related terms equal to one another, make abbreviations equal to their full meanings, or set legacy terminology as synonymous with current nomenclature. Mindbreeze query expansion is used on this site to expand queries such as “Jobs” to include “Careers” and the legacy product name “UCM” to search for the new name, “WebCenter Content”.
Mindbreeze also includes default stemming and spelling expansions so that users can find content even if their query doesn’t exactly match our site’s data. For example, stemming allows users to search for “orders” and get results containing “order”, “ordered”, and “ordering”. Users don’t have to know whether a word appears in past tense, plural, or singular form in order to find what they need.
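The Python sketch below shows one simple way synonym and stemming expansion can work. It is an illustration only: the suffix-stripping stem function is a deliberately naive stand-in for a real stemmer, the SYNONYMS table just mirrors the two examples above, and none of the names are Mindbreeze APIs.

```python
from typing import List

# Synonym table mirroring the examples in the text (keys are lowercase).
SYNONYMS = {
    "jobs": ["careers"],
    "ucm": ["webcenter content"],
}

# Naive suffix list; a production stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "s")


def stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(query: str) -> List[str]:
    """Return the original terms plus their stems and synonyms, in order."""
    terms: List[str] = []
    for word in query.lower().split():
        for candidate in [word, stem(word), *SYNONYMS.get(word, [])]:
            if candidate not in terms:
                terms.append(candidate)
    return terms
```

With this sketch, “orders”, “ordered”, and “ordering” all reduce to the same stem, and a query for “UCM” also searches for “webcenter content”.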
Relevancy and Result Boosting
Relevancy boosting (also called biasing) allows administrators to further tune result ranking based on factors such as metadata values, URL patterns, or date. These relevancy adjustments can be applied to specific sites, so that each audience sees what is most relevant to them. Relevancy is configured through the Mindbreeze Management Center without requiring custom development. On our site, the number of blog posts far outweighs the number of product pages; when someone searches for a product (such as Mindbreeze) we want the first result to be the main Mindbreeze product page. To ensure the main product pages (which may be older and contain fewer words than our latest blog posts) remain at the top, we can use Mindbreeze boosting to either increase the relevancy of product pages or decrease the relevancy of blog posts. All things being equal, it is better to down-boost less relevant content than to up-boost relevant content. We added a rule to reduce the relevancy of all blog post content by a factor of 0.75. We also boost our featured results by a factor of 10 to ensure they appear on top when relevant. In addition to manual tuning, Mindbreeze monitors and analyzes click patterns to learn from user behavior and improve relevancy over time.
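To show how the two factors interact, here is a minimal Python sketch that treats each boost as a multiplier on a base relevancy score. That multiplicative model is an assumption made for illustration; how Mindbreeze combines boosts internally is not described here. The Result class, the URLs, and the base scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

BLOG_FACTOR = 0.75      # down-boost applied to blog posts
FEATURED_FACTOR = 10.0  # up-boost applied to featured results


@dataclass
class Result:
    url: str
    base_score: float
    is_blog: bool = False
    is_featured: bool = False


def boosted_score(result: Result) -> float:
    """Apply the boost factors as multipliers (an assumed model)."""
    score = result.base_score
    if result.is_blog:
        score *= BLOG_FACTOR
    if result.is_featured:
        score *= FEATURED_FACTOR
    return score


def rank(results: List[Result]) -> List[Result]:
    """Order results from highest to lowest boosted score."""
    return sorted(results, key=boosted_score, reverse=True)


product = Result("/products/mindbreeze", base_score=1.0)
blog = Result("/blog/post", base_score=1.2, is_blog=True)
featured = Result("/featured", base_score=0.2, is_featured=True)
ordered_urls = [r.url for r in rank([blog, product, featured])]
```

In this example the blog post starts with a higher base score than the product page, but the 0.75 factor drops it below the product page, and the featured result rises to the top despite its low base score.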
Creating the Search Results Page
The search results page used on this site was created using the Mindbreeze Search App Designer. This builder provides a drag-and-drop interface for creating modular, mobile-friendly search applications. Mindbreeze also provides a JSON API for fully custom search page development.
Our search app combines a list-style results widget and three filter widgets to limit the results based on Technology, Blog Post Category, and Blog Post Author. The filter widgets available within the builder are determined by the metadata available via the indexing configuration described earlier.
To personalize our search app, we made several modifications to the mustache templates which control the rendering of the various widgets. For example, we only show dates on blog posts and include the “blog post” callouts next to blog post titles.
Once the structure of the search app was complete, we were able to use the export snippet functionality to copy the search app code from the Mindbreeze Management Center and embed it in a div within our site. To make the Mindbreeze search app match the look and feel of the rest of the website, we added a custom CSS file that overrides some of the standard Mindbreeze CSS within the search app.
Search Box Integration & Suggestions
To integrate Mindbreeze with our existing website’s search box, we modified the search input in the site header to direct search form submissions to the new Mindbreeze search results page. Since we are using WordPress, this involved modifying the header.php file within our site’s child theme. We also added a call to the Mindbreeze Suggest API, displayed using jQuery autocomplete, in order to provide search suggestions as you type. Most WCM systems have template files that can be modified to integrate Mindbreeze search into existing site headers. Our customers have similar integrations within Adobe Experience Manager and Oracle WebCenter Portal, to name a few.
As a note for those familiar with WordPress, we could have customized the search.php template to include the Mindbreeze Export Snippet instead of creating a new search results page. We wanted to let our contributors edit the heading and call-to-action sections of our search results page without coding, so we built the search results into a standard WordPress page. This also allowed us to keep the core WordPress search page intact for comparison purposes (we are in the search business after all). From a technical perspective, either approach would have worked.
We wanted to share the details of our integration to give anyone using or considering Mindbreeze an in-depth look at a real, working search integration. The architecture and approach we took here can be applied to other platforms, both internal and external-facing, including SharePoint, Oracle WebCenter, or Liferay. Use the search box at the top of the page to try it for yourself. If you have any questions about Mindbreeze search integration options, please contact us.