Friday, January 8, 2016

Sitecore Search - Part 1 - Introduction

Using a search-based approach instead of native APIs is a great way to improve the experience users have with your Sitecore application, for performance and usability, among other reasons. LINQ to Provider is the search API piece of LINQ to Sitecore that allows you to develop search-centered solutions in Sitecore.

LINQ to Provider uses LINQ query syntax to abstract your code, and your focus, away from the provider-specific query parser. For a developer, this means flexibility and efficiency: you can write search code for a Lucene-based project and move it to a SOLR-based project with little modification, and you don’t have to construct provider-specific queries yourself.

Sitecore built upon the existing methods of the IQueryable interface and also provided additional extensions, making it easy for anyone who has worked with LINQ to write search code. To help explain their direction with the LINQ to Provider approach, the Sitecore Dev Team wrote a good blog article. Though the article is more focused on PredicateBuilder, it’s still a good read if you want to get a better understanding of their vision.

LINQ to Provider is the newer approach for search solutions in Sitecore powered applications, and since more information online pertains to Lucene, or doesn’t direct you from start to finish, I am going to try to provide more detailed insight into some of the LINQ to Provider components using a SOLR implementation. I won’t go into every specific piece, but I will focus on the primary pieces that will help you understand how to develop with it. Additionally, this article will be the start of a series that will get into more detail, and some slightly more complicated topics, with the aim of assisting you from start to finish with Sitecore ContentSearch.

The Basic Methods

LINQ to Provider implements many of the methods already available in the IQueryable interface and includes extensions to simplify functionality. A listing of the implemented IQueryable methods is available in the “Developer’s Guide to Item Buckets and Search v75” available from the Sitecore SDN, and while the document title might not indicate complete relevance to using and working with the API, it’s worth reading for some additional information about LINQ to Provider.

Some of the methods in the list are more important or notable than others, and because there can still be confusion about when and how to use them, especially ones that seem similar in functionality, I have created the following list to help define and clarify their use.
  • Where: This is the primary method of searching for the occurrence of terms or phrases. Using Where translates to a standard query “q” for SOLR and calculates the rank, or score, of each result.[pre class="brush:csharp;class-name:'inline';gutter:false;toolbar:false;" title="example"]query.Where(i => i.Title == "Test");[/pre]
  • Filter: This is best used for exact matches, or category & tag style queries, aka filtering. Filter translates to a "fq" parameter for SOLR, does not affect result rank, and each filter is cached in SOLR's filterCache.[pre class="brush:csharp;gutter:false;class-name:'inline';toolbar:false;" title="example"]query.Filter(i => i.Category == "Testing");[/pre]
  • Equals: This creates a text match query and wraps your string inside quotes when multiple terms are used. [pre class="brush:csharp;class-name:'inline';gutter:false;toolbar:false;" title="example - generates ‘_name:Sitecore’ for SOLR"]Item.Name.Equals("Sitecore");[/pre]
  • Contains: This translates to a wildcard match for SOLR and wraps your string with asterisks whether it’s a single term or phrase.[pre class="brush:csharp;gutter:false;class-name:'inline';toolbar:false;" title="example - generates ‘_name:*Sitecore*’ for SOLR"]Item.Name.Contains("Sitecore");[/pre]
  • Page: This allows you to specify which page of results to return and the number of results per page. It’s a shortcut equivalent of using LINQ Skip and Take.[pre class="brush:csharp;gutter:false;class-name:'inline';toolbar:false;" title="example"]query.Page(currentPage, pageSize);[/pre]
  • Boost: This extension can be applied to the string for your search value(s) to give a multiplier of importance to affect score based on that value.[pre class="brush:csharp;gutter:false;class-name:'inline';toolbar:false;" title="example - increases the ranking by 2"]Item.Name.Contains("Sitecore").Boost(2f);[/pre]
  • FacetOn: This tells SOLR what fields to provide facets for. This does not actually restrict the results based on a facet value, as you will do that with a Where, or, preferably, a Filter query.[pre class="brush:csharp;gutter:false;class-name:'inline';toolbar:false;" title="example"]query.FacetOn(i => i.Title);
    // or
    query.FacetOn(i => i["Title"]);[/pre]
  • GetResults: It’s kind of a given, but this one actually executes the query against SOLR and returns the results. It returns a SearchResults object with three properties:
    • Hits: The collection of results for your current page, each wrapping the matched document and its score. This is different from TotalSearchResults as it only pertains to your current page. If your query returns 235 items and your page size is 20, Hits will contain 20 items.
    • TotalSearchResults: The total number of matching results across all pages. If your query returns 235 items and your page size is 20, this will be 235.
    • Facets: This is a FacetResults object containing all the facet categories based on your FacetOn values. Each Category object includes a Name and Values property.
  • GetFacets: This returns only the facets without the results. This differs from the GetResults().Facets collection, which returns the complete facet list from the query.
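
To make the list above more concrete, here is a minimal sketch that combines Where, Filter, FacetOn, and GetResults, then reads the facet values back. It assumes the default "sitecore_web_index" index and facets on the TemplateName property of SearchResultItem purely for illustration. The FacetSketch class and its method name are hypothetical, and the SOLR parameters mentioned in the comments are approximations rather than exact output.
[pre class="brush:csharp;toolbar:false;"]
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.SearchTypes;

// Hypothetical helper class, for illustration only.
public static class FacetSketch
{
    public static List<string> GetTemplateFacets()
    {
        var lines = new List<string>();
        using (var context = ContentSearchManager.GetIndex("sitecore_web_index").CreateSearchContext())
        {
            var results = context.GetQueryable<SearchResultItem>()
                .Where(i => i.Name.Contains("Sitecore"))  // scored; sent as the main query (q)
                .Filter(i => i.Language == "en")          // not scored; sent as a filter query (fq)
                .FacetOn(i => i.TemplateName)             // asks SOLR for facet counts on this field
                .GetResults();

            // Each category corresponds to one FacetOn call.
            foreach (var category in results.Facets.Categories)
            {
                foreach (var value in category.Values)
                {
                    lines.Add(string.Format("{0}: {1} ({2})", category.Name, value.Name, value.AggregateCount));
                }
            }
        }
        return lines;
    }
}
[/pre]
Note that FacetOn only requests the counts. Narrowing the results to a chosen facet value would still be done with an additional Filter (or Where) on the same field.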

Putting it to Use

Talking about methods, functionality, and APIs can only help someone’s understanding so far, so I have assembled the following code sample and steps to provide guidance to getting started with LINQ to Sitecore.

Step 1 - Create your Context

The first step is kind of a two-parter. You must set a search context for creating queries, but you have to define the index you will be querying in order to create the context. You create the search context using the CreateSearchContext method of the ContentSearchManager, and there are a couple ways to use this while defining the index:

The first approach uses the GetIndex method to reference the search index using a string parameter. With this approach, you will always be querying the specified index no matter what database you are browsing in Sitecore. This means that whether you are logged in to Sitecore browsing the Master database, or Web, the index you query will always be "Web" if that's the string you provide:
[pre class="brush:csharp;toolbar:false;"]
using (var context = Sitecore.ContentSearch.ContentSearchManager.GetIndex("sitecore_web_index").CreateSearchContext())
{
    /* [search code here] */
}
[/pre] The second approach uses an Item to reference the search index. This method is useful if you want the query to be contextual to the Sitecore Context. This means you will query the Master index if you are logged in and browsing the Master DB, but will query Web if you are not logged in:
[pre class="brush:csharp;toolbar:false;"]
using (var context = Sitecore.ContentSearch.ContentSearchManager.CreateSearchContext(new Sitecore.ContentSearch.SitecoreIndexableItem(Sitecore.Context.Item)))
{
    /* [search code here] */
}
[/pre]

Step 2 - Build your Query

The query is built inside the “using” statement from the search context object, and defines the result item type. All search expressions are created on the search query using a LINQ syntax. The following example searches for items with “search” in the Name, existing in English:
[pre class="brush:csharp;toolbar:false;"]
var query = context.GetQueryable<SearchResultItem>().Where(i => i.Name.Contains("search")).Filter(i => i.Language == "en");
[/pre] One thing you might notice is the use of Where and Filter in this query. I will break down the differences and use cases for each in an upcoming post.

Step 3 – Set your Paging and Order

By default, Sitecore specifies a “rows” parameter of 500 on a query if you don’t specify paging parameters. This tells SOLR to return up to 500 results at a time, which can really impact performance. The Page extension allows you to specify the page to return and the per-page result count. The page index is zero-based, so the first page is page 0. It’s also worth noting that paging should be done through the query expression and not after getting the results. Adding the paging parameters to the query offloads pagination to SOLR and improves performance. The example below requests the first page and returns a max of 20 items:
[pre class="brush:csharp;toolbar:false;"]
query = query.Page(0, 20);  // page index is zero-based: 0 is the first page
[/pre] For result ordering, Sitecore does not specify a default, and instead returns results from SOLR by the default order of ranking, or scored relevance. This is typically the desired order for actual web-searches, because you want the most relevant results first. However, if you are using search to display product listings, for instance, you might want to order by a title or name field. Use the OrderBy method for ascending sort (A-Z, 1-100), and OrderByDescending for descending sort (Z-A, 100-1):
[pre class="brush:csharp;toolbar:false;"]
query = query.OrderBy(i => i.Name);  // ascending name
query = query.OrderByDescending(i => i.Name);  // descending name
[/pre]
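
Since Page is described above as a shortcut for Skip and Take, the arithmetic can be checked without Sitecore at all. The sketch below runs against a plain in-memory list and assumes Page treats its first argument as a zero-based page index (my understanding of the extension, but worth verifying against your Sitecore version). PageOf is a hypothetical stand-in, not the Sitecore method itself.
[pre class="brush:csharp;toolbar:false;"]
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Hypothetical stand-in for the Page extension, assuming a zero-based page index.
    public static IEnumerable<T> PageOf<T>(IEnumerable<T> source, int page, int pageSize)
    {
        return source.Skip(page * pageSize).Take(pageSize);
    }

    public static void Main()
    {
        var all = Enumerable.Range(1, 235).ToList();  // pretend 235 matching results
        const int pageSize = 20;

        // Round up so a partial last page is counted.
        int totalPages = (all.Count + pageSize - 1) / pageSize;

        var first = PageOf(all, 0, pageSize).ToList();
        var last = PageOf(all, totalPages - 1, pageSize).ToList();

        Console.WriteLine(totalPages);   // 12
        Console.WriteLine(first.Count);  // 20
        Console.WriteLine(last.Count);   // 15
        Console.WriteLine(last.First()); // 221
    }
}
[/pre]
The same rounding-up calculation can be applied to TotalSearchResults when you need to render pager links.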

Step 4 – Get and Display Results

Up to this point, all the code above has done is build a query expression. Until GetResults is called, SOLR is not queried. Once GetResults is called, you are working with the SearchResults object, not the query. To access the actual results, you will reference the Hits property of the SearchResults, and the TotalSearchResults property will provide the total count of results. This code gets the results and DataBinds to a repeater:
[pre class="brush:csharp;toolbar:false;"]
var results = query.GetResults();
if (results.Hits.Any()) {
    litSearchResults.Text = results.TotalSearchResults.ToString();  // Text expects a string
    rptSearchResults.DataSource = results.Hits;
    rptSearchResults.DataBind();
}
[/pre]
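
If you need more than data binding, each entry in Hits wraps the indexed document together with its score. The sketch below assumes a SearchResults object obtained from GetResults with a result type of SearchResultItem. The HitSketch class and Describe method are hypothetical names used only for this example.
[pre class="brush:csharp;toolbar:false;"]
using System.Collections.Generic;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.SearchTypes;

// Hypothetical helper class, for illustration only.
public static class HitSketch
{
    public static List<string> Describe(SearchResults<SearchResultItem> results)
    {
        var lines = new List<string>();
        foreach (var hit in results.Hits)
        {
            // Document is the typed result; Score is the relevance score from the provider.
            SearchResultItem doc = hit.Document;
            lines.Add(string.Format("{0} (score {1})", doc.Name, hit.Score));
        }
        return lines;
    }
}
[/pre]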

Putting it all Together

[pre class="brush:csharp;toolbar:false;"]
using (var context = Sitecore.ContentSearch.ContentSearchManager.CreateSearchContext(new Sitecore.ContentSearch.SitecoreIndexableItem(Sitecore.Context.Item))) {
    var query = context.GetQueryable<SearchResultItem>().Where(i => i.Name.Contains("search")).Filter(i => i.Language == "en");
    query = query.OrderBy(i => i.Name).Page(0, 20);  // chaining is supported; order first, then take the first page (zero-based)
    var results = query.GetResults();
    if (results.Hits.Any()) {
        litSearchResults.Text = results.TotalSearchResults.ToString();  // Text expects a string
        rptSearchResults.DataSource = results.Hits;
        rptSearchResults.DataBind();
    }
}
[/pre]

Closing

While there are more complicated techniques and topics involving Sitecore's ContentSearch and LINQ to Provider, this information and example are designed to provide a general understanding of, and some context for, Sitecore's LINQ to Provider approach. Future articles will cover topics like the difference between Where and Filter, faceting, and complex queries and tips, but the information above should provide enough to get you started.
