Text search in MongoDB using C#

During recent prototype development we found we needed a decent search solution - normally this is right where we would turn to Lucene.Net. Lucene.Net is great, but it does carry some code overhead (managing indexes etc.), so the fact that we were already using MongoDB, and that it had just introduced a beta text search feature (as of version 2.4), seemed too good to overlook!

Upgrading to the latest version of MongoDB was simple and totally issue free, and the instructions for enabling the feature and setting up the required indexes at http://docs.mongodb.org/manual/core/text-search/ were very clear. All that was required was to start the mongod process with the text search parameter enabled (mongod --setParameter textSearchEnabled=true) and then create the indexes, which for my requirements was simple from the console:

    db.Activity.ensureIndex(
        {
            Title: "text",
            Description: "text",
            AlsoKnownAs: "text",
            Keywords: "text"
        },
        {
            name: "ActivityFullTextIndex"
        }
    )

The above creates a text index named ActivityFullTextIndex on my collection (Activity), covering the Title, Description, AlsoKnownAs and Keywords properties.
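If you would rather create the index from code than from the console, something along these lines should work. This is only a sketch, assuming the legacy 1.x C# driver (where EnsureIndex, IndexKeysDocument and IndexOptions live); the ActivityIndexSetup class and its method name are made up for illustration and are not part of the prototype.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

// Hypothetical helper, not part of the original prototype.
public static class ActivityIndexSetup
{
    public static void EnsureTextIndex(MongoDatabase database)
    {
        // Untyped handle on the same collection the console command targets.
        var collection = database.GetCollection("Activity");

        // Same key document as the console version: each field is indexed as "text".
        var keys = new IndexKeysDocument
        {
            { "Title", "text" },
            { "Description", "text" },
            { "AlsoKnownAs", "text" },
            { "Keywords", "text" }
        };

        // Give the index the same explicit name as the console version.
        collection.EnsureIndex(keys, IndexOptions.SetName("ActivityFullTextIndex"));
    }
}
```

Text search still has to be enabled on the server before the index can be built, so this does not replace the startup parameter.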

So after the indexes were created using the console, all that remained was actually using the search feature.

There is no direct implementation (yet) in the official C# driver, so it requires calling the command directly. Reading the unit tests shows just how easy this is. The search operation in all its glory (not that much glory to be fair):

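    using System.Collections.Generic;
    using System.Linq;
    using MongoDB.Driver;

    // Member of a class that holds _database (a MongoDatabase instance).
    public IEnumerable<T> Search<T>(string search) where T : class, new()
    {
        // The collection name is taken from the type name, e.g. Activity.
        var textSearchCommand = new CommandDocument
            {
                { "text", typeof(T).Name },
                { "search", search }
            };
        var commandResult = _database.RunCommandAs<TextSearchCommandResult<T>>(textSearchCommand);

        // Highest score (best match) first.
        return commandResult.Ok
            ? commandResult.Results.OrderByDescending(t => t.score).Select(t => t.obj)
            : null;
    }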

The command document targets the collection named after the supplied generic type T and carries the search term. Using _database (an instance of MongoDatabase) we run the command with RunCommandAs, returning the results in a TextSearchCommandResult (coming soon). In this prototype code, if the command result is Ok we return the result objects, ordered by score, and these are then passed on to render the search results.
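To see what the server actually sends back before wrapping it in a typed result, the same command can be run untyped and the raw response walked by hand. The following is a rough sketch rather than the prototype's code: the connection string, the database name "prototype" and the search term are assumptions, and the optional "limit" field is as I understand the 2.4 text command documentation.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class TextSearchSketch
{
    public static void Main()
    {
        // Assumption: a local mongod started with text search enabled,
        // and a database called "prototype" holding the Activity collection.
        var database = new MongoClient("mongodb://localhost:27017")
            .GetServer()
            .GetDatabase("prototype");

        var command = new CommandDocument
        {
            { "text", "Activity" },   // collection to search
            { "search", "swimming" }, // search string (made-up example term)
            { "limit", 10 }           // optional cap on the number of results
        };

        CommandResult result = database.RunCommand(command);

        // Each entry in "results" is a document with a "score" and the matched "obj".
        foreach (BsonValue row in result.Response["results"].AsBsonArray)
        {
            BsonDocument match = row.AsBsonDocument;
            BsonDocument activity = match["obj"].AsBsonDocument;
            Console.WriteLine("{0:F2}  {1}", match["score"].ToDouble(), activity["Title"]);
        }
    }
}
```

Working against the raw BsonDocument like this is handy for poking around, but the typed result class keeps the calling code much cleaner.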

So you are now thinking that TextSearchCommandResult must be really complicated, ‘cos the search bit was a doddle, right?

    using System.Collections.Generic;
    using System.Linq;
    using MongoDB.Bson.Serialization;
    using MongoDB.Driver;

    public class TextSearchCommandResult<T> : CommandResult
    {
        // Deserializes each entry of the "results" array in the raw command response.
        public IEnumerable<TextSearchResult<T>> Results
        {
            get
            {
                var rows = this.Response["results"].AsBsonArray.Select(row => row.AsBsonDocument);

                return rows.Select(row => BsonSerializer.Deserialize<TextSearchResult<T>>(row));
            }
        }
    }

    // Property names match the field names ("obj", "score") in each result row.
    public class TextSearchResult<T>
    {
        public T obj { get; set; }
        public double score { get; set; }
    }

Wrong. The CommandResult base class does the heavy lifting. All that is done here is to deserialize the returned objects into a simple TextSearchResult wrapper (simply to include the score for ordering – there is more info returned, but this prototype only needed the score).
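If you do want some of that extra info, the same trick of reading from Response should extend to it. A minimal sketch, assuming the response carries a "language" field and a "stats" sub-document with "nfound" and "timeMicros" (that is how I recall the 2.4 text command output, so check it against a real response first); the TextSearchStatsResult class is hypothetical and not part of the driver or the prototype.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical companion to TextSearchCommandResult<T>; not part of the driver.
public class TextSearchStatsResult : CommandResult
{
    // Language the server applied to the search string, e.g. "english".
    public string Language
    {
        get { return Response["language"].AsString; }
    }

    // Number of documents matched, read from the "stats" sub-document.
    public int Found
    {
        get { return Stats["nfound"].ToInt32(); }
    }

    // Server-side execution time in microseconds.
    public long TimeMicros
    {
        get { return Stats["timeMicros"].ToInt64(); }
    }

    // Assumed field name; throws if the response has no "stats" document.
    private BsonDocument Stats
    {
        get { return Response["stats"].AsBsonDocument; }
    }
}
```

It would be used the same way as the typed result above, by passing it as the type argument to RunCommandAs.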

Pretty quick to get text search up and running. Clearly this is still in beta, and doesn’t have the depth of the Lucene.Net implementation yet. Definitely one to keep an eye on though.
