Text search in MongoDB using C#

During a recent prototype development we found we needed a decent search solution. Normally this is right where we would turn to Lucene.Net. Lucene.Net is great, but does have some code overhead associated with it (managing indexes etc.), so the fact that we were already using MongoDB, and that it had just introduced a beta text search feature (as of version 2.4), seemed too good to overlook!

Upgrading to the latest version of MongoDB was simple and totally issue free, and the instructions in http://docs.mongodb.org/manual/core/text-search/ for enabling the feature and setting up the required indexes were very clear. All that was required was to start the mongod process with a parameter enabling text search (in 2.4, --setParameter textSearchEnabled=true) and then create the indexes, which for my requirements was simple from the console:

db.Activity.ensureIndex(
    {
        Title: "text",
        Description: "text",
        AlsoKnownAs: "text",
        Keywords: "text"
    },
    {
        name: "ActivityFullTextIndex"
    }
)

The above creates a text index named ActivityFullTextIndex for my collection (Activity) on the Title, Description, AlsoKnownAs and Keywords properties.

So after the indexes were created using the console, all that remained was actually using the search feature.

There is no direct implementation (yet) in the official C# driver, so it requires calling the command directly. Reading the driver's unit tests shows just how easy this is. The search operation in all its glory (not that much glory to be fair):

public IEnumerable<T> Search<T>(string search) where T : class, new()
{
    // The "text" command takes the collection name; here it is assumed to match the type name.
    var textSearchCommand = new CommandDocument
        {
            { "text", typeof(T).Name },
            { "search", search }
        };
    var commandResult = _database.RunCommandAs<TextSearchCommandResult<T>>(textSearchCommand);

    // Highest score first, so the most relevant matches lead the results.
    return commandResult.Ok ? commandResult.Results.OrderByDescending(t => t.score).Select(t => t.obj) : null;
}

The command document is created to search the collection identified by the supplied generic type T and is supplied a search term. Using _database (an instance of MongoDatabase) we run the command using RunCommandAs returning the results in TextSearchCommandResult (coming soon). In this prototype code if the command result is Ok we return the result objects – ordered by score – obviously this is passed on to render the search result.

So you are now thinking that TextSearchCommandResult must be really complicated, ‘cos the search bit was a doddle right:

public class TextSearchCommandResult<T> : CommandResult
{
    public IEnumerable<TextSearchResult<T>> Results
    {
        get
        {
            // Each entry in "results" is a document holding the score and the matched object.
            var resultObjects = this.Response["results"].AsBsonArray.Select(row => row.AsBsonDocument);

            return resultObjects.Select(row => BsonSerializer.Deserialize<TextSearchResult<T>>(row));
        }
    }
}

public class TextSearchResult<T>
{
    public T obj { get; set; }
    public double score { get; set; }
}

Wrong. Using the CommandResult base class the heavy lifting is done. All that is done here is to deserialize the identified objects into a simple TextSearchResult wrapper (simply to include the score for ordering – there is more info returned, but this prototype only needed the score).

Pretty quick to get text search up and running. Clearly this is still in beta, and doesn’t have the depth of the Lucene.Net implementation yet. Definitely one to keep an eye on though.
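
As an aside on the command document used above: as I understand the 2.4 documentation, the text command also accepts optional fields such as limit to cap the number of results. A minimal sketch, assuming the same driver types as the search method; the TextSearchCommands class and its Build method are made-up names for illustration, not part of the driver:

```csharp
using MongoDB.Driver;

public static class TextSearchCommands
{
    // Builds a "text" command document with an optional result cap.
    // collectionName, search and limit are illustrative parameters.
    public static CommandDocument Build(string collectionName, string search, int limit)
    {
        return new CommandDocument
        {
            { "text", collectionName },
            { "search", search },
            { "limit", limit }
        };
    }
}
```

The returned document would be passed to RunCommandAs in exactly the same way as the one built inline in the search method.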

Adoption of test driven development

When promoting TDD to teams of developers or even individuals you often get arguments against adoption – I put this down largely to resistance to change – after all users hate change!

In this sort of situation you can adopt many approaches, from the extremes of yelling “just do it” if you are their manager to “just walking away” and letting them carry on doing what they are used to if it’s not your team. I am not proud to say that over the years I have tried both extreme approaches, surprisingly neither of which was terribly effective, and have also tried various flavours of the bits in the middle. The most success in achieving adoption has been when the benefits are made (or become) clear to the developer; I have found that once one team member groks it, more often than not they become an evangelist, and before long the whole team sees the benefits.

A LOT of the arguments against can be boiled down to “writing tests as well will take me longer to develop” and this is my favourite to dispel! In these cases I like to walk through an example, more often than not a simple conversation will suffice, but on particularly stubborn examples a pair style code walk through can work really well.

I think the key is getting the example context understandable to the team or developer you are working with, for example pick a change they have recently made rather than some ridiculous canned example. As we are talking through I like to try to get the developer to turn his analytical mind on his own development process - it’s something I think as a profession we need to continually do, analyse the “how” you are doing stuff as well as the “what” you are doing.

The true light-bulb moments, when the developer realises that he can do this so much more effectively, often come when a user interface is involved. First make the developer see that his process is: write some new code to make a change, compile, run (possibly through the UI) to the point where his code might get executed, step through in a debugger, rinse and repeat ad nauseam. Then make him see that the “run through the UI to the point where his code might get executed” step has to be repeated many times and actually takes a significant amount of time – even if it is just two button clicks! Hopefully, before you have to say it, they will already see that they could optimize out that step!

One of the other things to remember is patience or “baby steps”, don’t push too hard too fast. If they only realise they can be more efficient during development just by writing tests to execute their code, let them practice this – highlight other areas for improvement and benefits they could expect, but remember that if you get too preachy with the gospel according to TDD then there is the definite possibility of finding yourself back at square 1 with resistance to change.

Export to Excel xlsx from ASP.NET MVC

First some background. During a recent project we had built an ASP.NET MVC 3 application that allowed users to display lists of data filtered by search criteria. It was all pretty standard stuff: controller actions taking search parameters, requesting data from a repository and passing this data as model content in a ViewResult for display. We had a fair number of these actions defined when the customer requested the capability to download the result lists in Excel format for offline analysis. So we wanted to come up with a solution that re-used the existing actions, with minimal impact.

Firstly we built an ActionResult that would return the model data in an xlsx format. This was actually easier than I expected thanks to the Open XML SDK. The solution was really cheap:

public class DownloadViewAsExcelResult : PartialViewResult
{
    public DownloadViewAsExcelResult(string viewName, object model)
    {
        base.ViewName = viewName;
        base.ViewData.Model = model;
    }

    public override void ExecuteResult(ControllerContext context)
    {
        StringBuilder builder = new StringBuilder();
        StringWriter writer = new StringWriter(builder);

        ViewEngineResult result = null;
        if (View == null)
        {
            result = FindView(context);
            View = result.View;
        }

        ViewContext viewContext = new ViewContext(context, View, ViewData, TempData, writer);
        View.Render(viewContext, writer);

        XDocument format = XDocument.Load(new StringReader(builder.ToString()));
        Stream xlsxStream = new SpreadsheetBuilder().FromFormatXml(format);

        WriteFile(context.HttpContext, xlsxStream);

        if (result != null)
            result.ViewEngine.ReleaseView(context, View);
    }

    private static void WriteFile(HttpContextBase context, Stream content)
    {
        context.Response.Clear();
        context.Response.AddHeader("content-disposition", "attachment;filename=download.xlsx");
        context.Response.Charset = "";
        context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
        context.Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
        content.CopyTo(context.Response.OutputStream);
        context.Response.End();
    }
}

It extends the PartialViewResult to allow location of a named partial view; this view renders the model as an xml document structured into a known format so that it can be easily built into a spreadsheet and written to the response. The re-use of the view engine and Razor was pragmatic – it seemed overkill to add anything else! The format used in this case was:

<book>
    <sheet name="mandatory sheet tab name" header="optional header and footer text">
        <row>
            <cell>@Html.DisplayFor(m => m.Property)</cell>
        </row>
    </sheet>
</book>

Really simple – a book element containing sheets, which in turn contain rows with cells of data. With this format the spreadsheet builder just reads the xml, writing the output to an Open XML SDK SpreadsheetDocument like so:

public class SpreadsheetBuilder
{
    public Stream FromFormatXml(XDocument format)
    {
        MemoryStream stream = new MemoryStream();
        using (SpreadsheetDocument document = SpreadsheetDocument.Create(stream, SpreadsheetDocumentType.Workbook))
        {
            WorkbookPart workbookpart = document.AddWorkbookPart();
            workbookpart.Workbook = new Workbook();
            document.WorkbookPart.Workbook.AppendChild<Sheets>(new Sheets());

            var sheets = from element in format.Elements("book").Elements("sheet") select element;
            foreach (var element in sheets)
            {
                AddWorksheet(document, element);
            }
        }
        stream.Position = 0;

        return stream;
    }

    private void AddWorksheet(SpreadsheetDocument document, XElement sheetFormat)
    {
        SheetData sheetData = BuildSheetData(sheetFormat);

        WorksheetPart worksheetPart = document.WorkbookPart.AddNewPart<WorksheetPart>();

        worksheetPart.Worksheet = new Worksheet(sheetData);

        XAttribute headerAttribute = sheetFormat.Attribute("header");
        if (headerAttribute != null)
            worksheetPart.Worksheet.AppendChild<HeaderFooter>(CreateHeaderFooter(headerAttribute.Value));

        Sheets sheets = document.WorkbookPart.Workbook.Descendants<Sheets>().First();

        XAttribute nameAttribute = sheetFormat.Attribute("name");
        Sheet sheet = new Sheet()
        {
            SheetId = (UInt32)(sheets.Count() + 1),
            Id = document.WorkbookPart.GetIdOfPart(worksheetPart),
            Name = (nameAttribute == null) ? "Sheet " + (sheets.Count() + 1) : nameAttribute.Value
        };
        sheets.AppendChild(sheet);
    }

    private HeaderFooter CreateHeaderFooter(string message)
    {
        HeaderFooter header = new HeaderFooter();
        OddHeader oddHeader = new OddHeader();
        oddHeader.Text = "&C" + message;
        OddFooter oddFooter = new OddFooter();
        oddFooter.Text = "&C" + message;

        header.AppendChild<OddHeader>(oddHeader);
        header.AppendChild<OddFooter>(oddFooter);

        return header;
    }

    private SheetData BuildSheetData(XElement sheetFormat)
    {
        SheetData sheetData = new SheetData();

        int rowIndex = 0;
        var rows = from element in sheetFormat.Elements("row") select element;
        foreach (var rowElement in rows)
        {
            rowIndex++;
            Row row = new Row() { RowIndex = (UInt32)rowIndex };
            var cells = from element in rowElement.Elements("cell") select element;
            foreach (var cellElement in cells)
            {
                Cell c = new Cell { DataType = CellValues.InlineString };
                InlineString inlineString = new InlineString();
                Text t = new Text { Text = cellElement.Value };
                inlineString.AppendChild(t);
                c.AppendChild(inlineString);

                row.AppendChild(c);
            }

            sheetData.AppendChild(row);
        }

        return sheetData;
    }
}

Now all that was needed was a mechanism to use this result – for this we chose to add an ActionFilterAttribute to each action supporting excel download. This attribute just checks for the existence of a format value equal to “excel”, replacing the result with an instance of our DownloadViewAsExcelResult with view name changed to read from Export sub folder in views when found.

public class ExcelViewDownloadAttribute : ActionFilterAttribute
{
    public string ExportViewName { get; set; }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        base.OnActionExecuted(filterContext);

        object model = filterContext.Controller.ViewData.Model;
        if (model == null)
            return;

        ValueProviderResult value = filterContext.Controller.ValueProvider.GetValue("format");
        if (value != null && value.AttemptedValue.Equals("excel", StringComparison.InvariantCultureIgnoreCase))
        {
            var exportView = GetExportViewName(filterContext.ActionDescriptor.ActionName);
            DownloadViewAsExcelResult result = new DownloadViewAsExcelResult(exportView, model);
            filterContext.Result = result;
        }
    }

    private string GetExportViewName(string actionName)
    {
        if (string.IsNullOrEmpty(ExportViewName))
            ExportViewName = actionName;

        return "Export/" + ExportViewName;
    }
}

This attribute is then added (along with export view) to all actions requiring excel download support:

[ExcelViewDownload(ExportViewName = "Index")]
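
To show the attribute in context, here is a minimal sketch of a decorated action. The ActivityController name, the search parameter and the placeholder model are all made up for illustration; the real actions pulled their data from a repository:

```csharp
using System.Collections.Generic;
using System.Web.Mvc;

public class ActivityController : Controller
{
    // Hypothetical action. View(model) sets ViewData.Model on the controller,
    // which is what the filter reads once the action has executed.
    [ExcelViewDownload(ExportViewName = "Index")]
    public ActionResult Index(string search)
    {
        IList<string> model = new List<string> { "placeholder row for " + search };
        return View(model);
    }
}
```

With this in place a normal request renders the usual Index view, while the same request with format=excel added (on the query string, for example) should swap in the Export/Index partial view and come back as download.xlsx.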

Setting cursor in Bing Maps AJAX control (v7.0)

I have been massively sporadic blogging recently – no excuses – I just have…

I have been using v7.0 of the Bing Maps AJAX control on a project and wanted to set the cursor on mouse hover of a Pushpin. The idea was that as we had changed the default marker to a different size, and had a click event to show details, we needed some way to give the user feedback that the pin was clickable. Pretty standard stuff. Cool thing is that this version made this a total doddle… Short post then!

Looking at the rendered source, the ‘map’ element has a MicrosoftMap class and sets the cursor as a style on this element:

<div class="MicrosoftMap" style="z-index: 0; overflow-x: hidden; overflow-y: hidden; 
background-color: rgb(255, 245, 242); position: absolute; left: 0px; top: 0px; right: 0px; bottom: 0px;
cursor: url(http://ecn.dev.virtualearth.net/mapcontrol/v7.0/cursors/grab.cur), move; "
>

so adding the two handlers

Microsoft.Maps.Events.addHandler(pin, 'mouseover', pinMouseHover);
Microsoft.Maps.Events.addHandler(pin, 'mouseout', pinMouseHover);

then the handling code, including a simple bit of jQuery to select by the class

function pinMouseHover(e) {
    var cursor = (e.eventName === 'mouseover') ? 'pointer'
        : 'url(http://ecn.dev.virtualearth.net/mapcontrol/v7.0/cursors/grab.cur), move';
    setMapCursor(cursor);
}

function setMapCursor(cursor) {
    $('.MicrosoftMap').css('cursor', cursor);
}

and it’s sorted. Doddle – told you…

When good software saves you...

I enjoyed reading Phil Haack’s recent post where he basically describes how to avoid the mistakes he has (and more than likely most of us have) made in presentations. I chortled along and tutted at the appropriate points with a sort of smug disconnect. Just a couple of days later I gave a short demo!

It was a fairly informal product walk through, showing a prototype to a potential customer to help draw out some more ideas for the product – the sort of thing you don’t prepare too much for… I fired up the laptop, and helpfully Windows had automatically restarted following update (note to self when you get a new laptop remember to enable the “No auto-restart with logged on users for scheduled automatic update installations” in gpedit.msc). Ok, not too much hassle, just some small talk whilst getting rebooted and getting everything running.

The real squeaky bum moment came when starting up mongod to run the mongoDB database and it failed to start – this was a first – it has been rock solid for me so far. So panic is starting to set in now, people watching my every button press, no backup plan, and my db isn’t starting. Funnily enough no matter how many times I try to type “mongod”, or how hard I hit return it always fails. Deep breath and read the error message:

Thu Apr 28 19:21:13 [initandlisten] db version v1.8.1, pdfile version 4.5
Thu Apr 28 19:21:13 [initandlisten] git version: a429cd4f535b2499cc4130b06ff7c26f41c00f04
Thu Apr 28 19:21:13 [initandlisten] build sys info: windows (5, 1, 2600, 2, 'Service Pack 3') BOOST_LIB_VERSION=1_35
**************
old lock file: \data\db\mongod.lock. probably means unclean shutdown
recommend removing file and running --repair
see: http://dochub.mongodb.org/core/repair for more information
*************
Thu Apr 28 19:21:13 [initandlisten] exception in initAndListen std::exception: old lock file, terminating
Thu Apr 28 19:21:13 dbexit:
Thu Apr 28 19:21:13 [initandlisten] shutdown: going to close listening sockets...
Thu Apr 28 19:21:13 [initandlisten] shutdown: going to flush diaglog...
Thu Apr 28 19:21:13 [initandlisten] shutdown: going to close sockets...
Thu Apr 28 19:21:13 [initandlisten] shutdown: waiting for fs preallocator...
Thu Apr 28 19:21:13 [initandlisten] shutdown: closing all files...
Thu Apr 28 19:21:13 closeAllFiles() finished
Thu Apr 28 19:21:13 dbexit: really exiting now

Now it’s at these times more than any other that you really appreciate good feedback in your software – I do exactly what it says and I’m back up and running. Just wipe that bead of sweat from my brow and I think I’ve got away with it, demo on. Awesome. Not laughing so hard at Phil’s expense now.

Silverlight issue on Chrome with empty InitParams

Just a quick note to hopefully stop anybody wasting time (like I did) on this.

In a Silverlight application we are developing we have implemented a provider model to get provider specific InitParams values passed in. When implementing a new provider and manually testing using my default browser (which today happened to be Chrome), I was presented with the Silverlight “To view this content, please install” click-to-install splash! Bummer.

I knew it was provider specific, and to be honest immediately suspected the InitParams as the provider itself did nothing!

To verify I created a new Silverlight project in VS2010, and added an empty InitParams param element like this:

<param name="InitParams" value="" />

Running this up in Chrome showed the issue; run it in IE8 and it’s fine. Add a value to the InitParams such as:

<param name="InitParams" value="test=test" />

and it works. I haven’t tried other browsers.

Multiple bindings issue with WCF service

We recently implemented a WCF service in a web application designed for multi tenancy, and when testing outside of the web development server in IIS got a ‘yellow screen of death’ with:

This collection already contains an address with scheme http.  There can be at most one address per scheme in this collection.

In the test IIS platform we had multiple bindings defined to support our multi tenancy approach, which uses sub domains to divide the tenants. These bindings were being presented as multiple baseAddresses to the constructor of the ServiceHost implementation. (I know it’s too much info, but we were using an implementation of ServiceHostFactory to create an IoC container aware host, and the only reason I mention it is that it made debugging this issue really straightforward.)

The resolution was fortunately incredibly straightforward: the service model config has a mechanism to filter base addresses by prefix, so it was just a case of configuring an appropriate prefix in the baseAddressPrefixFilters.
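
For reference, the filter lives under serviceHostingEnvironment in the web.config. A minimal sketch of the shape of it; the host name below is a placeholder, not the address we actually used:

```xml
<system.serviceModel>
  <serviceHostingEnvironment>
    <baseAddressPrefixFilters>
      <!-- Placeholder prefix: only base addresses starting with this are passed to the host -->
      <add prefix="http://services.example.com" />
    </baseAddressPrefixFilters>
  </serviceHostingEnvironment>
</system.serviceModel>
```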

Silverlight 3 AG_E_PARSER_BAD_TYPE Error

I was porting some code from a prototype project into the production solution, and frankly it wasn’t going well! Debugging the Silverlight aspect I was seeing an AG_E_PARSER_BAD_TYPE error thrown when one of my view models was being loaded.

I knew it was something I had missed – at least that was something.

I was using Prism (formerly the Composite Application Guidance), so the issue was caused when loading from my sub module.

To cut a long (well an hour at least) story short, you need to ensure that any references used by the module are also available to the shell. In my case I was referencing controls toolkit in the module, but hadn’t made sure this was referenced in the shell.

One to remember…

Tests failing with ArgumentNullException on controllerContext

So to set the scene, I was slow time migrating an existing ASP.NET MVC 1.0 application to v 2.0 (for no other reason than I wanted to catch up with the new stuff, as I haven’t been using MVC in recent projects). After using the automated migration tool written by Eilon Lipton of the ASP.NET team, and sorting some minor issues, I found that a few tests were failing with ArgumentNullException on the ModelValidator controllerContext parameter when attempting to UpdateModel. I knew controllerContext was null because I wasn’t setting it; the problem was why was it required?

Model validation is one of the new features in v 2.0. Stepping through the MVC source code (peeling through the layers of abstraction) showed that ModelValidator, the abstract base class the default model binder uses for validation, requires the ControllerContext on construction. For some background I found this interesting dissection of the MVC model validation from an architectural perspective.

Clearly there are many potential solutions to this issue to get the tests green. In this case I took the somewhat simple pragmatic approach of just setting the ControllerContext on the Controller.
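
Something along these lines is all it takes in the test setup. This is a sketch rather than the exact code from my tests; the ControllerTestHelper class and its method are names invented here, and it assumes nothing in the code under test needs a real HttpContext:

```csharp
using System.Web.Mvc;

public static class ControllerTestHelper
{
    // Gives the controller under test a non-null ControllerContext so that
    // model validation can construct its validators during UpdateModel.
    public static T WithControllerContext<T>(T controller) where T : Controller
    {
        controller.ControllerContext = new ControllerContext { Controller = controller };
        return controller;
    }
}
```

A test would then wrap construction of whichever controller it exercises in ControllerTestHelper.WithControllerContext(...) before calling the action.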

SOAP requests succeeding and logging an HTTP 400

Had a really annoying issue recently where WCF SOAP requests were returning successfully (HTTP 200) but apparently also logging HTTP 400 “Bad Verb” errors in HTTPERR as this small extract from the log shows:

2010-03-10 09:25:18 127.0.0.1 55897 127.0.0.1 80 - - - 400 - Verb -
2010-03-10 09:25:22 127.0.0.1 55902 127.0.0.1 80 - - - 400 - Verb -
2010-03-10 09:25:29 127.0.0.1 55905 127.0.0.1 80 - - - 400 - Verb -

This issue was happening in a large SOA solution, where each WCF service (hosted in IIS) offered a simple “Heartbeat“ operation for use by a hardware load balancer for health monitoring. It was clear that the monitors were causing the issue (as requests from other clients didn’t exhibit this unusual behaviour), what was less clear was why.

The first step was to try and see what was going on. Using Network Monitor I captured a trace to see the activity; an extract from a failing trace shows the SOAP protocol request and response, and then a follow up HTTP response with the 400 error.

750 23.640625 {TCP:85, IPv4:18} 15:36:42.551 20.20.20.179 20.20.20.34 TCP TCP:Flags=......S., SrcPort=34678, DstPort=HTTP(80), PayloadLen=0, Seq=179182272, Ack=0, Win=5840 ( Negotiating scale factor 0x0 ) = 5840
751 23.640625 {TCP:85, IPv4:18} 15:36:42.551 20.20.20.34 20.20.20.179 TCP TCP:Flags=...A..S., SrcPort=HTTP(80), DstPort=34678, PayloadLen=0, Seq=776316902, Ack=179182273, Win=16384 ( Negotiated scale factor 0x0 ) = 16384
752 23.640625 {TCP:85, IPv4:18} 15:36:42.551 20.20.20.179 20.20.20.34 TCP TCP:Flags=...A...., SrcPort=34678, DstPort=HTTP(80), PayloadLen=0, Seq=179182273, Ack=776316903, Win=5840 (scale factor 0x0) = 5840
753 23.640625 {HTTP:86, TCP:85, IPv4:18} 15:36:42.551 20.20.20.179 20.20.20.34 SOAP SOAP:xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
754 23.640625 {HTTP:86, TCP:85, IPv4:18} 15:36:42.551 20.20.20.34 20.20.20.179 SOAP SOAP:xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
755 23.640625 {HTTP:86, TCP:85, IPv4:18} 15:36:42.551 20.20.20.34 20.20.20.179 HTTP HTTP:Response, HTTP/1.1, Status Code = 400, URL: /Application/Service.svc
756 23.640625 {TCP:85, IPv4:18} 15:36:42.551 20.20.20.179 20.20.20.34 TCP TCP:Flags=...A...., SrcPort=34678, DstPort=HTTP(80), PayloadLen=0, Seq=179182711, Ack=776317329, Win=6432 (scale factor 0x0) = 6432
757 23.640625 {TCP:85, IPv4:18} 15:36:42.551 20.20.20.179 20.20.20.34 TCP TCP:Flags=...A...F, SrcPort=34678, DstPort=HTTP(80), PayloadLen=0, Seq=179182711, Ack=776317494, Win=7504 (scale factor 0x0) = 7504
758 23.640625 {TCP:85, IPv4:18} 15:36:42.551 20.20.20.34 20.20.20.179 TCP TCP:Flags=...A...., SrcPort=HTTP(80), DstPort=34678, PayloadLen=0, Seq=776317494, Ack=179182712, Win=65097 (scale factor 0x0) = 65097

Knowing that HTTP.sys parses the request before handing it on for processing, in this case by ASP.NET, I thought I might get some joy from the built-in ETW tracing – a quick hit to Google turned up some decent posts from the Http.sys team about capturing and analysing these traces. This didn’t really add a lot, but confirmed that HTTP.sys was rejecting a request.

The load balancer's monitor was a simple send and receive over TCP, posting a send string and parsing the response to check for valid state. In order to emulate the monitor I needed to get right back to basics, avoiding all the (well appreciated) layers of abstraction and start writing directly against a Socket! A really simple bit of code, it took the send string from the load balancer:

POST /Application/Service.svc HTTP/1.1
Accept-Encoding: gzip,deflate
Content-Type: text/xml;charset=UTF-8
SOAPAction: \"http://www.company.com/product/services/service/0/1/ServiceContract/Heartbeat\"
Host:
Content-Length: 136

<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\"><soapenv:Header/><soapenv:Body/></soapenv:Envelope>

and just sent it direct to the socket

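// sendString is assumed to hold the load balancer's send string shown above, with CRLF line endings.
byte[] requestBytes = Encoding.UTF8.GetBytes(sendString);

Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
socket.Connect("hosting_server", 80);
socket.Send(requestBytes);

// A single Receive is enough here to see the start of what the server sends back.
string response = string.Empty;
byte[] responseBytes = new byte[socket.ReceiveBufferSize];

int i = socket.Receive(responseBytes);
response += Encoding.UTF8.GetString(responseBytes, 0, i);

socket.Close();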

Note that the Host header is actually empty in the send string. This is allowed and documented in the RFC for HTTP 1.1, section 14.23, although to be honest changing it was the first thing I tried. So after capturing a valid response from a .NET client that did not exhibit the issue using Fiddler, comparing, and then scientifically fiddling with a few values to no avail, I actually had to do some reading. The answer was actually in the spec (who would have thought!) - the header field definitions in the RFC for HTTP 1.1, section 14.10, describe the Connection header, and the pertinent phrase from that section was:

HTTP/1.1 applications that do not support persistent connections MUST include the "close" connection option in every message.

So the fix was actually ludicrously easy: adding “Connection: close” to the headers in the load balancer send string. After so much investigation effort I honestly hoped for something a little more dramatic…
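
For completeness, a minimal sketch of how the corrected send string could be assembled for the socket test above. The HeartbeatRequest class and its parameters are invented for illustration and this is not what the load balancer itself runs; the only change that matters is the extra Connection: close header:

```csharp
using System;
using System.Text;

public static class HeartbeatRequest
{
    // Builds the raw request bytes, including the "Connection: close" header.
    // path, soapAction and body are illustrative parameters.
    public static byte[] Build(string path, string soapAction, string body)
    {
        byte[] bodyBytes = Encoding.UTF8.GetBytes(body);

        var headers = new StringBuilder();
        headers.Append("POST " + path + " HTTP/1.1\r\n");
        headers.Append("Accept-Encoding: gzip,deflate\r\n");
        headers.Append("Content-Type: text/xml;charset=UTF-8\r\n");
        headers.Append("SOAPAction: \"" + soapAction + "\"\r\n");
        headers.Append("Host:\r\n");
        headers.Append("Connection: close\r\n");
        // Content-Length is computed from the encoded body rather than hard-coded.
        headers.Append("Content-Length: " + bodyBytes.Length + "\r\n");
        headers.Append("\r\n");

        byte[] headerBytes = Encoding.ASCII.GetBytes(headers.ToString());
        byte[] requestBytes = new byte[headerBytes.Length + bodyBytes.Length];
        Buffer.BlockCopy(headerBytes, 0, requestBytes, 0, headerBytes.Length);
        Buffer.BlockCopy(bodyBytes, 0, requestBytes, headerBytes.Length, bodyBytes.Length);
        return requestBytes;
    }
}
```

The result can be passed straight to socket.Send in place of requestBytes.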