I am working on a Sitecore project whose products and resources are fed from a PIM (Product Information Management) system called inRiver. inRiver sends the CRUD operations for its entities (ChannelNodes, Products, Resources, Files, etc.) to Sitecore through an HTTP handler called inRiverDataManager. I was troubleshooting a piece of code in this inRiverDataManager handler for performance issues. Specifically, looking up an inRiver entity in Sitecore was taking close to 20 seconds. With 200K+ entities, including products and resources (PDFs, images, videos, etc.), mass updates were taking days to complete, so we needed to optimize. I started looking at ways to improve the item lookup time in Sitecore. The handler uses the fast query syntax with an attribute filter to get items from the Sitecore master database.
Now, ‘Sitecore Fast Query’ is supposed to be faster than ‘Sitecore Query’, but it wasn’t fast enough for our purposes. If you look at the first line of the method body in the code snippet below, you’ll see the path being built there; once built, it looks like this: “fast://*[@_inRiver_Id='201508']”. 201508 is the Id of the entity in the inRiver system and also the Name of the item in Sitecore. Seeing that, I immediately thought, “Aaah, I can improve that by just giving it a more specific XPath.” I was sure that something like “fast:/sitecore/content/inRiver/387954//*[@_inRiver_Id='201508']” would make the lookup faster. After all, now it only has to look in a sub-tree instead of the entire Sitecore content tree.
The results surprised me: the more specific query was actually 2-3 seconds slower than the non-specific XPath (see the PowerShell ISE screenshot below). I think I need to do some SQL profiling to better understand how a fast query translates to SQL. For the sake of completeness, I also compared lookup times by Id using both plain Sitecore Query and Sitecore Fast Query, and the results were as expected there.
In the end, I used the Content Search API (Solr index) to look up the item, which brought the lookup time down to under 2 seconds.
/// This method gets called when inRiver tries to update an entity in Sitecore through the HTTP handler
internal void UpdateEntity(int entityId, int channelId)
{
    string path = "fast://*[@_inRiver_Id='" + entityId + "']";
    Stopwatch stopwatch = Stopwatch.StartNew();
    Entity entity = RemoteManager.DataService.GetEntity(entityId, LoadLevel.DataOnly);
    Utils.LogAction("Got entity with data. Took " + stopwatch.Elapsed, ApiType.InRiver, "UpdateEntity");
    stopwatch = Stopwatch.StartNew();
    // SelectItems returns an array of all items matching the query
    Item[] items = this.MasterDatabase.SelectItems(path);
    Utils.LogAction("Lookup items to update. Took " + stopwatch.Elapsed, ApiType.Sitecore, "UpdateEntity");
    // ... more irrelevant code below this
}
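For reference, a Content Search lookup along the lines described above might look roughly like the following. This is a minimal sketch, not the actual project code: the class and method names (EntityLookup, FindByEntityId) are hypothetical, it assumes the default "sitecore_master_index" index name, and it matches on the item name, relying on the fact that the item name equals the inRiver entity Id. It needs the Sitecore.Kernel and Sitecore.ContentSearch assemblies, so it only runs inside a Sitecore solution.

```csharp
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;
using Sitecore.Data.Items;

// Hypothetical helper; the names here are illustrative only.
internal static class EntityLookup
{
    // Assumption: the master database is indexed under Sitecore's default index name.
    private const string IndexName = "sitecore_master_index";

    // Returns the Sitecore item whose name equals the inRiver entity Id,
    // or null when the index has no matching document.
    internal static Item FindByEntityId(int entityId)
    {
        // Build the string outside the LINQ expression so the search
        // provider only has to translate a simple equality comparison.
        string itemName = entityId.ToString();

        ISearchIndex index = ContentSearchManager.GetIndex(IndexName);
        using (IProviderSearchContext context = index.CreateSearchContext())
        {
            // The index can hold several documents per item (one per language
            // and version); any of them is enough to resolve the item.
            SearchResultItem hit = context.GetQueryable<SearchResultItem>()
                .Where(result => result.Name == itemName)
                .FirstOrDefault();

            // GetItem() loads the actual item from the database behind the index.
            return hit == null ? null : hit.GetItem();
        }
    }
}
```

Inside UpdateEntity, a call such as EntityLookup.FindByEntityId(entityId) would then stand in for the SelectItems call. Keep in mind that the index is updated asynchronously, so an item created moments earlier may not be found until indexing catches up.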