Solr Does Not Know How to Handle Fields in UK Language
This post is from the Fix category and will likely save you hours of work. I was in the middle of writing the first article in the new Architecture series when I discovered this bug in Sitecore with Solr. In the hope of saving the rest of the Sitecore community some time and frustration, here comes the bug and, of course, the solution.
Recently I was fortunate to spend a good amount of time in the land of Sitecore with Solr. It is always exciting when Sitecore expands its scope to support newer platforms, and the addition of Solr to the CMS’s arsenal was definitely a step in the right direction. Solr scales very well, and very easily (check out my Step-by-step Guide on Configuring Solr for Sitecore). In this post I will cover an issue with parsing multilingual fields and provide a solution. The bug currently exists in Sitecore 7.1 and 7.2 (as of this writing, revision 140526); it is likely that earlier versions of Sitecore 7 are affected by the same bug.
Solr Bug with Crawling Fields Having UK Versions
After the initial Solr setup, all index updates ran fine with no errors. Only recently did we discover an anomaly: after rebuilding the sitecore_master_core index, many items from the website-specific index were missing.
Naturally, the first thing we did was look at the logs. Manually forced index rebuilds would always complete “successfully”, or so the Indexing Wizard would say, but as a long-time Sitecore developer I have learned to trust, but verify. The Sitecore system logs were clean; the Crawling log, however, had many errors of the same kind, just for different fields and items:
16920 10:20:43 WARN Crawler : AddRecursive DoItemAdd failed - {C59B1A4B-4465-4271-994F-F3F77BADCDFC}
Exception: SolrNet.Exceptions.SolrConnectionException
Message: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">158</int></lst><lst name="error"><str name="msg">ERROR: [doc=sitecore://master/{afb44748-7b53-47d0-a101-cf0f15cb59ae}?lang=uk-ua&ver=1] unknown field 'mime_type_t_uk'</str><int name="code">400</int></lst>
</response>
Source: SolrNet
   at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters)
   at SolrNet.Impl.SolrConnection.Post(String relativeUrl, String s)
   at SolrNet.Impl.SolrBasicServer`1.SendAndParseHeader(ISolrCommand cmd)
   at Sitecore.ContentSearch.SolrProvider.SolrBatchUpdateContext.AddRange(IEnumerable`1 group, Int32 groupSize)
   at Sitecore.ContentSearch.SolrProvider.SolrBatchUpdateContext.AddDocument(Object itemToAdd, IExecutionContext[] executionContexts)
   at Sitecore.ContentSearch.SitecoreItemCrawler.DoAdd(IProviderUpdateContext context, SitecoreIndexableItem indexable)
   at Sitecore.ContentSearch.HierarchicalDataCrawler`1.CrawlItem(Tuple`3 tuple)
Nested Exception
Exception: System.Net.WebException
Message: The remote server returned an error: (400) Bad Request.
Source: System
   at System.Net.HttpWebRequest.GetResponse()
   at HttpWebAdapters.Adapters.HttpWebRequestAdapter.GetResponse()
   at SolrNet.Impl.SolrConnection.GetResponse(IHttpWebRequest request)
   at SolrNet.Impl.SolrConnection.PostStream(String relativeUrl, String contentType, Stream content, IEnumerable`1 parameters)
The common trend we noticed was that the affected items all belonged to the Email Campaign Manager module 2.1 rev. 140214, and that the failing versions were in the UK language (lang=uk-ua in the log above). This was strange because, if you navigated to those items in the Sitecore Content Editor and checked the available versions and languages, there would not be one for UK (The Queen must have done something to upset Denmark..:))
Something worth noting is that errors that happened during the re-index process were not reported to users in the UI; it is easy to see what kinds of issues and confusion that type of error handling can cause in a production environment. After spending some time troubleshooting, we narrowed the issue down to the Solr configuration and contacted Sitecore support.
The only problem for us in this process was that this issue was a roadblock; we were stuck on the development end because we could not re-index content. Every time Solr threw the exception, the index process would exit, leaving many items unindexed. Fortunately, it did not take long to figure out that Sitecore provides a few settings that can be modified to avoid these types of blockers.
By default, Sitecore is configured to stop crawling if an error occurs. To change that, simply set the following settings in the Sitecore.ContentSearch.config file to false:
- ContentSearch.Crawling.StopOnCrawlError
- ContentSearch.Crawling.StopOnCrawlFieldError
- ContentSearch.DocumentMapping.StopOnPropertyMappingError
The first two are set to false by default, but the third one is set to true; therefore, if you end up in the same situation where a strange issue is preventing a re-index from completing – set ContentSearch.DocumentMapping.StopOnPropertyMappingError to false and troubleshoot the error later at your own leisure 🙂
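One way to apply this change without editing the stock file is a Sitecore include patch. What follows is only a minimal sketch under a few assumptions: the file name and location (something like App_Config/Include/zz.StopOnMappingError.config) are hypothetical, and it assumes the setting sits under the usual sitecore/settings node in your version, so check it against your own Sitecore.ContentSearch.config before using it.

```xml
<!-- Hypothetical patch file; the name and location are assumptions, not from the original post. -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Keep crawling when a single property fails to map, instead of aborting the rebuild. -->
      <setting name="ContentSearch.DocumentMapping.StopOnPropertyMappingError">
        <patch:attribute name="value">false</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```

Keeping the override in its own include file makes it easy to remove once the underlying error has been fixed; the skipped documents are still reported in the Crawling log, so that log remains the place to check.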
Solution: Tell Solr How to Index UK Versioned Text Fields
Before I proceed to the solution, I have to give it to the guys in Ukraine: they have saved me hours (by now, likely weeks) of work and troubleshooting. Great job, guys, thank you for all your help!
Sitecore Support got back to me the following day and proposed a couple of tweaks to the date fields; however, the issue persisted. After some more troubleshooting, one of the Support Representatives got back to me with the solution. It turns out that Solr, out of the box, does not know how to handle _uk fields. Therefore, all we needed to do was add the following line to the schema.xml list of field serialization definitions:
<dynamicField name="*_t_uk" type="text_general" indexed="true" stored="true" />
After making this update – the issue was fixed!
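For context, here is a rough sketch of where that line might sit in schema.xml. The surrounding elements are assumptions for illustration only: the schema name and version are placeholders, the fields wrapper follows the Solr 4.x layout, and the neighbouring *_t entry is assumed to be present in a Sitecore-generated schema; your file may differ.

```xml
<!-- Sketch only: everything except the *_t_uk line is an assumed placeholder. -->
<schema name="example" version="1.5">
  <fields>
    <!-- Existing field and dynamicField definitions stay unchanged. -->
    <dynamicField name="*_t" type="text_general" indexed="true" stored="true" />
    <!-- The fix from this post: accept text fields suffixed with _t_uk. -->
    <dynamicField name="*_t_uk" type="text_general" indexed="true" stored="true" />
  </fields>
</schema>
```

Solr reads schema.xml when a core is loaded, so the core needs to be reloaded (or Solr restarted) before rebuilding the index for the new definition to take effect.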
Hopefully, this post will save many developers hours or days of troubleshooting, and please share this article, as many more developers will thank you. Also remember to comment if this worked for you, and give back by blogging, tweeting, and posting on Sitecore forums about any bugs and fixes you find yourself!
JonAE
April 6, 2015 at 11:43 am
This saved me! Great post and thanks for sharing!