
Friday Sitecore Best Practice: When to Switch to Solr

In this video I cover my top three reasons for switching from Lucene to Solr: reaching 50,000 documents in an index, having more than one delivery server, or having more than one processing server.
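
The rule of thumb above can be written down as a small check. This is only a sketch of the criteria as stated in this post: the function name, the parameter names, and the idea of returning a list of reasons are illustrative assumptions, not anything shipped by Sitecore.

```python
from typing import List

# Threshold taken from the post; treat it as a rule of thumb, not a hard limit.
DOCUMENT_THRESHOLD = 50_000


def reasons_to_switch_to_solr(
    max_documents_in_index: int,
    delivery_servers: int,
    processing_servers: int,
) -> List[str]:
    """Return the reasons from the post that apply to this environment.

    An empty list means none of the stated criteria are met.
    All names here are hypothetical and exist only for illustration.
    """
    reasons = []
    if max_documents_in_index >= DOCUMENT_THRESHOLD:
        reasons.append("an index has reached 50,000 documents")
    if delivery_servers > 1:
        reasons.append("more than one delivery server")
    if processing_servers > 1:
        reasons.append("more than one processing server")
    return reasons


if __name__ == "__main__":
    # Small index, but two delivery servers: one criterion applies.
    print(reasons_to_switch_to_solr(12_000, 2, 1))
```

Any non-empty result would be a prompt to plan the move to Solr; an empty result only means these particular criteria are not met yet.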

 

 


9 comments on “Friday Sitecore Best Practice: When to Switch to Solr”
  1. Bill Dinger said:

    I would argue there is no reason to ever use Lucene. Setting up Solr in a self-hosted environment, or on Tomcat on developer machines, ahead of time is fairly trivial. By doing it ahead of when you deploy, instead of after, it’s far easier to move those services out than to deal with “now we have to switch from Lucene” migration headaches or a mishmash of some environments on Solr and some on Lucene. Plus, even running on the same machine in, say, a Tomcat-hosted instance, it’s STILL faster than Lucene.

    • SOLR is built on top of Lucene and adds network traffic to the mix – it is _NOT_ faster than Lucene. It _is_ more scalable and better for reliability. But never faster…

        • The caching layer still requires a network round-trip; even a cached response has to go out to get the 304 response before Sitecore can serve from the client’s cache.

          However, if Solr happens to be installed on the local machine, that network latency trends very close to 0, but it’s still overhead. When you move to production, good practice would require that you set up a redundant multi-node Solr Cloud cluster, and you’ll suddenly find those Solr queries are significantly slower than when you had Solr running local to the dev server.

          • Bill Dinger said:

            I haven’t used Solr Cloud before; I usually just set up my own master/slaves, and as long as you’re in the same datacenter, network latency should be on the order of 10 to 12 ms. An overhead, sure, but a minor one.

  2. Lucene can handle millions of docs easily, so performance isn’t an issue. Having a small number of docs would be a reason _not_ to use SOLR, because the extra overhead of the SOLR server can make queries slower than a local Lucene index. Remember that SOLR is built on the Java Lucene search code, and that SOLR will _always_ take more time to process a query, as it has to make a network trip that Lucene does not. Having said that, I always prefer SOLR, as it’s rare to have a single delivery server in any Sitecore implementation.

    • Clarification here: SOLR will not _always_ take more time, but the majority of the time it will. For very large datasets that are sharded across a SOLR Cloud cluster, complex queries may come in faster due to parallel processing of the distributed queries. However, that’s very much an edge case compared to the vast majority of queries coming from Sitecore.

    • Bill Dinger said:

      You’re also disregarding the fact that you’re storing Lucene indexes on disk on the same server responsible for serving out your content. Those same Lucene indexes have to be updated by that delivery server, etc. It’s just more roles on a single server, which is usually not a good idea. We want to scale out our roles horizontally, not just add to the load on our content delivery servers. Especially considering the Sitecore pricing model, we want as little as possible running on a Sitecore server to save clients Sitecore licensing costs.

      It’s almost like saying “SQL should be local to the server to remove the network overhead.” Well, sure, OK, but…

  3. Art Vandelay said:

    In light of Sitecore 8.2 now including the Sitecore Solr binaries and necessary global.asax wiring to use Solr (which always seemed to be a config tax for anyone wishing to scale their Sitecore environment), I’m hopeful that Sitecore is only a few releases away from making Solr the *default search provider* for anyone setting up a new Sitecore environment.
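
The latency argument running through the comments above can be put into rough numbers. The following is a deliberately simplified model, not a benchmark: it assumes a remote Solr query costs one network round trip plus the search time, that sharding divides the search time perfectly evenly, and that merging shard results is free. The function names and the example figures are illustrative assumptions; only the 10 to 12 ms same-datacenter latency comes from the thread.

```python
def lucene_query_ms(search_ms: float) -> float:
    """In-process Lucene: no network hop, only the search itself."""
    return search_ms


def solr_query_ms(search_ms: float, round_trip_ms: float, shards: int = 1) -> float:
    """Remote Solr: one network round trip plus the search, split across shards.

    Assumes perfectly even sharding and ignores merge overhead, which
    flatters Solr; a real cluster will do worse than this.
    """
    if shards < 1:
        raise ValueError("shards must be at least 1")
    return round_trip_ms + search_ms / shards


def break_even_search_ms(round_trip_ms: float, shards: int) -> float:
    """Search time above which the sharded Solr query beats local Lucene."""
    if shards < 2:
        raise ValueError("a single shard never beats the local index in this model")
    # Solve round_trip_ms + s / shards < s for s.
    return round_trip_ms * shards / (shards - 1)


if __name__ == "__main__":
    # A cheap 5 ms query with a 10 ms round trip: the network hop dominates.
    print(lucene_query_ms(5), solr_query_ms(5, 10))
    # An expensive 100 ms query over 4 shards: parallelism can win.
    print(lucene_query_ms(100), solr_query_ms(100, 10, shards=4))
    print(break_even_search_ms(10, 4))
```

Under these assumptions, both sides of the thread hold: for the small, fast queries typical of a Sitecore site the round trip is pure overhead, while only slow queries over a sharded index can come out ahead.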
