Use HAProxy and Docker to load balance requests to Solr and SolrCloud


We run Solr 4.6 in a SolrCloud configuration on a production system. You may say: “It is high time you upgraded to the latest version, isn’t it?” And you would be absolutely right! For now, though, we have not upgraded, because a full reindex is costly. Our SolrCloud consists of one shard with five replicas, as seen in the screenshot below:

SolrCloud with one shard and five replicas

The problem we were facing was that the SolrJ library balances requests with a round-robin algorithm when it uses the LBHttpSolrServer class. Furthermore, when a request to a node fails for any reason (timeout, HTTP 403, HTTP 404, etc.), SolrJ puts that node on a zombie list for 60 seconds. This becomes a burden when the SolrCloud is under heavy load, because nodes that are merely slow get sidelined, and you start to see the following in your application logs:

Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request
	at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:289)
	at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:310)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
	at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
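
To make the failure mode concrete, here is a minimal Python sketch of a round-robin client that benches failed nodes. It is a deliberately simplified toy model, not SolrJ's actual implementation: the class name `ZombieRoundRobin`, the `send` callback and the explicit `now` argument are all hypothetical, and only the 60-second bench time comes from the description above.

```python
class ZombieRoundRobin:
    """Toy model (assumption, not SolrJ code): round robin plus a zombie list."""

    def __init__(self, nodes, zombie_seconds=60):
        self.nodes = list(nodes)
        self.zombie_seconds = zombie_seconds
        self.zombie_until = {}  # node -> time at which it may be tried again
        self._counter = 0       # round-robin position

    def live_nodes(self, now):
        """Nodes that are not currently benched."""
        return [n for n in self.nodes if self.zombie_until.get(n, 0) <= now]

    def request(self, now, send):
        """Try each live node once, in round-robin order.

        `send(node)` is a hypothetical callback returning True on success.
        A node that fails is benched for `zombie_seconds`.
        """
        live = self.live_nodes(now)
        for _ in range(len(live)):
            node = live[self._counter % len(live)]
            self._counter += 1
            if send(node):
                return node
            self.zombie_until[node] = now + self.zombie_seconds
        raise RuntimeError("No live servers available to handle this request")


if __name__ == "__main__":
    lb = ZombieRoundRobin(["search1", "search2", "search3"])
    try:
        # A short burst of timeouts under load benches every node at once.
        lb.request(now=0, send=lambda node: False)
    except RuntimeError as err:
        print(err)
    # Thirty seconds later the nodes may be fine again, but none is eligible.
    print(lb.live_nodes(now=30))
    # Only after the bench time has passed are requests served again.
    print(lb.request(now=61, send=lambda node: True))
```

In this model a single bad moment takes every replica out of rotation for a full minute, which is the kind of behaviour that surfaces as the exception above.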

So we decided to put an HAProxy instance in front of the SolrCloud to load balance the query requests using the leastconn (least connections) algorithm. After a lot of searching, and with a combination of the official documentation, a couple of articles (link1, link2) and the official HAProxy Docker image, we created a container with the following configuration:

global
    log 127.0.0.1 local0 notice
    maxconn 2000
    #user haproxy
    #group haproxy

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    retries 3
    timeout connect  5000
    timeout client  10000
    timeout server  10000

listen solrcloud
    bind 0.0.0.0:8983
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    stats auth haproxyuser:haproxypassword
    balance leastconn
    option httpclose
    option forwardfor
    option httpchk GET /solr/paradise-papers/admin/ping?wt=json
    server search1 192.168.1.1:8983 check port 8983 inter 20s fastinter 2s
    server search2 192.168.1.2:8983 check port 8983 inter 20s fastinter 2s
    server search3 192.168.1.3:8983 check port 8983 inter 20s fastinter 2s
    server search4 192.168.1.4:8983 check port 8983 inter 20s fastinter 2s
    server search5 192.168.1.5:8983 check port 8983 inter 20s fastinter 2s
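
The idea behind `balance leastconn` can be sketched in a few lines of Python. This is an illustration of the selection rule only, not HAProxy's implementation: the class `LeastConnBalancer` and its methods are hypothetical names, and ties are broken here by list order, which is a simplification.

```python
class LeastConnBalancer:
    """Illustrative sketch: send each new request to the least busy server."""

    def __init__(self, servers):
        # Number of requests currently in flight per server.
        self.active = {name: 0 for name in servers}

    def acquire(self):
        """Pick the server with the fewest active connections and count the new one."""
        # min() returns the first minimum, so ties go to the earliest listed server.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Call when a request to `name` has finished."""
        self.active[name] -= 1


if __name__ == "__main__":
    lb = LeastConnBalancer(["search1", "search2", "search3"])
    first = lb.acquire()   # imagine this one is a slow query that stays open
    for _ in range(4):
        fast = lb.acquire()
        print("fast query ->", fast)
        lb.release(fast)   # fast queries finish immediately
    lb.release(first)
```

Unlike round robin, a node that is stuck on slow queries keeps a high connection count and therefore receives fewer new requests until it catches up.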

Notes on configuration:

  • Our HAProxy runs on 192.168.1.10 and handles all queries on port 8983.
  • The HAProxy statistics page, which shows how requests are distributed across the nodes, is available at http://192.168.1.10:8983/haproxy?stats and is protected by a username and password. See:
      stats uri /haproxy?stats
      stats realm Strictly\ Private
      stats auth haproxyuser:haproxypassword

  • We check whether each Solr node is alive by pinging it (see: option httpchk GET /solr/paradise-papers/admin/ping?wt=json).
  • The check interval is 20 seconds (see: inter 20s). While a node is in transition between up and down, the checks run every 2 seconds instead (see: fastinter 2s).
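
For debugging, the same ping can be reproduced outside HAProxy. The following Python sketch is an assumption-laden approximation of the health check, not what HAProxy runs internally: the helper names `ping_url`, `status_is_healthy` and `check_node` are hypothetical, and the rule that 2xx and 3xx responses count as healthy reflects HAProxy's default behaviour for `option httpchk`.

```python
import urllib.error
import urllib.request

# Ping handler path taken from the configuration above.
PING_PATH = "/solr/paradise-papers/admin/ping?wt=json"


def ping_url(host, port=8983):
    """Build the URL that the health check requests on a Solr node."""
    return f"http://{host}:{port}{PING_PATH}"


def status_is_healthy(status_code):
    """Treat 2xx and 3xx responses as a passing check."""
    return 200 <= status_code < 400


def check_node(host, port=8983, timeout=5.0):
    """Return True if the node answers the ping with a healthy status code."""
    try:
        with urllib.request.urlopen(ping_url(host, port), timeout=timeout) as resp:
            return status_is_healthy(resp.status)
    except urllib.error.HTTPError as err:
        # The node answered, but with an error status such as 404 or 503.
        return status_is_healthy(err.code)
    except OSError:
        # Connection refused, timeout, DNS failure and similar.
        return False
```

Calling `check_node("192.168.1.1")` from the HAProxy host should return `True` for a node that HAProxy reports as up. If it returns `False`, the ping handler or the network path to that node is worth a closer look.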

Now our search queries are handled better: we have already noticed faster response times and no downtime! This is a sample output from the HAProxy statistics page:

HAProxy as a load balancer to Solr and SolrCloud with one shard and five replicas using leastconn algorithm

We hope you find this article helpful!
