Use HAProxy and Docker to load balance requests to Solr and SolrCloud
We run Solr 4.6 in a SolrCloud configuration in production. You may say: “It is high time you upgraded to the latest version, isn’t it?” And you would be absolutely right! For now, though, an upgrade is off the table because a full reindex is too costly. Our SolrCloud consists of one shard and five replicas, as seen in the screenshot below:
The problem we were facing is that the SolrJ library balances requests with a round-robin algorithm when the LBHttpSolrServer class is used. Furthermore, when a request to a node fails for any reason (timeout, HTTP 403, HTTP 404, etc.), it puts the node on a zombie list for 60 seconds. This becomes a burden when the SolrCloud is under heavy load, and you start to see the following in your application logs:
Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request
    at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:289)
    at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:310)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
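To make the failure mode concrete, here is a toy model of a round-robin balancer that benches failed nodes for 60 seconds. It is not SolrJ code: the class name ZombieListSketch, its methods and the three node names are made up for illustration, and it only assumes the behaviour described above. When every node fails once within the same minute, nothing is left to pick from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a round-robin balancer with a zombie list; NOT the SolrJ implementation. */
final class ZombieListSketch {
    static final long ZOMBIE_MILLIS = 60_000; // the 60-second penalty described above

    private final List<String> nodes;
    private final Map<String, Long> zombieUntil = new HashMap<>();
    private int next = 0;

    ZombieListSketch(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    /** Returns the next node in round-robin order that is not benched at nowMillis. */
    String pick(long nowMillis) {
        for (int i = 0; i < nodes.size(); i++) {
            String candidate = nodes.get(next);
            next = (next + 1) % nodes.size();
            Long until = zombieUntil.get(candidate);
            if (until == null || until <= nowMillis) {
                zombieUntil.remove(candidate); // penalty expired, node is live again
                return candidate;
            }
        }
        throw new IllegalStateException("No live servers available to handle this request");
    }

    /** Records a failed request: the node is skipped until the penalty expires. */
    void markFailed(String node, long nowMillis) {
        zombieUntil.put(node, nowMillis + ZOMBIE_MILLIS);
    }

    public static void main(String[] args) {
        ZombieListSketch lb = new ZombieListSketch(List.of("search1", "search2", "search3"));
        long now = 0;
        // Under heavy load, each node times out once within the same minute.
        for (int i = 0; i < 3; i++) {
            lb.markFailed(lb.pick(now), now);
            now += 1_000;
        }
        try {
            lb.pick(now);
        } catch (IllegalStateException e) {
            System.out.println("t=" + now + "ms: " + e.getMessage());
        }
        // Once the first penalty has expired, that node becomes eligible again.
        System.out.println("t=60000ms: picked " + lb.pick(60_000));
    }
}
```

In this model a single slow response costs a node a full minute of traffic, so a short burst of timeouts is enough to take the whole cluster out of rotation from the client's point of view.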
So we decided to put an HAProxy instance in front of the SolrCloud to load balance query requests using the leastconn (least connections) algorithm. After a lot of searching, and with the help of the official documentation, a couple of articles (link1, link2) and the official HAProxy Docker image, we created a container with the following configuration:
global
    log 127.0.0.1 local0 notice
    maxconn 2000
    #user haproxy
    #group haproxy

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    timeout connect 5000
    timeout client 10000
    timeout server 10000

listen solrcloud
    bind 0.0.0.0:8983
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    stats auth haproxyuser:haproxypassword
    balance leastconn
    option httpclose
    option forwardfor
    option httpchk GET /solr/paradise-papers/admin/ping?wt=json
    server search1 192.168.1.1:8983 check port 8983 inter 20s fastinter 2s
    server search2 192.168.1.2:8983 check port 8983 inter 20s fastinter 2s
    server search3 192.168.1.3:8983 check port 8983 inter 20s fastinter 2s
    server search4 192.168.1.4:8983 check port 8983 inter 20s fastinter 2s
    server search5 192.168.1.5:8983 check port 8983 inter 20s fastinter 2s
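The idea behind balance leastconn can be sketched in a few lines of Java. This is only an illustration of least-connections selection, not HAProxy's scheduler: the class LeastConnSketch and its acquire/release methods are hypothetical, and ties are broken by declaration order here for simplicity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of least-connections selection; HAProxy's real scheduler is more involved. */
final class LeastConnSketch {
    // Open connection count per backend, kept in declaration order.
    private final Map<String, Integer> active = new LinkedHashMap<>();

    LeastConnSketch(String... servers) {
        for (String s : servers) {
            active.put(s, 0);
        }
    }

    /** Picks the server with the fewest open connections and counts the new one. */
    String acquire() {
        String best = null;
        for (Map.Entry<String, Integer> e : active.entrySet()) {
            if (best == null || e.getValue() < active.get(best)) {
                best = e.getKey();
            }
        }
        active.merge(best, 1, Integer::sum);
        return best;
    }

    /** Marks one connection to the given server as finished. */
    void release(String server) {
        active.merge(server, -1, Integer::sum);
    }

    public static void main(String[] args) {
        LeastConnSketch lb = new LeastConnSketch("search1", "search2", "search3");
        String slow = lb.acquire(); // a long-running query stays open on the first server
        String b = lb.acquire();
        String c = lb.acquire();
        lb.release(b);              // the other two queries finish quickly
        lb.release(c);
        // The busy server is skipped until its connection count drops again.
        System.out.println("busy: " + slow + ", next picks: " + lb.acquire() + ", " + lb.acquire());
    }
}
```

Unlike round robin, a node that is stuck on slow queries simply receives fewer new requests, instead of being sent its full share and then penalised when it times out.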
Notes on configuration:
- Our HAProxy runs on 192.168.1.10 and all queries are handled on port 8983.
- We can see how requests are distributed on the HAProxy statistics page, which is available at http://192.168.1.10:8983/haproxy?stats. See:
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    stats auth haproxyuser:haproxypassword
- We check that every Solr node is alive by pinging it (see: option httpchk GET /solr/paradise-papers/admin/ping?wt=json).
- The check interval is 20 seconds (see: inter 20s).
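Before trusting the health check, it can help to send the same ping by hand. The sketch below issues the request from the httpchk line to each of the five nodes using the JDK's built-in HTTP client (Java 11 or later). The class and method names are illustrative, and the timeouts are simply copied from the config rather than tuned. HAProxy treats 2xx and 3xx responses to an HTTP check as healthy, which passesCheck mirrors.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Sends the same ping request as "option httpchk"; a manual check, not part of HAProxy. */
final class SolrPingSketch {
    /** Builds the ping URL for one node, mirroring the httpchk line in the config. */
    static URI pingUri(String host, int port, String collection) {
        return URI.create("http://" + host + ":" + port + "/solr/" + collection + "/admin/ping?wt=json");
    }

    /** HAProxy's HTTP check accepts 2xx and 3xx responses and fails everything else. */
    static boolean passesCheck(int status) {
        return status >= 200 && status < 400;
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        for (int i = 1; i <= 5; i++) {
            URI uri = pingUri("192.168.1." + i, 8983, "paradise-papers");
            HttpRequest request = HttpRequest.newBuilder(uri)
                    .timeout(Duration.ofSeconds(10))
                    .GET()
                    .build();
            try {
                int status = client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
                System.out.println(uri.getHost() + " -> " + status + (passesCheck(status) ? " UP" : " DOWN"));
            } catch (IOException e) {
                // Connection refused, timeout, etc.: HAProxy would count this as a failed check too.
                System.out.println(uri.getHost() + " -> unreachable: " + e.getMessage());
            }
        }
    }
}
```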
Our search queries are now handled better: we have already noticed faster response times and no downtime! This is a sample output from the HAProxy statistics page:
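The figures on that page can also be pulled as CSV for scripting, by appending ;csv to the stats URI. The following sketch assumes the address and credentials from the configuration above. The class HaproxyStatsSketch and its helpers are hypothetical, and columns are looked up by header name (svname, scur, stot, status) rather than by position, in case the column set differs between HAProxy versions.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

/** Fetches the HAProxy stats page as CSV and prints a short per-server summary. */
final class HaproxyStatsSketch {
    /** Extracts name, current sessions, total sessions and status from each CSV row. */
    static List<String> summarize(String csv) {
        String[] lines = csv.split("\\R");
        // The header row starts with "# ", e.g. "# pxname,svname,...".
        List<String> header = Arrays.asList(lines[0].replaceFirst("^# ", "").split(","));
        int name = header.indexOf("svname");
        int current = header.indexOf("scur");
        int total = header.indexOf("stot");
        int status = header.indexOf("status");
        List<String> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isBlank()) {
                continue;
            }
            String[] f = lines[i].split(",", -1); // -1 keeps empty trailing fields
            rows.add(f[name] + " scur=" + f[current] + " stot=" + f[total] + " status=" + f[status]);
        }
        return rows;
    }

    /** Builds the HTTP Basic Authorization header value for "stats auth" credentials. */
    static String basicAuth(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://192.168.1.10:8983/haproxy?stats;csv"))
                .header("Authorization", basicAuth("haproxyuser", "haproxypassword"))
                .GET()
                .build();
        String csv = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
        summarize(csv).forEach(System.out::println);
    }
}
```

With leastconn in place, the stot (total sessions) values of the five servers should stay reasonably close to each other as long as all nodes respond at similar speeds.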
We hope you find this article helpful!