lucene-solr-user mailing list archives

From Toke Eskildsen
Subject Re: SOLR Data Locality
Date Fri, 17 Mar 2017 18:40:01 GMT
Imad Qureshi wrote:
> I understand that but unfortunately that's not an option right now.
> We already have 16 TB of index in HDFS.
> So let me rephrase this question. How important is data locality for
> SOLR. Is performance impacted if SOLR data is on a remote node?

The short answer is yes, the long answer is

Anecdotally, we ran some experiments before building our multi-TB search setup, comparing
local SSDs with remote (Isilon) SSDs. That test setup used simple searches and some
faceting. I was a bit surprised that the slowdown was only 3x. I would expect the speed difference
to be even smaller if the underlying storage is slow (spinning disks). Old blog post at

I don't understand the expected gain from adding replicas if the data are remote. Why can't
the replica Solrs run on the nodes that hold the data? Do you have very CPU-intensive searches?

- Toke Eskildsen
