incubator-blur-user mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: Blur on SSDs...
Date Tue, 26 May 2015 14:30:19 GMT
On Fri, May 22, 2015 at 3:33 AM, Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

> I have recently been considering deploying SSDs on search machines.
>
> Each machine runs data-nodes + shard-server, and Hadoop local reads are
> leveraged.
>
> SSDs are a great fit for general lucene/solr kinds of setups. But for blur,
> I need some help…
>
> 1. Is it a good idea to consider SSDs, especially when block-cache is
> present?
>

Possibly, but I don't have any hard numbers for this type of setup.  My
guess is that SSDs will only help when the blocks for the shard are local
and short circuit reads are enabled.
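
For reference, short circuit reads are governed by two standard HDFS
settings, dfs.client.read.shortcircuit and dfs.domain.socket.path.  Below
is a minimal sketch, in Python, of a check that an hdfs-site.xml turns them
on.  The helper name and the sample socket path are illustrative
assumptions, not something shipped with Blur or Hadoop, and the check only
reads the file contents; it does not confirm that the blocks are local.

```python
import xml.etree.ElementTree as ET


def short_circuit_enabled(hdfs_site_xml: str) -> bool:
    """Return True if the given hdfs-site.xml text enables short circuit reads.

    Hypothetical helper: it only inspects the two standard HDFS properties
    and never contacts a cluster.
    """
    root = ET.fromstring(hdfs_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    # Short circuit reads need the flag set and a domain socket path defined.
    enabled = props.get("dfs.client.read.shortcircuit", "false").lower() == "true"
    has_socket = bool(props.get("dfs.domain.socket.path"))
    return enabled and has_socket


# Sample configuration; the socket path is an assumed example value.
SAMPLE = """
<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(short_circuit_enabled(SAMPLE))
```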


> 2. Are there any grids running blur on SSDs, and how do they compare to
> normal HDDs?
>

I haven't run any at scale yet.


> 3. Can we disable block-cache on SSDs, especially when local-reads are
> enabled?
>

I would not recommend disabling the block cache.  However, you could likely
lower the size of the cache and reduce the overall memory footprint of Blur.


> 4. Using SSDs, blur/lucene will surely be CPU bound. But I don't know what
> overhead Hadoop local reads bring to the table…
>

If you are using short circuit reads, I have seen the performance of local
accesses approach that of native IO.  However, if Blur is making remote
HDFS calls, every call is like a cache miss.  One interesting option would
be the HDFS cache feature that is present in the most recent versions of
HDFS.  I haven't tried it yet, but it would be worth testing.
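
Assuming the feature meant here is HDFS centralized cache management, it is
driven by the `hdfs cacheadmin` command: one call creates a cache pool and
a second adds a directive that pins a path into that pool.  The sketch
below, in Python, only builds those two command lines and does not run
them.  The function name, pool name, and table path are hypothetical
placeholders, and the DataNodes would also need
dfs.datanode.max.locked.memory set high enough to hold the cached blocks.

```python
import shlex


def cache_commands(path, pool, replication=1):
    """Build the `hdfs cacheadmin` invocations for caching one HDFS path.

    Hypothetical helper: returns argument lists only, nothing is executed.
    """
    add_pool = ["hdfs", "cacheadmin", "-addPool", pool]
    add_directive = [
        "hdfs", "cacheadmin", "-addDirective",
        "-path", path,
        "-pool", pool,
        "-replication", str(replication),
    ]
    return [add_pool, add_directive]


if __name__ == "__main__":
    # Placeholder path and pool name; substitute the real table location.
    for cmd in cache_commands("/blur/tables/example_table", "blur-pool"):
        print(shlex.join(cmd))
```

Whether this helps Blur in practice is untested here; it would have to be
measured against the block cache on the same workload.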


>
> Any help is much appreciated because I cannot find any info on the web
> about this topic
>
> --
> Ravi
>
