incubator-blur-user mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: Blur on SSDs...
Date Mon, 22 Jun 2015 12:44:17 GMT
On Tue, Jun 2, 2015 at 1:07 AM, Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

> >
> > Is this the code for the legacy short circuit reads or the newer version
> > that uses named pipes?
>
>
> The legacy short-circuit reads are the domain-socket based ones. They have
> numerous perf issues, as documented here:
> https://issues.apache.org/jira/browse/HDFS-347
>
> The mmap APIs are the latest. They are referred to as "zero copy reads" and
> don't suffer from any of the problems associated with legacy short-circuit
> reads.
>

The only issue with them is that they don't handle file permissions very
well.  I think you have to open the data dirs up to permission 777 across your
cluster to allow the SC reads to work in legacy mode.
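For anyone setting this up: the newer (HDFS-347 style) short-circuit reads go
through a DataNode domain socket instead, which avoids the 777 workaround. A
minimal hdfs-site.xml sketch — the socket path below is just a common example,
adjust for your install:

```xml
<!-- hdfs-site.xml: modern short-circuit local reads over a domain socket.
     No need to open the data dirs to 777 as with the legacy block reader. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
```

The same properties must be set on both the DataNodes and the clients, and the
directory holding the socket has to be secured (root- or hdfs-owned) or the
DataNode will refuse to use it.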


>
> The only thing I find missing is that "unmap" control of blocks is vested
> with hadoop-client...
>

Yeah, "unmapping" is usually a hack because the JVM implements the unmap in a
finalizer that is tied to GC.  So you have no control over when it happens,
unless you do what Lucene does and force it to happen.
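For reference, a minimal sketch of the Lucene-style forced unmap. This version
uses sun.misc.Unsafe.invokeCleaner (Java 9+), so treat it as illustrative
rather than production code — and never touch the buffer after unmapping it:

```java
import java.io.RandomAccessFile;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class ForcedUnmap {

    // Forcibly release a MappedByteBuffer's pages instead of waiting for GC,
    // in the spirit of Lucene's MMapDirectory "unmap hack".
    static void unmap(MappedByteBuffer buffer) throws Exception {
        Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
        Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        Object unsafe = theUnsafe.get(null);
        Method invokeCleaner =
            unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
        invokeCleaner.invoke(unsafe, buffer);
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("mmap-demo", ".bin");
        Files.write(tmp, new byte[]{1, 2, 3, 4});
        try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            System.out.println(buf.get(0) + buf.get(3)); // prints 5
            unmap(buf); // pages released now; buf is dead from here on
        }
        Files.delete(tmp);
    }
}
```

This is exactly the kind of control the thread is asking for: the mapping goes
away when the owner decides, not when the GC gets around to the finalizer.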

Aaron


>
> --
> Ravi
>
> On Tue, Jun 2, 2015 at 12:50 AM, Aaron McCurry <amccurry@gmail.com> wrote:
>
> > On Wed, May 27, 2015 at 7:51 AM, Ravikumar Govindarajan <
> > ravikumar.govindarajan@gmail.com> wrote:
> >
> > > I was thinking about how Blur can effectively use mmap short-circuit
> > > reads from Hadoop. It's kind of long, but please bear with me...
> > >
> > > I checked out the hadoop-2.3.0 source. I am summarizing the logic found
> > > in the DFSInputStream, ClientMmap & ClientMmapManager source files...
> > >
> > > 1. A new method read(ByteBufferPool bufferPool, int maxLength,
> > >    EnumSet<ReadOption> opts) is exposed for short-circuit mmap reads
> > >
> > >
> > > 2. Local blocks are mmapped and added to an LRU cache
> > >
> > > 3. A ref-count is maintained for every mmapped block during reads
> > >
> > > 4. When the ref-count drops to zero for a block, it is unmapped. This
> > >    happens when the incoming read-offset jumps to a block other than the
> > >    current block.
> > >
> > > 5. Unmapping actually happens via a separate reaper thread...
> > >
> > > Step 4 is problematic, because we don't want Hadoop to control
> > > "unmapping" blocks. Ideally, blocks should be unmapped when the original
> > > IndexInput and all clones are closed from the Blur side…
> > >
> > > If someone from the Hadoop community can tell us whether such control is
> > > possible, I feel that we can close any perceived perf-gaps between
> > > regular Lucene *MmapDirectory* and Blur's *HdfsDirectory*.
> > >
> > > It should be very trivial to change HdfsDirectory to use the mmap read
> > > APIs…
> > >
> >
> > Is this the code for the legacy short circuit reads or the newer version
> > that uses named pipes?
> >
> >
> > >
> > > --
> > > Ravi
> > >
> > > On Wed, May 27, 2015 at 12:55 PM, Ravikumar Govindarajan <
> > > ravikumar.govindarajan@gmail.com> wrote:
> > >
> > > >> My guess is that SSDs are only going to help when the blocks for the
> > > >> shard are local and short circuit reads are enabled.
> > > >
> > > >
> > > > Yes, it's a good fit for such a use case alone…
> > > >
> > > >> I would not recommend disabling the block cache.  However, you could
> > > >> likely lower the size of the cache and reduce the overall memory
> > > >> footprint of Blur.
> > > >
> > > >
> > > > Fine. Can we also scale down the machine RAM itself? [Ex: Instead of
> > > > 128GB RAM, we can opt for a 64GB or 32GB RAM slot]
> > > >
> > > >> One interesting thought would be to try using the HDFS cache feature
> > > >> that is present in the most recent versions of HDFS.  I haven't tried
> > > >> it yet but it would be interesting to try.
> > > >>
> > > >
> > > > I did try reading the HDFS cache code. I think it was written for the
> > > > Map-Reduce use-case, where blocks are loaded into memory [basically
> > > > "mmap" followed by "mlock" on data-nodes] just before computation
> > > > begins and unloaded once done.
> > > >
> > > > On the short-circuit reads, I found that the HDFS client offers two
> > > > options for block reads:
> > > > 1. Domain Socket
> > > > 2. Mmap
> > > >
> > > > I think mmap is superior and should have the same performance as
> > > > Lucene's MmapDirectory…
> > > >
> > > > --
> > > > Ravi
> > > >
> > > > On Tue, May 26, 2015 at 8:00 PM, Aaron McCurry <amccurry@gmail.com>
> > > wrote:
> > > >
> > > >> On Fri, May 22, 2015 at 3:33 AM, Ravikumar Govindarajan <
> > > >> ravikumar.govindarajan@gmail.com> wrote:
> > > >>
> > > >> > Recently I have been considering deploying SSDs on the search
> > > >> > machines.
> > > >> >
> > > >> > Each machine runs data-nodes + shard-server, and local reads of
> > > >> > Hadoop are leveraged…
> > > >> >
> > > >> > SSDs are a great fit for general Lucene/Solr kinds of setups. But
> > > >> > for Blur, I need some help…
> > > >> >
> > > >> > 1. Is it a good idea to consider SSDs, especially when block-cache
> > > >> >    is present?
> > > >> >
> > > >>
> > > >> Possibly, I don't have any hard numbers for this type of setup.  My
> > > >> guess is that SSDs are only going to help when the blocks for the
> > > >> shard are local and short circuit reads are enabled.
> > > >>
> > > >>
> > > >> > 2. Are there any grids running Blur on SSDs, and how do they
> > > >> >    compare to normal HDDs?
> > > >> >
> > > >>
> > > >> I haven't run any at scale yet.
> > > >>
> > > >>
> > > >> > 3. Can we disable block-cache on SSDs, especially when local-reads
> > > >> >    are enabled?
> > > >> >
> > > >>
> > > >> I would not recommend disabling the block cache.  However, you could
> > > >> likely lower the size of the cache and reduce the overall memory
> > > >> footprint of Blur.
> > > >>
> > > >>
> > > >> > 4. Using SSDs, Blur/Lucene will surely be CPU bound. But I don't
> > > >> >    know what overheads Hadoop local-reads bring to the table…
> > > >> >
> > > >>
> > > >> If you are using short circuit reads, I have seen performance of
> > > >> local accesses nearing that of native IO.  However, if Blur is making
> > > >> remote HDFS calls, every call is like a cache miss.  One interesting
> > > >> thought would be to try using the HDFS cache feature that is present
> > > >> in the most recent versions of HDFS.  I haven't tried it yet, but it
> > > >> would be interesting to try.
> > > >>
> > > >>
> > > >> >
> > > >> > Any help is much appreciated, because I cannot find any info on
> > > >> > the web about this topic.
> > > >> >
> > > >> > --
> > > >> > Ravi
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
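For reference, a rough sketch of how the zero-copy read API discussed above is
used from the client side (Hadoop 2.3+). The file path is made up, and this
needs a Hadoop client on the classpath plus a live HDFS cluster, so it is
illustrative only, not a drop-in for HdfsDirectory:

```java
import java.nio.ByteBuffer;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZeroCopyReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        ElasticByteBufferPool pool = new ElasticByteBufferPool();
        try (FSDataInputStream in = fs.open(new Path("/blur/shard0/_0.cfs"))) {
            // Zero-copy read: when the block is local and mmap-able, the
            // returned buffer is a slice of the mmapped block; otherwise the
            // client falls back to copying into a buffer from the pool.
            ByteBuffer buf = in.read(pool, 1 << 20,
                EnumSet.of(ReadOption.SKIP_CHECKSUMS));
            if (buf != null) {
                // ... consume buf ...
                in.releaseBuffer(buf); // drops the mmap ref-count (steps 3/4)
            }
        }
    }
}
```

releaseBuffer() is what feeds the ref-counting described in the thread: once
the count for a block hits zero, the client-side ClientMmapManager is free to
unmap it, which is exactly the control Blur would rather keep for itself.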

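And for the HDFS centralized cache feature Aaron mentions, caching is driven
from the command line via cacheadmin; the NameNode then directs DataNodes to
mmap+mlock the replicas under the cached path. Pool and path names below are
made up:

```shell
# Create a cache pool and pin a table's index directory into DataNode memory
hdfs cacheadmin -addPool blur-pool
hdfs cacheadmin -addDirective -path /blur/tables/table1 -pool blur-pool
# Inspect what is cached
hdfs cacheadmin -listDirectives
```

Whether pinned blocks combine well with the zero-copy read path for a
long-lived search workload (as opposed to the Map-Reduce load/compute/unload
pattern described above) would need to be measured.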