hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: Beware of PREFIX_TREE block encoding
Date Sun, 20 Oct 2013 05:45:42 GMT
I wanted to try PREFIX_TREE because it is supposed to be the fastest on
seek/reseek.
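
For reference, the slowdown factors implied by the timings quoted further down in this thread can be worked out with a few lines of Java. This is only a sketch of the arithmetic: the four timings and the 1.3M row count are copied from the quoted benchmark, while the class and method names are made up for illustration and are not part of HBase.

```java
// Arithmetic on the timings quoted in this thread (ms to read 1.3M rows).
// EncodingSlowdown, slowdown() and nanosPerRow() are illustrative names only.
final class EncodingSlowdown {
    static final long ROWS = 1_300_000L;

    /** How many times slower the encoded read is than the unencoded baseline. */
    static double slowdown(double encodedMs, double baselineMs) {
        return encodedMs / baselineMs;
    }

    /** Average cost per row, in nanoseconds, for one full pass over the rows. */
    static double nanosPerRow(double totalMs, long rows) {
        return totalMs * 1_000_000.0 / rows;
    }

    public static void main(String[] args) {
        // StoreScanner: NONE = 300 ms, PREFIX_TREE = 860 ms
        System.out.printf("StoreScanner slowdown:     %.1fx%n", slowdown(860, 300));
        // StoreFileScanner: NONE = 52 ms, PREFIX_TREE = 545 ms
        System.out.printf("StoreFileScanner slowdown: %.1fx%n", slowdown(545, 52));
        System.out.printf("NONE, StoreFileScanner:        %.0f ns/row%n", nanosPerRow(52, ROWS));
        System.out.printf("PREFIX_TREE, StoreFileScanner: %.0f ns/row%n", nanosPerRow(545, ROWS));
    }
}
```

On these numbers the PREFIX_TREE penalty is roughly 2.9x through StoreScanner and roughly 10.5x through StoreFileScanner. The gap between the two ratios comes from the fixed per-row overhead StoreScanner adds to both encodings.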


On Sat, Oct 19, 2013 at 9:12 PM, lars hofhansl <larsh@apache.org> wrote:

> I found FAST_DIFF to be the fastest of the block encoders.
> (Prefix tree is in 0.96+ only as far as I know.)
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Vladimir Rodionov <vladrodionov@gmail.com>
> To: "dev@hbase.apache.org" <dev@hbase.apache.org>; lars hofhansl <larsh@apache.org>
> Cc:
> Sent: Saturday, October 19, 2013 9:08 PM
> Subject: Re: Beware of PREFIX_TREE block encoding
>
> *Now, which encoder did you test specifically? I've seen a 20-40% slowdown
> when everything is in the blockcache (which is the worst-case scenario
> here), certainly not a 10x slowdown.*
>
> I have 1.3M rows (very small, 48 bytes each) in the block cache, which I read
> sequentially using encodings NONE and PREFIX_TREE, through both StoreScanner
> and StoreFileScanner (close to the metal: straight from the block cache :)
>
> Time to read all 1.3M rows reported in ms.
>
> encoding = NONE,        scanner = StoreScanner;     time = 300 ms
> encoding = PREFIX_TREE, scanner = StoreScanner;     time = 860 ms
> encoding = NONE,        scanner = StoreFileScanner; time =  52 ms
> encoding = PREFIX_TREE, scanner = StoreFileScanner; time = 545 ms
>
> -Vladimir
>
>
>
>
> On Sat, Oct 19, 2013 at 8:50 PM, lars hofhansl <larsh@apache.org> wrote:
>
> > That is (unfortunately) a known issue. The main problem is that HBase
> > expects each KV to be backed by a contiguous byte[]. For any prefix
> > encoding it is thus necessary to rematerialize the KV (i.e. copy all the
> > partial bytes into a new location).
> > That is inefficient. Nobody has taken on fixing this yet (we're halfway
> > there with Cells in 0.96, though).
> >
> > There are JIRAs out there to fix this, like HBASE-7320 and, more recently,
> > HBASE-9794.
> >
> > Now, which encoder did you test specifically? I've seen a 20-40% slowdown
> > when everything is in the blockcache (which is the worst-case scenario
> > here), certainly not a 10x slowdown.
> >
> > Note that with block encoding the blocks are stored encoded in the
> > blockcache, so more data fits into the cache, and (obviously) there's
> > less IO when the data is not in the cache. So the extra CPU cycles and
> > memory bandwidth used are offset by that.
> >
> > There are other problems too. I just filed an issue (HBASE-9807) where,
> > with block encoders, we make a copy of the key portion of the KV on each
> > reseek, just to compare it to the current scan key.
> >
> > -- Lars
> > ________________________________
> > From: Vladimir Rodionov <vrodionov@carrieriq.com>
> > To: "dev@hbase.apache.org" <dev@hbase.apache.org>
> > Sent: Saturday, October 19, 2013 7:34 PM
> > Subject: RE: Beware of PREFIX_TREE block encoding
> >
> >
> > What did I want to say by this? HBase still does not have a block encoding
> > that is optimal for both scan and seek (reseek).
> > I do not think these goals are mutually exclusive.
> >
> >
> > Best regards,
> > Vladimir Rodionov
> > Principal Platform Engineer
> > Carrier IQ, www.carrieriq.com
> > e-mail: vrodionov@carrieriq.com
> >
> > ________________________________________
> >
> > From: Vladimir Rodionov [vladrodionov@gmail.com]
> > Sent: Saturday, October 19, 2013 7:32 PM
> > To: dev@hbase.apache.org
> > Subject: Beware of PREFIX_TREE block encoding
> >
> > The scan performance is bad: 10x slower in my tests than for blocks with
> > NONE encoding. I scan data directly from the block cache through
> > StoreFileScanner (bypassing all the StoreScanner/KeyValueHeap stuff). It
> > should be clearly stated that this encoding degrades overall performance
> > significantly in favor of data size reduction and is suitable only for
> > Gets, not for Scans.
> >
> > Best regards,
> > -Vladimir Rodionov
> >
> >
>
>
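
The rematerialization cost Lars describes in the quoted reply (copying the partial bytes of a prefix-encoded key into a new contiguous byte[]) can be illustrated with a toy encoder. This is a minimal sketch under loud assumptions: it is not the PREFIX_TREE or FAST_DIFF on-disk format, and `PrefixRematerializeSketch`, `Entry`, `encode` and `decode` are hypothetical names invented for this example. It only shows why every decoded key costs an allocation plus two copies when the consumer insists on a contiguous array.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy prefix encoding; NOT HBase's actual block format.
final class PrefixRematerializeSketch {
    /** One encoded key: reuse sharedPrefixLen bytes of the previous key, then append suffix. */
    static final class Entry {
        final int sharedPrefixLen;
        final byte[] suffix;
        Entry(int sharedPrefixLen, byte[] suffix) {
            this.sharedPrefixLen = sharedPrefixLen;
            this.suffix = suffix;
        }
    }

    /** Encodes sorted keys by storing only the bytes that differ from the previous key. */
    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int shared = 0;
            int max = Math.min(prev.length, key.length);
            while (shared < max && prev[shared] == key[shared]) {
                shared++;
            }
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            prev = key;
        }
        return out;
    }

    /** Rebuilds every key as a fresh contiguous byte[]: one allocation and two copies per key. */
    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedPrefixLen + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.sharedPrefixLen);                    // shared prefix
            System.arraycopy(e.suffix, 0, key, e.sharedPrefixLen, e.suffix.length);  // own suffix
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        for (String s : new String[] {"row0001", "row0002", "row0010"}) {
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        for (byte[] k : decode(encode(keys))) {
            System.out.println(new String(k, StandardCharsets.UTF_8));
        }
    }
}
```

With NONE encoding a scanner can hand out a view over bytes that already sit contiguously in the cached block, so the per-key work in `decode` has no counterpart there. That difference is one plausible reading of why the gap is widest in the StoreFileScanner timings, where little else happens per row.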
