hbase-user mailing list archives

From Adrien Mogenet <adrien.moge...@gmail.com>
Subject Re: How Hbase achieves efficient random access?
Date Mon, 07 Jul 2014 16:28:20 GMT
Btw, another article worth reading about block caches:
http://www.n10k.com/blog/blockcache-101/


On Mon, Jul 7, 2014 at 8:26 AM, Vladimir Rodionov <vrodionov@carrieriq.com>
wrote:

> > Another issue is that we cache only blocks. So for workloads with random
> > reads where the working set of blocks does not fit into the aggregate block
> > cache, HBase would need to load an entire block for each KV it wants to
> > read. For those workloads we might want to consider a KV cache. (See also
> > Vladimir's BigBase - https://github.com/VladRodionov/bigbase).
>
> Yes, the upcoming first release of BigBase (later this month) will support
> an SSD cache for both the row (KV) cache and the block cache. You will be
> able to make efficient use of both the server's RAM and the available SSD
> disks (especially useful for those who run HBase on AWS EC2, where all new
> instances come with local SSD disks by default).
>
> Best regards,
> Vladimir Rodionov
>
> http://www.bigbase.org
> ________________________________________
> From: lars hofhansl [larsh@apache.org]
> Sent: Saturday, July 05, 2014 5:23 AM
> To: user@hbase.apache.org
> Subject: Re: How Hbase achieves efficient random access?
>
> What Ted and Intae said.
>
> Are you asking out of interest or do you see performance issues?
>
> One "issue" is that the KeyValues (KVs) in a block are not indexed. KVs
> are variable length, and hence once a block is loaded it needs to be
> searched linearly in order to find the KV (or determine its absence).
> It's on my list of things to investigate: noting the start offsets of all
> KVs somewhere and hence allowing a binary search of the KVs.
>
> Since blocks are small (64k by default) it might not make a difference,
> but we should check.
>
> Another issue is that we cache only blocks. So for workloads with random
> reads where the working set of blocks does not fit into the aggregate block
> cache, HBase would need to load an entire block for each KV it wants to
> read. For those workloads we might want to consider a KV cache. (See also
> Vladimir's BigBase - https://github.com/VladRodionov/bigbase).
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Ted Yu <yuzhihong@gmail.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org>
> Sent: Friday, July 4, 2014 7:39 AM
> Subject: Re: How Hbase achieves efficient random access?
>
>
> For description of HFile v2, see http://hbase.apache.org/book.html#hfilev2
>
> For block cache, see http://hbase.apache.org/book.html#block.cache
>
> In "HBase In Action", starting on page 28, there is a description of the
> read path.
>
> Cheers
>
>
>
> On Fri, Jul 4, 2014 at 2:02 AM, Intae Kim <inking007@gmail.com> wrote:
>
> > Apart from the memstore, block cache, HFile count, etc.:
> >
> > Simply stated, data is sorted in a file called an HFile (composed of
> > blocks). When a client tries to access data, HBase searches for the proper
> > block in the file and loads it to check whether the block has the data.
> >
> > See the HFile format for more details (meta index, data index, ...).
> >
> > Good Luck!!
> >
> >
> > 2014-07-04 17:30 GMT+09:00 Ted Yu <yuzhihong@gmail.com>:
> >
> > > Please take a look at http://hbase.apache.org/book/perf.reading.html
> > >
> > > Cheers
> > >
> > > On Jul 4, 2014, at 12:22 AM, yl wu <wuyl6099@gmail.com> wrote:
> > >
> > > > Hi All,
> > > >
> > > > HBase has a sorted and indexed HFile format, which enables fast lookup.
> > > > I am wondering whether any other features help HBase achieve efficient
> > > > random access.
> > > > I want to know the whole story, but I can't find any article that talks
> > > > about random access in HBase at a high level.
> > > >
> > > > Can anyone help me resolve my confusion about this?
> > > >
> > > > Best,
> > > > Yanglin
> > >
> >
>
>
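
To make Lars's point about unindexed KVs concrete, here is a minimal sketch of the idea he describes: record the start offset of every KV in a block so the block can be binary searched instead of scanned. The record layout (4-byte key length, 4-byte value length, key, value) and all function names are assumptions made for illustration; the real HBase KeyValue and HFile block encodings are more involved.

```python
import struct


def encode_block(kvs):
    """Serialize sorted (key, value) byte pairs as length-prefixed records.

    Returns the block bytes plus the start offset of every record
    (the hypothetical per-block offset index).
    """
    buf = bytearray()
    offsets = []
    for key, value in kvs:
        offsets.append(len(buf))
        buf += struct.pack(">II", len(key), len(value)) + key + value
    return bytes(buf), offsets


def read_kv(block, pos):
    """Decode the record starting at pos; return (key, value, next_pos)."""
    klen, vlen = struct.unpack_from(">II", block, pos)
    start = pos + 8
    key = block[start:start + klen]
    value = block[start + klen:start + klen + vlen]
    return key, value, start + klen + vlen


def linear_search(block, target):
    """Without offsets, records must be walked from the start of the block."""
    pos = 0
    while pos < len(block):
        key, value, pos = read_kv(block, pos)
        if key == target:
            return value
        if key > target:  # keys are sorted, so the target is absent
            return None
    return None


def binary_search(block, offsets, target):
    """With start offsets, any record can be decoded directly."""
    lo, hi = 0, len(offsets) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        key, value, _ = read_kv(block, offsets[mid])
        if key == target:
            return value
        if key < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

The linear scan decodes O(n) records per lookup while the indexed version decodes O(log n), at the cost of storing one offset per KV. Whether that matters for 64k blocks is exactly the open question raised in the thread.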



-- 
Adrien Mogenet
http://www.borntosegfault.com
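
The block cache versus KV cache trade-off discussed in the thread can be illustrated with a small simulation. This is only a sketch under stated assumptions: `LruCache`, `simulate`, the 100-byte KV size, and the uniformly random access pattern are all hypothetical and are not taken from HBase or BigBase. Only the 64k block size comes from the thread.

```python
import random
from collections import OrderedDict

BLOCK_SIZE = 64 * 1024                  # default block size cited in the thread
KV_SIZE = 100                           # assumed average KV size in bytes
KVS_PER_BLOCK = BLOCK_SIZE // KV_SIZE


class LruCache:
    """Least-recently-used cache bounded by total bytes (illustrative only)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()    # key -> (value, size)

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)   # mark as most recently used
        return self.entries[key][0]

    def put(self, key, value, size):
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        self.entries[key] = (value, size)
        self.used += size
        while self.used > self.capacity and self.entries:
            _, (_, evicted_size) = self.entries.popitem(last=False)
            self.used -= evicted_size


def simulate(num_blocks, cache_bytes, hot_rows, reads, seed=0):
    """Return (block_cache_hit_rate, kv_cache_hit_rate) for random reads."""
    rng = random.Random(seed)
    total_rows = num_blocks * KVS_PER_BLOCK
    hot = rng.sample(range(total_rows), hot_rows)  # hot rows scattered over blocks
    block_cache = LruCache(cache_bytes)
    kv_cache = LruCache(cache_bytes)
    block_hits = kv_hits = 0
    for _ in range(reads):
        row = rng.choice(hot)
        block_id = row // KVS_PER_BLOCK
        if block_cache.get(block_id) is not None:
            block_hits += 1
        else:
            block_cache.put(block_id, True, BLOCK_SIZE)  # whole block is cached
        if kv_cache.get(row) is not None:
            kv_hits += 1
        else:
            kv_cache.put(row, True, KV_SIZE)             # only the KV is cached
    return block_hits / reads, kv_hits / reads
```

For example, `simulate(num_blocks=1000, cache_bytes=10 * BLOCK_SIZE, hot_rows=500, reads=5000)` gives both caches the same byte budget. The block cache can hold only 10 blocks while the hot rows are spread over many more, whereas the KV cache can hold every hot row, which is the situation Lars describes where the working set of blocks does not fit in the block cache.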
