hbase-user mailing list archives

From Mikael Sitruk <mikael.sit...@gmail.com>
Subject Re: How and where exactly LSM trees are used in HBase?
Date Mon, 09 Dec 2013 13:08:03 GMT
LSM trees are the basis for reducing random I/O, which is a huge performance
factor in big data systems. A good overview can be found in the book
"HBase: The Definitive Guide" by Lars George.
The basic idea is that you have an in-memory structure for the latest
changes and a structure stored in files. The file content is always
ordered by key, and each entry in a file is just the row key, column family
identifier, column name, timestamp and the value (+ a marker).
When the memory is full, the in-memory structure is flushed to disk. When
a certain number of files has accumulated on the filesystem, they are merged
into bigger ones. Since the files are ordered, the merge is very fast (like
the merge step of the mergesort algorithm).
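
To make the flush and merge steps concrete, here is a minimal sketch in
Python. It is only a toy model and not HBase code: the class ToyLsmStore and
the parameters flush_threshold and compact_threshold are made-up names, the
"files" are in-memory lists, and the thresholds count entries and files
rather than bytes. Real HBase decides when to flush and compact with more
elaborate rules.

```python
import heapq
from typing import Dict, List, Tuple

# Sort key of one entry: (row key, column family, column name, -timestamp).
# Negating the timestamp makes newer versions of a column sort first.
Key = Tuple[str, str, str, int]
Cell = Tuple[Key, str]  # (key, value)


class ToyLsmStore:
    """Toy LSM store (hypothetical, for illustration only)."""

    def __init__(self, flush_threshold: int = 3, compact_threshold: int = 3) -> None:
        self.flush_threshold = flush_threshold      # entries held in memory
        self.compact_threshold = compact_threshold  # files before merging
        self.memstore: Dict[Key, str] = {}          # latest changes, in memory
        self.files: List[List[Cell]] = []           # each "file" is a sorted list

    def put(self, row: str, family: str, column: str, timestamp: int, value: str) -> None:
        self.memstore[(row, family, column, -timestamp)] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Write the in-memory entries out as one new file, ordered by key."""
        if not self.memstore:
            return
        self.files.append(sorted(self.memstore.items(), key=lambda cell: cell[0]))
        self.memstore = {}
        if len(self.files) >= self.compact_threshold:
            self.compact()

    def compact(self) -> None:
        """Merge all sorted files into one bigger sorted file.

        Because every input is already ordered, heapq.merge only has to
        compare the current head of each file, like the merge step of
        merge sort. No full re-sort is needed.
        """
        merged = list(heapq.merge(*self.files, key=lambda cell: cell[0]))
        self.files = [merged]
```

With both thresholds set to 2, four put() calls produce two flushes and then
one compaction, leaving a single file whose entries are ordered by row key,
with the newest version of each column first.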

On Sun, Dec 8, 2013 at 8:42 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Searching for 'lsm tree hbase' would give you several articles.
> I am in China - the search results are mostly in Chinese.
> You should be able to read this:
> http://stackoverflow.com/questions/13762992/log-structured-merge-tree-in-hbase
> Cheers
> On Wed, Dec 4, 2013 at 6:49 PM, AnilKumar B <akumarb2010@gmail.com> wrote:
> > Hi,
> >
> > We are trying to understand how and where exactly LSM trees are used in
> > HBase. Currently as per our understanding, while flushing memstore to
> > Store files and while HFile compaction it is used. And sits on top of
> > HFiles at memstore level.
> >
> > Is this understanding correct? Can you please give more insight on this?
> > How exactly is the merging done?
> >
> > Thanks & Regards,
> > B Anil Kumar.
> >
