hbase-dev mailing list archives

From Dhruba Borthakur <dhr...@gmail.com>
Subject Re: SILT - nice keyvalue store paper
Date Sat, 22 Oct 2011 08:17:09 GMT
Hi Todd,

Thanks for forwarding this paper. I have been mulling similar ideas in my
head for sometime.

One of the current reasons HBase eats a lot of CPU is that the memstore is a
sorted set. Instead, we can make it a hash set so that lookups and insertions
are much faster. At flush time, we can sort the snapshot memstore and write it
out to HDFS. This should decrease Put latencies to a great extent. I will
experiment to see how this fares with a real-life workload.
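
Purely as an illustration of the shape I have in mind (a hypothetical class,
not the actual MemStore code): edits go into a hash map on the write path, and
the sort cost is paid only once when the flush snapshot is taken.

    import java.nio.ByteBuffer;
    import java.util.SortedMap;
    import java.util.TreeMap;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class HashBackedMemstore {
      // ByteBuffer keys give content-based hashCode/equals for row keys.
      private final ConcurrentMap<ByteBuffer, byte[]> edits =
          new ConcurrentHashMap<ByteBuffer, byte[]>();

      public void put(byte[] row, byte[] value) {
        edits.put(ByteBuffer.wrap(row), value);   // no comparator walk on Put
      }

      public byte[] get(byte[] row) {
        return edits.get(ByteBuffer.wrap(row));   // point lookups stay O(1)
      }

      // Sorting is deferred to flush time, just before the HFile is written
      // to HDFS. (A real memstore would use the KeyValue comparator rather
      // than ByteBuffer's natural byte order.)
      public SortedMap<ByteBuffer, byte[]> snapshotSorted() {
        return new TreeMap<ByteBuffer, byte[]>(edits);
      }
    }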

I have also been playing around with a 5-node test cluster that has flash
drives. The flash drives are mounted as xfs filesystems.
LookasideCacheFileSystem (http://bit.ly/pnGju0) is a client-side layered
filter driver on top of HDFS. When HBase flushes data to HDFS, it is cached
transparently in the LookasideCacheFileSystem, which uses the flash drive as
a cache. The assumption here is that recently flushed HFiles are more likely
to be accessed than HFiles that were flushed earlier (I have not yet dealt
with major compactions). I will be measuring the performance benefit of this
configuration.
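
The real code is at the link above; just to sketch the layering (class and
field names here are mine, not the actual implementation), the read path looks
roughly like a FilterFileSystem that checks the flash cache before falling
through to HDFS:

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FilterFileSystem;
    import org.apache.hadoop.fs.Path;

    public class LookasideReadSketch extends FilterFileSystem {
      private final FileSystem flashFs;  // local fs over the xfs-mounted flash drive
      private final String cacheRoot;    // e.g. "/flash/hbase-cache" (hypothetical)

      public LookasideReadSketch(FileSystem hdfs, FileSystem flashFs, String cacheRoot) {
        super(hdfs);
        this.flashFs = flashFs;
        this.cacheRoot = cacheRoot;
      }

      @Override
      public FSDataInputStream open(Path f, int bufferSize) throws IOException {
        Path cached = new Path(cacheRoot + f.toUri().getPath());
        if (flashFs.exists(cached)) {
          // hit: serve the recently flushed HFile from flash
          return flashFs.open(cached, bufferSize);
        }
        // miss: fall through to the underlying HDFS filesystem
        return fs.open(f, bufferSize);
      }
    }

The write path would mirror this: on flush, the bytes are written to HDFS as
usual and teed into the cache path on the flash drive.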

thanks,
dhruba


On Wed, Oct 12, 2011 at 10:42 PM, Todd Lipcon <todd@cloudera.com> wrote:

> Read this paper last night and thought it had some nice ideas that
> would be applicable to HBase:
>
> http://www.cs.cmu.edu/~dga/papers/silt-sosp2011.pdf
>
> I especially like the highly compact indexing scheme (entropy-coded tries)
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Subscribe to my posts at http://www.facebook.com/dhruba
