hbase-user mailing list archives

From jmozah <jmo...@gmail.com>
Subject Re: Using HBase serving to replace memcached
Date Tue, 21 Aug 2012 14:45:48 GMT
> 1. I know very basics of Bloom filters, which is used for detect whether an item is in
> a set. How to use Bloom filters in HBase to improve random read performance? Could you show
> me an example? Thanks.

A Bloom filter lets HBase skip loading blocks that do not contain the given row, which
saves I/O and reduces cache churn.
For more on Bloom filters, see:
1 - https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf
2 - http://www.quora.com/How-are-bloom-filters-used-in-HBase
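
To make the idea concrete, here is a minimal toy sketch in Python. It is not HBase code: the names BloomFilter, StoreFile, and get are hypothetical, and the sizes (1024 bits, 3 hashes) are arbitrary assumptions. It only illustrates the principle that a read consults a small in-memory filter first and touches the data only when the filter says the row might be there.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class StoreFile:
    """Hypothetical stand-in for an on-disk file holding some rows."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for row_key in self.rows:
            self.bloom.add(row_key)
        self.block_reads = 0  # counts simulated disk/cache accesses

    def read(self, row_key):
        self.block_reads += 1
        return self.rows.get(row_key)


def get(store_files, row_key):
    """Look up a row, reading only files whose filter might contain it."""
    for store_file in store_files:
        if not store_file.bloom.might_contain(row_key):
            continue  # definitely absent: skip the expensive read
        value = store_file.read(row_key)
        if value is not None:
            return value
    return None
```

A lookup that the filter rejects costs no read at all. A false positive only costs one wasted read, so the result is still correct.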

> 2. "Also more client connections is one more issue that might infest you" -- supposing
> I am doing random read from a Hadoop job to access HBase, do you mean using multiple client
> connections from the Hadoop job is good or not good? Sorry I am a bit lost. :-)

A single Hadoop job doing random reads is perfectly fine. But since you said "Handling directly
user traffic", I assumed you wanted to expose HBase directly to every client request, which
would mean as many connections as there are simultaneous requests.
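
One common way to avoid a connection per request is to put a small, fixed pool of connections between the user traffic and the store. The sketch below is a generic Python illustration under that assumption, not HBase client code: Connection, ConnectionPool, and serve are hypothetical names, and the fake Connection just returns a string.

```python
import queue
import threading


class Connection:
    """Hypothetical stand-in for an expensive client connection."""

    created = 0                 # how many connections were ever opened
    _lock = threading.Lock()

    def __init__(self):
        with Connection._lock:
            Connection.created += 1

    def get(self, row_key):
        return f"value-for-{row_key}"


class ConnectionPool:
    """Fixed-size pool: requests borrow a connection and hand it back."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(Connection())

    def get(self, row_key):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            return conn.get(row_key)
        finally:
            self._idle.put(conn)


def serve(pool, row_keys):
    """Handle each request on its own thread, sharing the pool."""
    results = {}

    def handle(key):
        results[key] = pool.get(key)

    threads = [threading.Thread(target=handle, args=(k,)) for k in row_keys]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With a pool of size 4, fifty simultaneous requests still use only four connections. The extra requests wait briefly for a free one instead of opening their own.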

> 3. "asynchbase will help you" -- does HBase support asynchronous API? Sorry I cannot
> find it out. Appreciate if you could point me the APIs you are referring to.

Not in the default HTable API. asynchbase is a separate client for HBase. Read more about it
here: https://github.com/stumbleupon/asynchbase
