hbase-user mailing list archives

From Federico Gaule <fga...@despegar.com>
Subject Re: Performance penalty: Custom Filter names serialization
Date Tue, 20 Aug 2013 19:31:23 GMT
Hi everyone,

I'm facing the same issue as Pablo. Renaming the classes I use in the HBase
context reduced network usage by more than 20%. It would be really nice to
have an improvement around this.

On 08/20/2013 01:15 PM, Jean-Marc Spaggiari wrote:
> But even if we are using Protobuf, he is going to face the same issue,
> right?
> We should have a way to send the filter once, along with a number that tells
> the regions that this filter, moving forward, will be represented by this
> number. There is some risk of reusing a number that another filter is
> already using, but I'm sure we can come up with some mechanism to avoid that.
> 2013/8/20 Ted Yu <yuzhihong@gmail.com>
>> Are you using HBase 0.92 or 0.94 ?
>> In 0.95 and later releases, HbaseObjectWritable doesn't exist. Protobuf is
>> used for communication.
>> Cheers
>> On Tue, Aug 20, 2013 at 8:56 AM, Pablo Medina <pablomedina85@gmail.com> wrote:
>>> Hi all,
>>> I'm using custom filters to retrieve filtered data from HBase using the
>>> native API. I noticed that the fully qualified class names of those custom
>>> filters are being sent as the byte representation of the string, using
>>> Text.writeString(). This consumes a lot of network bandwidth in my case,
>>> because I use 5 custom filters per Get and issue 1.5 million Gets per
>>> minute.
>>> I took a look at the code (org.apache.hadoop.hbase.io.HbaseObjectWritable)
>>> and it seems that HBase registers its known classes (Get, Put, etc.) and
>>> associates them with an Integer (CODE_TO_CLASS and CLASS_TO_CODE). That
>>> integer is sent instead of the full class name for those known classes. I
>>> ran a test reducing my custom filter class names to 2 or 3 letters, and it
>>> improved my performance by 25%.
>>> Is there any way to "register" my custom filter classes so they behave the
>>> same as HBase's classes? If not, does it make sense to introduce a change
>>> to do that? Is there any other workaround for this issue?
>>> Thanks!
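
To make the trade-off Pablo describes concrete, here is a minimal sketch of a
class-to-code registry in the spirit of CLASS_TO_CODE / CODE_TO_CLASS. It is
not HBase code: ClassCodeRegistry, register, lookup and encodedSize are
hypothetical names, the filter class name is made up, and the wire format is
simplified (one byte for a registered code, a one-byte length prefix plus the
UTF-8 bytes for an unregistered name). Only the 5 filters per Get and the
1.5 million Gets per minute come from the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of HBase: maps class names to small integer
// codes so that a code can be sent instead of the full class name.
final class ClassCodeRegistry {
    private final Map<String, Integer> classToCode = new HashMap<>();
    private final Map<Integer, String> codeToClass = new HashMap<>();

    // Registers a class name under a code; rejects reuse of a name or code.
    void register(String className, int code) {
        if (code < 0 || code > 127) {
            throw new IllegalArgumentException("code must fit in one byte: " + code);
        }
        if (classToCode.containsKey(className) || codeToClass.containsKey(code)) {
            throw new IllegalArgumentException("name or code already registered");
        }
        classToCode.put(className, code);
        codeToClass.put(code, className);
    }

    // Returns the class name for a code, or null if the code is unknown.
    String lookup(int code) {
        return codeToClass.get(code);
    }

    // Bytes needed to identify the class under the simplified format above.
    // Assumes names shorter than 128 bytes, so the length prefix is one byte.
    int encodedSize(String className) {
        if (classToCode.containsKey(className)) {
            return 1; // just the code
        }
        return 1 + className.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        ClassCodeRegistry registry = new ClassCodeRegistry();
        String name = "com.example.filters.CustomerSegmentFilter"; // made-up name

        int before = registry.encodedSize(name);
        registry.register(name, 100);
        int after = registry.encodedSize(name);

        // 5 filters per Get and 1.5 million Gets per minute, as in the thread.
        long savedPerMinute = 5L * 1_500_000L * (before - after);
        System.out.println("bytes per filter name: " + before + " -> " + after);
        System.out.println("bytes saved per minute: " + savedPerMinute);
    }
}
```

The duplicate check in register is the simplest guard against the risk
Jean-Marc mentions of reusing a number that is already taken. A real mechanism
would also need the client and the region servers to agree on the mapping.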
