hbase-user mailing list archives

From Pablo Medina <pablomedin...@gmail.com>
Subject Performance penalty: Custom Filter names serialization
Date Tue, 20 Aug 2013 15:56:28 GMT
Hi all,

I'm using custom filters to retrieve filtered data from HBase through the
native API. I noticed that the fully qualified class names of those custom
filters are sent as the byte representation of the string, written with
Text.writeString(). This consumes a lot of network bandwidth in my case,
because I use 5 custom filters per Get and issue 1.5 million Gets per minute.
I took a look at the code (org.apache.hadoop.hbase.io.HbaseObjectWritable)
and it seems that HBase registers its known classes (Get, Put, etc.) and
associates each of them with an Integer (CODE_TO_CLASS and CLASS_TO_CODE). That
integer is sent instead of the full class name for those known classes. I
ran a test shortening my custom filter class names to 2 or 3 letters and it
improved my performance by 25%.
Is there any way to "register" my custom filter classes to behave the same
as HBase's classes? If not, does it make sense to introduce a change to do
that? Is there any other workaround for this issue?
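
To make the size difference concrete, here is a minimal sketch of the two
encodings. It is not HBase's actual code: ClassCodeSketch, its registry, the
code value 100 and the filter name com.example.filters.MyCustomRowFilter are
all hypothetical, and the fixed 4-byte length prefix is a simplification
(Text.writeString() writes a variable-length prefix before the UTF-8 bytes).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: mimics the idea of a class-to-code table,
// not the real HbaseObjectWritable implementation.
final class ClassCodeSketch {
    // Hypothetical registry playing the role of CLASS_TO_CODE.
    private static final Map<String, Integer> NAME_TO_CODE = new HashMap<>();

    public static void register(String className, int code) {
        NAME_TO_CODE.put(className, code);
    }

    // Encode the class as a length-prefixed UTF-8 string
    // (simplified: 4-byte length instead of a variable-length one).
    public static byte[] encodeByName(String className) {
        byte[] utf8 = className.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + utf8.length)
                .putInt(utf8.length)
                .put(utf8)
                .array();
    }

    // Encode the class as a single-byte code when it is registered;
    // otherwise fall back to sending the full name.
    public static byte[] encodeByCode(String className) {
        Integer code = NAME_TO_CODE.get(className);
        if (code == null) {
            return encodeByName(className);
        }
        return new byte[] { code.byteValue() };
    }

    public static void main(String[] args) {
        String filter = "com.example.filters.MyCustomRowFilter"; // hypothetical name
        register(filter, 100); // hypothetical code

        int byName = encodeByName(filter).length;
        int byCode = encodeByCode(filter).length;

        // Workload from the message: 5 filters per Get, 1.5 million Gets per minute.
        long namesPerMinute = 5L * 1_500_000L;

        System.out.println("bytes per filter, by name: " + byName);
        System.out.println("bytes per filter, by code: " + byCode);
        System.out.println("bytes per minute, by name: " + namesPerMinute * byName);
        System.out.println("bytes per minute, by code: " + namesPerMinute * byCode);
    }
}
```

The sketch only counts the bytes spent identifying the class; each filter's
own serialized fields would be sent in addition under either encoding.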

