hbase-user mailing list archives

From Ey-Chih chow <eyc...@gmail.com>
Subject Re: improve performance of a MapReduce job with HBase input
Date Fri, 25 May 2012 18:30:40 GMT
Thanks.  Since we use TableInputFormat in our map/reduce job, the scan object is created inside
TableInputFormat.  Is there any way to get at that scan object to set caching?

Ey-Chih Chow

On May 25, 2012, at 11:24 AM, Alok Kumar wrote:

> Hi,
> You can make use of the 'setCaching' method of your Scan object.
> E.g.:
> Scan objScan = new Scan();
> objScan.setCaching(100); // set it to some integer, as per your use case.
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCaching(int)
> thanks,
> Alok
> On Fri, May 25, 2012 at 11:33 PM, Ey-Chih chow <eychih@gmail.com> wrote:
>> Hi,
>> We have a MapReduce job whose input data comes from HBase.  We would like
>> to improve the performance of the job.  According to the HBase book, we can do
>> that by setting the scan caching to a number higher than the default.  We use
>> TableInputFormat to read data in the job.  I looked at the implementation
>> of the class; it does not set caching when the scan object is
>> created.  Does anybody know how to externally set caching for the scan
>> created in TableInputFormat?  Thanks.
>> Ey-Chih Chow
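
One way to set caching without modifying TableInputFormat (assuming an HBase release whose org.apache.hadoop.hbase.mapreduce.TableInputFormat recognizes the hbase.mapreduce.scan.cachedrows property, exposed as the TableInputFormat.SCAN_CACHEDROWS constant) is to put the value in the job configuration, since TableInputFormat builds its internal Scan from configuration properties. A sketch; 500 is an illustrative value, to be tuned to row size and available memory:

```xml
<!-- Rows fetched per scanner RPC by the Scan that TableInputFormat creates. -->
<property>
  <name>hbase.mapreduce.scan.cachedrows</name>
  <value>500</value>
</property>
```

Programmatically, the equivalent would be job.getConfiguration().setInt(TableInputFormat.SCAN_CACHEDROWS, 500) before submitting the job. Alternatively, TableMapReduceUtil.initTableMapperJob accepts a caller-constructed Scan, so a Scan with setCaching already applied can be handed to the job setup instead of letting TableInputFormat build one from configuration.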
