hbase-user mailing list archives

From Ey-Chih chow <eyc...@gmail.com>
Subject Re: improve performance of a MapReduce job with HBase input
Date Fri, 25 May 2012 19:25:37 GMT
Thanks.  This works.

Ey-Chih Chow

On May 25, 2012, at 11:33 AM, Jean-Daniel Cryans wrote:

> TIF should be configured via TableMapReduceUtil.initTableMapperJob
> which takes a Scan object.
> 
> J-D
> 
> On Fri, May 25, 2012 at 11:30 AM, Ey-Chih chow <eychih@gmail.com> wrote:
>> Thanks.  Since we use TableInputFormat in our map/reduce job, the scan object is
>> created inside TableInputFormat.  Is there any way to get the scan object to set caching?
>> 
>> Ey-Chih Chow
>> 
>> On May 25, 2012, at 11:24 AM, Alok Kumar wrote:
>> 
>>> Hi,
>>> 
>>> You can make use of the 'setCaching' method of your Scan object.
>>> 
>>> Eg:
>>> Scan objScan = new Scan();
>>> objScan.setCaching(100); // set it to some integer, as per your use case.
>>> 
>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCaching(int)
>>> 
>>> thanks,
>>> Alok
>>> 
>>> On Fri, May 25, 2012 at 11:33 PM, Ey-Chih chow <eychih@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> We have a MapReduce job whose input data comes from HBase.  We would like
>>>> to improve the performance of the job.  According to the HBase book, we can do
>>>> that by setting scan caching to a number higher than the default.  We use
>>>> TableInputFormat to read data for the job.  I looked at the implementation
>>>> of the class; it does not set caching when the scan object is
>>>> created.  Does anybody know how to externally set caching for the scan
>>>> created in TableInputFormat?  Thanks.
>>>> 
>>>> Ey-Chih Chow
>> 
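[Editor's note] J-D's suggestion above can be sketched as follows: build the Scan yourself, raise its caching, and pass it to TableMapReduceUtil.initTableMapperJob. This is a minimal sketch, not the poster's actual job; the table name ("my_table"), the mapper class, and the output types are hypothetical placeholders, and it assumes the Hadoop/HBase APIs of that era (Job constructor, TableMapper) are on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class CachedScanJob {

    // Hypothetical mapper: emits the row key of each HBase row it reads.
    static class RowKeyMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            context.write(new Text(row.get()), new Text(""));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "cached-scan-job");
        job.setJarByClass(CachedScanJob.class);

        // Build the Scan up front and raise caching above the default
        // before handing it to initTableMapperJob.
        Scan scan = new Scan();
        scan.setCaching(500);        // rows fetched per RPC; tune per use case
        scan.setCacheBlocks(false);  // block caching rarely helps a full MR scan

        // initTableMapperJob wires the Scan into TableInputFormat for us.
        TableMapReduceUtil.initTableMapperJob(
                "my_table",          // hypothetical table name
                scan,                // the pre-configured Scan
                RowKeyMapper.class,
                Text.class,
                Text.class,
                job);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```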

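[Editor's note] For the "externally set caching" question as originally asked: if the Scan really is created inside TableInputFormat (i.e. you are not calling TableMapReduceUtil yourself), TableInputFormat also reads the hbase.mapreduce.scan.cachedrows property (exposed as the TableInputFormat.SCAN_CACHEDROWS constant) when it builds its Scan, so caching can be set through the job configuration alone. A minimal sketch, assuming the org.apache.hadoop.hbase.mapreduce.TableInputFormat of that era and a hypothetical table name:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;

public class ExternalCachingConfig {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();

        // Table the job reads from (hypothetical name).
        conf.set(TableInputFormat.INPUT_TABLE, "my_table");

        // TableInputFormat.setConf() picks this up and calls
        // Scan.setCaching() on the Scan it creates internally.
        conf.set(TableInputFormat.SCAN_CACHEDROWS, "500");

        // ...then create the Job from this Configuration and set
        // job.setInputFormatClass(TableInputFormat.class) as usual.
    }
}
```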
