hbase-user mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: startRow and endRow doesn't work when use HBase mapreduce
Date Wed, 23 Dec 2009 11:30:02 GMT
Hi Sandy,

If this is indeed wrong, then the tests miss it too. I will look into it,
and since the fix is trivial we can have it committed into 0.20.3 as
well as 0.21.

Thank you for testing and reporting!

Lars

On Dec 23, 2009, at 11:57, <Sandy_Yin@trendmicro.com.cn> wrote:

> Hi,
>
>    I tried with HBase 0.20.2. The startRow works, but the endRow
> doesn't: the job always scans to the end of the table.
>
>    I reviewed the HBase source code and found a bug.
>
> \src\java\org\apache\hadoop\hbase\mapreduce\TableInputFormatBase.java, line 301
>
> 298      byte[] splitStart = startRow.length == 0 ||
> 299        Bytes.compareTo(keys.getFirst()[i], startRow) >= 0 ?
> 300          keys.getFirst()[i] : startRow;
> 301      byte[] splitStop = stopRow.length == 0 ||
> 302        Bytes.compareTo(keys.getSecond()[i], stopRow) <= 0 ?
> 303          keys.getSecond()[i] : stopRow;
> 304      InputSplit split = new TableSplit(table.getTableName(),
> 305        splitStart, splitStop, regionLocation);
>
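> When the region's end key is empty (as it is for the last region of a
> table), it compares as less than any non-empty stopRow, so splitStop is
> always set to the empty end key and the endRow is ignored.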
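
The behaviour described above can be reproduced without a cluster. The following is only a minimal sketch: `SplitStopSketch`, `compare`, `splitStopAsQuoted`, and `splitStopGuarded` are hypothetical names, `compare` is a stand-in for `Bytes.compareTo` (assumed to be an unsigned lexicographic comparison), and the guarded variant is one possible correction, not necessarily the fix that gets committed.

```java
import java.nio.charset.StandardCharsets;

class SplitStopSketch {

    // Stand-in for Bytes.compareTo: unsigned lexicographic byte comparison
    // (an assumption made so the sketch runs without HBase on the classpath).
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Same condition as the quoted lines 301-303: an empty region end key
    // compares as <= any stopRow, so the empty key is chosen.
    static byte[] splitStopAsQuoted(byte[] regionEnd, byte[] stopRow) {
        return stopRow.length == 0 || compare(regionEnd, stopRow) <= 0
            ? regionEnd : stopRow;
    }

    // Hypothetical correction: only take the region end key when it is
    // non-empty, i.e. when the region is not the last one in the table.
    static byte[] splitStopGuarded(byte[] regionEnd, byte[] stopRow) {
        return (stopRow.length == 0 || compare(regionEnd, stopRow) <= 0)
            && regionEnd.length > 0
            ? regionEnd : stopRow;
    }

    public static void main(String[] args) {
        byte[] lastRegionEnd = new byte[0];  // last region: empty end key
        byte[] stopRow = "row-500".getBytes(StandardCharsets.UTF_8);

        byte[] quoted = splitStopAsQuoted(lastRegionEnd, stopRow);
        byte[] guarded = splitStopGuarded(lastRegionEnd, stopRow);

        System.out.println("as quoted, stop key length: " + quoted.length);
        System.out.println("guarded stop key: "
            + new String(guarded, StandardCharsets.UTF_8));
    }
}
```

With a non-empty region end key the two variants agree; they differ only for the last region, which is where the scan runs past the requested endRow.
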
>
>    Will this bug be fixed in HBase 0.21?
>
> Regards,
> Sandy
>
> -----Original Message-----
> From: Sandy Yin (RD-CN)
> Sent: December 23, 2009 16:15
> To: 'hbase-user@hadoop.apache.org'
> Subject: RE: startRow and endRow doesn't work when use HBase mapreduce
>
> Hi Lars,
>
>    Many thanks for the information :-).
>    I tested with HBase 0.20.1 and it doesn't work. I will upgrade to
> 0.20.2 and test again.
>
> Regards,
> Sandy
>
>
> -----Original Message-----
> From: Lars George [mailto:lars.george@gmail.com]
> Sent: December 23, 2009 15:59
> To: hbase-user@hadoop.apache.org
> Subject: Re: startRow and endRow doesn't work when use HBase mapreduce
>
> Hi Sandy,
>
> Have a look here: http://issues.apache.org/jira/browse/HBASE-1829
>
> I added tests to check that it all works as advertised, and it indeed
> does, but only in the forthcoming versions, I am afraid. With the
> released versions I would have assumed that the scan at least returns
> the right rows, even though it still reads the whole table and simply
> skips the rows outside the given range. Do you see it not working at all?
>
> Lars
>
> On Wed, Dec 23, 2009 at 8:46 AM,  <Sandy_Yin@trendmicro.com.cn> wrote:
>> Hi,
>>
>>
>>
>> The startRow and endRow of a Scan don't work when using HBase
>> MapReduce. The job always scans the entire table.
>>
>> Is there a reason for this, or am I misusing the API?
>>
>>
>>
>> Example code:
>>
>> Scan scan = new Scan();
>> scan.addFamily(...);
>> scan.setStartRow(startkey);
>> scan.setStopRow(endkey);
>> TableMapReduceUtil.initTableMapperJob(tableName, scan, mapperClass,
>>     ImmutableBytesWritable.class, Put.class, job);
>>
>>
>>
>> Thanks.
>>
>>
>> TREND MICRO EMAIL NOTICE
>> The information contained in this email and any attachments is  
>> confidential and may be subject to copyright or other intellectual  
>> property protection. If you are not the intended recipient, you are  
>> not authorized to use or disclose this information, and we request  
>> that you notify us by reply mail or telephone and delete the  
>> original message from your mail system.
>>
>
