hbase-user mailing list archives

From <Sandy_...@trendmicro.com.cn>
Subject RE: startRow and endRow doesn't work when use HBase mapreduce
Date Wed, 23 Dec 2009 10:57:39 GMT
Hi,

	I tried with HBase 0.20.2. The startRow works, but the endRow does not: the scan always
runs to the end of the table.

	I reviewed the HBase source code and found a bug.

\src\java\org\apache\hadoop\hbase\mapreduce\TableInputFormatBase.java, line 301

 298      byte[] splitStart = startRow.length == 0 ||
 299        Bytes.compareTo(keys.getFirst()[i], startRow) >= 0 ?
 300          keys.getFirst()[i] : startRow;
 301      byte[] splitStop = stopRow.length == 0 ||
 302        Bytes.compareTo(keys.getSecond()[i], stopRow) <= 0 ?
 303          keys.getSecond()[i] : stopRow;
 304      InputSplit split = new TableSplit(table.getTableName(),
 305        splitStart, splitStop, regionLocation);

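When the region end key is empty (as it is for the last region of the table), the comparison
Bytes.compareTo(keys.getSecond()[i], stopRow) <= 0 is always true, so splitStop is always set
to the empty end key and the endRow is ignored.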
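
The following is a minimal, self-contained sketch of the expression at line 301 and of one possible guard. It is not the committed fix. The class name SplitStopSketch and the helper compare() are hypothetical; compare() stands in for Bytes.compareTo (assumed here to be an unsigned lexicographic comparison) so that the example runs without HBase on the classpath.

```java
import java.nio.charset.StandardCharsets;

// Sketch only: reproduces the splitStop selection without any HBase dependency.
class SplitStopSketch {

    // Hypothetical stand-in for Bytes.compareTo: unsigned lexicographic order,
    // with a shorter array sorting before a longer one that shares its prefix.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Same logic as the quoted lines 301-303: an empty region end key always
    // compares <= stopRow, so the empty key is chosen and the scan has no stop row.
    static byte[] splitStopAsQuoted(byte[] regionEnd, byte[] stopRow) {
        return stopRow.length == 0 || compare(regionEnd, stopRow) <= 0
            ? regionEnd : stopRow;
    }

    // Possible guard (an assumption): use the region end key only when it is non-empty.
    static byte[] splitStopGuarded(byte[] regionEnd, byte[] stopRow) {
        boolean regionEndsFirst =
            stopRow.length == 0 || compare(regionEnd, stopRow) <= 0;
        return regionEndsFirst && regionEnd.length > 0 ? regionEnd : stopRow;
    }

    public static void main(String[] args) {
        byte[] lastRegionEnd = new byte[0]; // end key of the table's last region
        byte[] stopRow = "row500".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitStopAsQuoted(lastRegionEnd, stopRow).length); // prints 0
        System.out.println(splitStopGuarded(lastRegionEnd, stopRow).length);  // prints 6
    }
}
```

The guard changes only which stop key is chosen for a split. It does not skip regions that lie entirely outside the requested range, which a complete fix would probably also need to handle.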

	Will HBase 0.21 fix this bug?

Regards,
Sandy

-----Original Message-----
From: Sandy Yin (RD-CN) 
Sent: December 23, 2009 16:15
To: 'hbase-user@hadoop.apache.org'
Subject: RE: startRow and endRow doesn't work when use HBase mapreduce

Hi Lars,

	Many thanks for the information :-).
	I tested with HBase 0.20.1 and it doesn't work. I will upgrade to 0.20.2 and test
again.

Regards,
Sandy
 

-----Original Message-----
From: Lars George [mailto:lars.george@gmail.com] 
Sent: December 23, 2009 15:59
To: hbase-user@hadoop.apache.org
Subject: Re: startRow and endRow doesn't work when use HBase mapreduce

Hi Sandy,

Have a look here: http://issues.apache.org/jira/browse/HBASE-1829

I added tests to check that it all works as advertised, and it indeed
does, but I am afraid only in the forthcoming versions. With the
released versions I would have assumed that the scan at least works,
although it still scans the whole table and simply skips the rows
outside the given range. Do you see it not working at all?

Lars

On Wed, Dec 23, 2009 at 8:46 AM,  <Sandy_Yin@trendmicro.com.cn> wrote:
> Hi,
>
>
>
> The startRow and endRow of a Scan don't work when using HBase MapReduce. The job always
> scans the entire table.
>
> Is there a reason for this, or am I misusing it?
>
>
>
> Example code:
>
> Scan scan = new Scan();
> scan.addFamily(...);
> scan.setStartRow(startkey);
> scan.setStopRow(endkey);
> TableMapReduceUtil.initTableMapperJob(tableName, scan, mapperClass,
>     ImmutableBytesWritable.class, Put.class, job);
>
>
>
> Thanks.
>
>
> TREND MICRO EMAIL NOTICE
> The information contained in this email and any attachments is confidential and may be
> subject to copyright or other intellectual property protection. If you are not the intended
> recipient, you are not authorized to use or disclose this information, and we request that
> you notify us by reply mail or telephone and delete the original message from your mail system.
>
