hbase-user mailing list archives

From Jonathan Gray <jl...@streamy.com>
Subject Re: Difference in Scan class behavior in MapReduce
Date Fri, 23 Oct 2009 15:49:52 GMT

1.  This is a known issue and is being addressed in HBASE-1829 
(https://issues.apache.org/jira/browse/HBASE-1829).  It is currently 
targeted at 0.21, but feel free to review the current patch and add 
your comments.  If we get a working and tested patch soon, I would 
definitely like to include it in 0.20.2.
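
Until a fix is available, one possible stopgap is to discard out-of-range rows inside the mapper itself.  The sketch below is only an illustration under stated assumptions, not part of the HBASE-1829 patch: the class and method names (RowRangeGuard, compareRows, inRange) are hypothetical, and it relies only on the fact that HBase orders row keys as unsigned bytes, with the start row inclusive, the stop row exclusive, and an empty array meaning "no bound".

```java
/**
 * Hypothetical helper (not an HBase class): checks on the client side
 * whether a row key falls inside the half-open range [startRow, stopRow).
 */
final class RowRangeGuard {

    private RowRangeGuard() {}

    /** Unsigned lexicographic comparison, the ordering HBase uses for row keys. */
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;   // treat each byte as unsigned
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter key sorts first.
        return a.length - b.length;
    }

    /**
     * Returns true if row is in [startRow, stopRow).
     * An empty startRow or stopRow is treated as unbounded on that side.
     */
    static boolean inRange(byte[] row, byte[] startRow, byte[] stopRow) {
        boolean afterStart = startRow.length == 0 || compareRows(row, startRow) >= 0;
        boolean beforeStop = stopRow.length == 0 || compareRows(row, stopRow) < 0;
        return afterStart && beforeStop;
    }
}
```

A map() method could call inRange on the bytes of its row key and return early when the result is false.  Note that this only filters the output: the job still reads the whole table, so it does not recover the performance of a real range scan.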

2.  I couldn't track down the issue where it was fixed, but by the 
looks of the code this appears to have been fixed in 0.20.1.  Please 
upgrade to 0.20.1, or do an svn checkout of the 0.20 branch.


Doug Meil wrote:
> I apologize if this has been brought up before, but the Scan class acts differently in
> regular client queries than in MapReduce jobs configured by TableMapReduceUtil.  I'm using
> the 0.20.0 release in standalone mode at the moment for a proof of concept.
> 1.  Startrow/Stoprow
>     Scan scan = new Scan( startRow, stopRow );
> The "startrow" and "stoprow" arguments don't seem to be honored in a MapReduce job, and
> it turns into a full table scan.
> 2.  Column selection
> If you use this instance of Scan...
>     Scan scan = new Scan( startRow, stopRow );
> ... in regular client activity this instance will allow selection of attributes in the
> Result.  However, this same instance used in a MapReduce job will produce the following exception:
> Exception in thread "main" java.io.IOException: Expecting at least one column.
>       at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:281)
> The remedy is to call either "addColumn" or "addFamily" on the Scan instance as appropriate,
> but it's a little odd that things work in one use case and throw an exception in another.
> Doug Meil
> Director of Engineering
> doug.meil@explorys.net
