hbase-dev mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: HRegion.RegionScanner.nextInternal()
Date Fri, 26 Nov 2010 00:52:12 GMT
Mkay, I will look into the latter some more. But the limit is still confusing to
me, as limit == batch, and on the client side that is the number of rows, not the number
of columns. Does that mean that if I had 100 columns and set batch to 10, it would only return
10 rows with 10 columns each, rather than what I would have expected, i.e. 10 rows with all
columns? Does this implicitly mean batch is also the intra-row batch size?

Lars

On Nov 25, 2010, at 21:53, Ryan Rawson <ryanobjc@gmail.com> wrote:

> limit is for retrieving partial results of a row, i.e. give me a row
> in chunks.  Filters that want to operate on the entire row cannot be
> used with this mode.  I forget why it's in the loop but there was a
> good reason at the time.
> 
> -ryan
> 
> On Thu, Nov 25, 2010 at 10:51 AM, Lars George <lars.george@gmail.com> wrote:
>> Does hbase-dev still get forwarded? Did you see the below message?
>> 
>> ---------- Forwarded message ----------
>> From: Lars George <lars.george@gmail.com>
>> Date: Tue, Nov 23, 2010 at 4:25 PM
>> Subject: HRegion.RegionScanner.nextInternal()
>> To: hbase-dev@hadoop.apache.org
>> 
>> Hi,
>> 
>> I am officially confused:
>> 
>>          byte [] nextRow;
>>          do {
>>            this.storeHeap.next(results, limit - results.size());
>>            if (limit > 0 && results.size() == limit) {
>>              if (this.filter != null && filter.hasFilterRow()) throw new IncompatibleFilterException(
>>                  "Filter with filterRow(List<KeyValue>) incompatible with scan with limit!");
>>              return true; // we are expecting more yes, but also limited to how many we can return.
>>            }
>>          } while (Bytes.equals(currentRow, nextRow = peekRow()));
>> 
>> This is from the nextInternal() call. Questions:
>> 
>> a) Why is that check for the filter and limit both being set inside the loop?
>> 
>> b) If "limit" is the batch size (which for a Get is "-1", not "1" as I
>> would have thought), then what does that "limit - results.size()"
>> achieve?
>> 
>> I mean, this loop gets all columns for a given row, so batch/limit
>> should not be handled here, right? What if limit were set to "1" by
>> the client? Then even if the Get had 3 columns to retrieve, it would
>> not be able to, since this limit makes it bail out. So there would be
>> multiple calls to nextInternal() to complete what could be done in one
>> loop?
>> 
>> Eh?
>> 
>> Lars
>> 
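
To make the behaviour discussed in this thread concrete, here is a minimal sketch of how a per-call limit splits one row into chunks, as Ryan describes. It is not the HBase implementation: the class BatchSketch, the helpers rowCells() and scanRow(), and the use of plain strings in place of KeyValue objects are all assumptions made for illustration. It also simplifies the quoted loop by reporting "more" only while the same row still has cells left.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class BatchSketch {
    /** Hypothetical stand-in for the cells of one row waiting in the store heap. */
    static Deque<String> rowCells(int columns) {
        Deque<String> cells = new ArrayDeque<>();
        for (int i = 0; i < columns; i++) {
            cells.add("col" + i);
        }
        return cells;
    }

    /**
     * Simplified imitation of the quoted loop for a single row: move cells into
     * results until the row is exhausted or the limit is reached. A limit <= 0
     * means "no limit". Returns true if the row still has cells left.
     */
    static boolean nextInternal(Deque<String> cells, List<String> results, int limit) {
        while (!cells.isEmpty()) {
            results.add(cells.poll());
            if (limit > 0 && results.size() == limit) {
                // Limit reached: hand back a partial row and report whether more remains.
                return !cells.isEmpty();
            }
        }
        return false;
    }

    /** Calls nextInternal() repeatedly and collects what each call returned. */
    static List<List<String>> scanRow(int columns, int limit) {
        Deque<String> cells = rowCells(columns);
        List<List<String>> calls = new ArrayList<>();
        boolean more;
        do {
            List<String> results = new ArrayList<>();
            more = nextInternal(cells, results, limit);
            calls.add(results);
        } while (more);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(scanRow(100, 10).size()); // 100 columns, limit 10: prints 10
        System.out.println(scanRow(3, 1).size());    // 3 columns, limit 1: prints 3
        System.out.println(scanRow(3, -1).size());   // 3 columns, no limit: prints 1
    }
}
```

Under these assumptions, a row with 100 columns and a limit of 10 comes back as 10 partial results of 10 cells each, and a limit of 1 on a 3-column row takes three calls. A limit of -1 returns the whole row in one call, since the limit check never fires.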
