hbase-dev mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: HRegion.RegionScanner.nextInternal()
Date Fri, 26 Nov 2010 02:08:17 GMT
No, batch size when limit is set is 1. You get partial results for a row,
then get more from the same row. Then the next row.
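
To make the chunking concrete, here is a minimal sketch in plain Java. It is not HBase code: `chunkSizes` is a hypothetical helper that takes the number of columns in each row plus the batch limit and reports how many cells each returned result would carry. A limit of zero or less is treated as "no limit", mirroring the `limit > 0` check in the quoted loop. The key assumption, taken from the description above, is that a chunk never crosses a row boundary.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Hypothetical helper, not part of HBase: returns the size of each
    // result chunk, in the order a scanner would hand them back.
    // limit <= 0 means "no limit", so every row comes back whole.
    static List<Integer> chunkSizes(int[] columnsPerRow, int limit) {
        List<Integer> chunks = new ArrayList<>();
        for (int columns : columnsPerRow) {
            int remaining = columns;
            // Keep cutting chunks from the same row until it is exhausted;
            // only then move on to the next row.
            while (remaining > 0) {
                int take = (limit > 0) ? Math.min(limit, remaining) : remaining;
                chunks.add(take);
                remaining -= take;
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(new int[] {3, 2}, 2));  // [2, 1, 2]
        System.out.println(chunkSizes(new int[] {3}, 1));     // [1, 1, 1]
        System.out.println(chunkSizes(new int[] {3, 2}, -1)); // [3, 2]
    }
}
```

Under these assumptions, a row with 100 columns scanned with batch set to 10 would come back as ten results of 10 cells each, rather than as one result holding the whole row.
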
On Nov 25, 2010 4:54 PM, "Lars George" <lars.george@gmail.com> wrote:
> Mkay, I will look into it more for the latter. But for the limit this is
> still confusing to me, as limit == batch, and on the client side that is the
> number of rows, not the number of columns. Does that mean if I had 100
> columns and set batch to 10 that it would only return 10 rows with 10
> columns, and not what I would have expected, i.e. 10 rows with all columns?
> Does this implicitly mean batch is also the intra-row batch size?
>
> Lars
>
> On Nov 25, 2010, at 21:53, Ryan Rawson <ryanobjc@gmail.com> wrote:
>
>> limit is for retrieving partial results of a row, i.e. give me a row
>> in chunks. Filters that want to operate on the entire row cannot be
>> used with this mode. I forget why it's in the loop but there was a
>> good reason at the time.
>>
>> -ryan
>>
>> On Thu, Nov 25, 2010 at 10:51 AM, Lars George <lars.george@gmail.com> wrote:
>>> Does hbase-dev still get forwarded? Did you see the below message?
>>>
>>> ---------- Forwarded message ----------
>>> From: Lars George <lars.george@gmail.com>
>>> Date: Tue, Nov 23, 2010 at 4:25 PM
>>> Subject: HRegion.RegionScanner.nextInternal()
>>> To: hbase-dev@hadoop.apache.org
>>>
>>> Hi,
>>>
>>> I am officially confused:
>>>
>>> byte [] nextRow;
>>> do {
>>>   this.storeHeap.next(results, limit - results.size());
>>>   if (limit > 0 && results.size() == limit) {
>>>     if (this.filter != null && filter.hasFilterRow()) throw
>>>         new IncompatibleFilterException(
>>>           "Filter with filterRow(List<KeyValue>) incompatible with scan with limit!");
>>>     return true; // we are expecting more yes, but also limited to how many we can return.
>>>   }
>>> } while (Bytes.equals(currentRow, nextRow = peekRow()));
>>>
>>> This is from the nextInternal() call. Questions:
>>>
>>> a) Why is that check for the filter and limit both being set inside the loop?
>>>
>>> b) if "limit" is the batch size (which for a Get is "-1", not "1" as I
>>> would have thought) then what does that "limit - results.size()"
>>> achieve?
>>>
>>> I mean, this loop gets all columns for a given row, so batch/limit
>>> should not be handled here, right? What if limit were set to "1" by
>>> the client? Then even if the Get had 3 columns to retrieve it would
>>> not be able to since this limit makes it bail out. So there would be
>>> multiple calls to nextInternal() to complete what could be done in one
>>> loop?
>>>
>>> Eh?
>>>
>>> Lars
>>>
