accumulo-user mailing list archives

From Peter Rainer <rainer.pe...@gmail.com>
Subject Re: BatchScanner sort question
Date Fri, 25 Oct 2013 19:20:49 GMT
Thanks John, that helps me a lot.


On Fri, Oct 25, 2013 at 7:03 PM, John Vines <vines@apache.org> wrote:

> The batch scanner works by fetching batches from all tablets in the scan.
> Consecutive batches will therefore typically arrive out of order relative
> to one another. Because batches are cut purely on individual key-value
> pairs, a batch can end mid-row, and the next key you see may belong to a
> completely different row, possibly also mid-row. If you want to guarantee
> entire rows, the WholeRowIterator can be used.
>
> tl;dr: Option 2 is accurate, but you can force Option 1 to occur
>
>
> On Fri, Oct 25, 2013 at 12:59 PM, Peter Rainer <rainer.peter@gmail.com> wrote:
>
>> Hi,
>>
>> in the BatchScanner JavaDoc it says "Also only use this *when you do not
>> care about the returned data being in sorted order*. If you want to
>> lookup a few ranges and expect those ranges to contain a lot of data, then
>> use the Scanner instead. Also, the Scanner will return data in sorted
>> order, this will not."
>>
>> I'm not 100% sure how to interpret this, so I was wondering if any of
>> you could help me clarify it:
>>
>> *Option 1)*
>> Rows are not sorted, but all Key/Value Pairs with the same Row Key are in
>> sequence
>>
>> Example:
>> Format: Key:CF:CQ:Value
>> A:CF1:CQ1:1
>> A:CF2:CQ2:2
>> C:CF1:CQ1:1
>> B:CF1:CQ1:1
>>
>> *Option 2)*
>> Rows are not sorted and not even Key/Value Pairs with the same Row Key
>> are in sequence
>>
>> Example:
>> Format: Key:CF:CQ:Value
>> A:CF1:CQ1:1
>> C:CF1:CQ1:1
>> A:CF2:CQ2:2
>> B:CF1:CQ1:1
>>
>>
>> Thanks,
>> Peter
>>
>>
>
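
For anyone finding this thread later, here is a rough toy model of the behaviour John describes. It is plain Java and does not touch the Accumulo API: the class `BatchOrderSketch`, the methods `interleave` and `packRows`, the round-robin batch order and the two-tablet layout are all made up for illustration, and it assumes each row lives entirely in one tablet. It only shows why key-value batches can split a row (Option 2) and why packaging a whole row into a single entry, which is roughly the idea behind the WholeRowIterator, keeps a row's columns together (Option 1).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchOrderSketch {

    // Toy stand-in for a batch scan: visit the tablets round-robin and take
    // up to batchSize entries from each per turn. The real arrival order of
    // batches is not specified; round-robin is just one possible order.
    static List<String> interleave(List<List<String>> tablets, int batchSize) {
        List<String> out = new ArrayList<>();
        int[] pos = new int[tablets.size()];
        boolean more = true;
        while (more) {
            more = false;
            for (int t = 0; t < tablets.size(); t++) {
                List<String> tablet = tablets.get(t);
                int end = Math.min(pos[t] + batchSize, tablet.size());
                out.addAll(tablet.subList(pos[t], end));
                pos[t] = end;
                if (end < tablet.size()) {
                    more = true;
                }
            }
        }
        return out;
    }

    // Collapse consecutive entries that share a row into one combined entry,
    // so that a batch boundary can no longer fall in the middle of a row.
    static List<String> packRows(List<String> tablet) {
        List<String> rows = new ArrayList<>();
        String currentRow = null;
        StringBuilder packed = null;
        for (String entry : tablet) {
            String row = entry.substring(0, entry.indexOf(':'));
            if (!row.equals(currentRow)) {
                if (packed != null) {
                    rows.add(packed.toString());
                }
                packed = new StringBuilder(entry);
                currentRow = row;
            } else {
                packed.append(" | ").append(entry);
            }
        }
        if (packed != null) {
            rows.add(packed.toString());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical layout: row A in one tablet, rows B and C in another.
        List<String> tablet1 = Arrays.asList("A:CF1:CQ1:1", "A:CF2:CQ2:2");
        List<String> tablet2 = Arrays.asList("B:CF1:CQ1:1", "C:CF1:CQ1:1");

        // Batches of one key-value pair: row A gets split (Option 2).
        System.out.println(interleave(Arrays.asList(tablet1, tablet2), 1));

        // Whole rows packed first: row A stays together (Option 1).
        System.out.println(interleave(
                Arrays.asList(packRows(tablet1), packRows(tablet2)), 1));
    }
}
```

With this layout the first line printed is `[A:CF1:CQ1:1, B:CF1:CQ1:1, A:CF2:CQ2:2, C:CF1:CQ1:1]`, where row A is interrupted by row B, and the second is `[A:CF1:CQ1:1 | A:CF2:CQ2:2, B:CF1:CQ1:1, C:CF1:CQ1:1]`. Note that even in the second case the rows themselves are still not guaranteed to come back in sorted order; only the entries within a row stay together.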
