accumulo-user mailing list archives

From Jamie Stephens ...@morphism.com>
Subject Re: Scanner.estimatedCount()?
Date Fri, 27 Jun 2014 15:05:39 GMT
Eric,

Thanks.  Yeah, it's pretty easy to sample during ingest.  That's probably
what I'll do.  In the past, I've also done the traditional batch statistics
generation.  That would be easy here with MapReduce plus a combiner.
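
A minimal sketch of the sample-during-ingest idea, assuming row keys are
strings and that keeping every Nth key is an acceptable sampling scheme.
The class name IngestSampler and its methods are hypothetical and are not
part of Accumulo; in practice the sampled counts would presumably be
written to a separate table rather than held in memory.

```java
import java.util.TreeMap;

/** Hypothetical helper (not an Accumulo API): samples row keys seen during ingest. */
public class IngestSampler {
    private final int sampleEvery;  // keep 1 of every N observed keys
    private long seen = 0;          // total keys observed so far
    // Sampled row -> number of times that row was sampled, kept in sorted order.
    private final TreeMap<String, Long> samples = new TreeMap<>();

    public IngestSampler(int sampleEvery) {
        if (sampleEvery < 1) {
            throw new IllegalArgumentException("sampleEvery must be >= 1");
        }
        this.sampleEvery = sampleEvery;
    }

    /** Call once for each key written during ingest. */
    public void observe(String row) {
        // Systematic sampling: record the 1st, (N+1)th, (2N+1)th, ... key.
        if (seen++ % sampleEvery == 0) {
            samples.merge(row, 1L, Long::sum);
        }
    }

    /** Rough estimate of the number of keys with start <= row < end. */
    public long estimateCount(String start, String end) {
        long sampled = 0;
        for (long c : samples.subMap(start, true, end, false).values()) {
            sampled += c;
        }
        // Scale the sampled count back up by the sampling interval.
        return sampled * sampleEvery;
    }
}
```

The estimate is only as good as the sample: with an interval of N it can be
off by roughly N keys at each end of the range, and it says nothing about
keys deleted after ingest.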

--Jamie



On Fri, Jun 27, 2014 at 9:40 AM, Eric Newton <eric.newton@gmail.com> wrote:

> Short answer: no.
>
> Long answer:
>
> You can scan the metadata table for the count/size of the files.
>
> You can query tablet servers for the basic stats of every tablet for a
> given table.  This is used for balancing.
>
> But really you should collect the statistics you want during ingest and
> insert them in another table.
>
> -Eric
>
>
> On Fri, Jun 27, 2014 at 9:42 AM, Jamie Stephens <js@morphism.com> wrote:
>
>> Is there a way to get a quick estimate of the number of keys in a given
>> range?
>>
>> Perhaps more generally, is there a way to get an estimate of the amount
>> of work (and even some sort of confidence based on, say, the age of
>> something) needed to iterate over a range?
>>
>> I'd like to do some query planning, so statistics like these sure would
>> be nice.
>>
>> --Jamie
>>
>>
>
