accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: total table rows
Date Mon, 09 Nov 2015 15:25:13 GMT
Yeah, there's no explicit tracking of all rows in Accumulo; you're stuck
with enumerating them (or explicitly tracking them yourself at ingest time).

The easiest approach you can take is probably using the 
FirstEntryInRowIterator and counting each row on the client-side.
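
To make the counting idea concrete, here is a minimal sketch that models it in plain Java, without a cluster. It assumes entries arrive sorted by row, the way a scan returns them. The class RowCountSketch, its method names, and the String[]{row, column} stand-in for a key are all hypothetical and are not Accumulo API. On a real table, the filtering step would be done by attaching FirstEntryInRowIterator to the scanner, so only one entry per row is sent to the client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowCountSketch {

    // Models the server-side filter: keep only the first entry of each row.
    // Assumes the input is sorted by row; each entry is {row, column}.
    public static List<String[]> firstEntryPerRow(List<String[]> sortedEntries) {
        List<String[]> firsts = new ArrayList<>();
        String lastRow = null;
        for (String[] entry : sortedEntries) {
            if (!entry[0].equals(lastRow)) {
                firsts.add(entry);
                lastRow = entry[0];
            }
        }
        return firsts;
    }

    // Models the client: count whatever the filter lets through.
    public static long countRows(List<String[]> sortedEntries) {
        long count = 0;
        for (String[] ignored : firstEntryPerRow(sortedEntries)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String[]> table = Arrays.asList(
            new String[] {"r1", "a"},
            new String[] {"r1", "b"},
            new String[] {"r2", "a"},
            new String[] {"r3", "a"},
            new String[] {"r3", "c"});
        System.out.println(countRows(table)); // prints 3
    }
}
```

The client still loops over one entry per row, so the work scales with the number of rows, but not with the number of columns in each row.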

You could do another summation in a second iterator, but this is a little
tricky to get correct. I tried to touch on this a little in a blog
post[1]. If this is a one-off question you want to answer, doing the
summation on the client side is unlikely to take much longer than a
server-side summation.
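
As a rough illustration of why the client still has some summing to do in the server-side variant: iterators run per tablet, so a counting iterator can at best hand back a partial count for each tablet it runs on. The sketch below models that in plain Java. PartialCountSketch and its methods are hypothetical names, not Accumulo API, and the sketch assumes each row lives entirely within one tablet and that each tablet's entries are sorted by row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartialCountSketch {

    // Models a counting iterator on one tablet: the input holds the row ID of
    // every entry in the tablet, sorted, and the result is the distinct-row count.
    public static long countRowsInTablet(List<String> sortedRowIds) {
        long count = 0;
        String lastRow = null;
        for (String row : sortedRowIds) {
            if (!row.equals(lastRow)) {
                count++;
                lastRow = row;
            }
        }
        return count;
    }

    // Models the client: it receives one partial count per tablet and adds them up.
    public static long countRows(List<List<String>> tablets) {
        List<Long> partials = new ArrayList<>();
        for (List<String> tablet : tablets) {
            partials.add(countRowsInTablet(tablet));
        }
        long total = 0;
        for (long partial : partials) {
            total += partial;
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> tablets = Arrays.asList(
            Arrays.asList("r1", "r1", "r2"),
            Arrays.asList("r3"),
            Arrays.asList("r4", "r4", "r4"));
        System.out.println(countRows(tablets)); // prints 4
    }
}
```

Either way the client ends up adding numbers; the server-side version just sends far fewer of them.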

[1] https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo

z11373 wrote:
> I want to get the total number of rows in a table (likely more than 100M rows). I think
> that to get that information, Accumulo would have to iterate over all rows :-( This
> may not be a typical Accumulo scenario.
>
> Is there a more efficient way to get the total number of rows in a table?
> When Accumulo is iterating over those items, does that mean it will pull the data to
> the client? If so, is there a way to ask it to return just the number,
> since that's the only data I care about?
>
> Thanks,
> Z
>
>
>
> --
> View this message in context: http://apache-accumulo.1065345.n5.nabble.com/total-table-rows-tp15484.html
> Sent from the Developers mailing list archive at Nabble.com.
