hive-user mailing list archives

From "" <>
Subject Table statistics
Date Tue, 15 Dec 2015 09:39:23 GMT

I was wondering if there is any "recognized" way to obtain table statistics.
Ideally, given a key range, I would like to know the number of distinct row IDs, the number
of entries, and the amount of data (in bytes) in that key range.
I assume that Accumulo holds at least some of this information internally, partly because
I can see some of it through the monitor, and partly because it must know something about
the quantity of data held in order to implement the table threshold.

In my case the tables are very static, so the "estimates" that the monitor has are likely
to be sufficiently accurate for my purposes.

I have found this link
which describes a process (which I haven't tried yet) to get the number of entries in a range.
That would probably be sufficient for me and would certainly be a good start.
However, it seems to use internal data structures and non-published APIs, which is less
than ideal, and it seems to be written against Accumulo version 1.6.

I'm using Accumulo 1.7. Is there anything better that I can do, or is this the recommended
way to go?

