accumulo-notifications mailing list archives

From "Dylan Hutchison (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3206) New Public API: Approximate data counts of Tablets and Tablet Servers
Date Sun, 05 Oct 2014 16:34:33 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159585#comment-14159585 ]

Dylan Hutchison commented on ACCUMULO-3206:
-------------------------------------------

Agreed.  That's a fast timeline for 1.7; it would be awesome if we could pull it off by the
end of the year.  In that case our graph library can target 1.7.

To summarize, the API function should return a list with one item per tablet, each containing:
- split point range (= tablet range)
- number of entries
- number of bytes
- tablet server name/IP

I have hacked-together versions of everything except the number of bytes for Accumulo 1.5,
and untested versions of the same for Accumulo 1.6.
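
As a rough illustration of the result shape summarized above, here is a minimal Java sketch.
Every name in it (TabletCounts, TabletInfo, entriesPerServer) is hypothetical and is not part
of any Accumulo API; it also assumes a tablet range is described by its previous split point
(exclusive) and its own split point (inclusive), with null for an open end.  It only shows how
a client could total entries per tablet server to check load balancing once it has the list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch only; none of these names exist in Accumulo's public API. */
public class TabletCounts {

    /** One item of the proposed result: a tablet's range, approximate size, and host. */
    public static final class TabletInfo {
        public final String prevEndRow;   // assumed: exclusive lower bound, null = start of table
        public final String endRow;       // assumed: inclusive upper bound, null = end of table
        public final long numEntries;     // approximate number of entries in the tablet
        public final long numBytes;       // approximate number of bytes in the tablet
        public final String tabletServer; // tablet server name or IP

        public TabletInfo(String prevEndRow, String endRow,
                          long numEntries, long numBytes, String tabletServer) {
            this.prevEndRow = prevEndRow;
            this.endRow = endRow;
            this.numEntries = numEntries;
            this.numBytes = numBytes;
            this.tabletServer = tabletServer;
        }
    }

    /** Sums the approximate entry counts of all tablets hosted by each tablet server. */
    public static Map<String, Long> entriesPerServer(List<TabletInfo> tablets) {
        Map<String, Long> totals = new TreeMap<>();
        for (TabletInfo t : tablets) {
            totals.merge(t.tabletServer, t.numEntries, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for what the proposed API call would return.
        List<TabletInfo> tablets = Arrays.asList(
            new TabletInfo(null, "g", 100L, 4096L, "server-a"),
            new TabletInfo("g", "p", 200L, 8192L, "server-a"),
            new TabletInfo("p", null, 50L, 1024L, "server-b"));
        System.out.println(entriesPerServer(tablets));
    }
}
```

A large gap between the per-server totals would suggest that the chosen split points are not
dividing the data evenly.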

> New Public API: Approximate data counts of Tablets and Tablet Servers
> ---------------------------------------------------------------------
>
>                 Key: ACCUMULO-3206
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3206
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client, monitor
>    Affects Versions: 1.6.1
>            Reporter: Dylan Hutchison
>            Priority: Minor
>
> The broader picture is public programmatic access to information in the Accumulo monitor.
> Specifically I'm looking to obtain the number of entries per tablet and per tablet server
> for a given table.  The use case is to verify that manually set (or automatically set I suppose)
> table splits are effectively dividing Accumulo data among many tablets, that is, verifying
> load balancing.
> I wrote Accumulo 1.5 code which uses non-public API to obtain this information in the
> same way the Monitor does via TabletStats. The tricky part was cross-referencing the Metadata
> table to find the assignment of tablets to tablet servers for a given table.  I rewrote that
> code for 1.6, switching the name of the Metadata table to "accumulo.metadata" and other associated
> changes, but it would be great to make this part of the public API so that people don't have
> to use non-public methods to obtain data that Accumulo has in the Monitor and Metadata table
> anyway.
> We could approach this by adding to the TableOperations class or something similar.
> A request could go to an Accumulo master which gathers the necessary information from the
> tablet servers just as the Monitor does, so that the client does not have to do it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
