accumulo-notifications mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2873) Create utility that generates single line tablet information
Date Mon, 09 Jun 2014 13:37:01 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025186#comment-14025186 ]

Sean Busbey commented on ACCUMULO-2873:
---------------------------------------

If we used Avro for this output, it would easily handle the binary/text issue. We could also
use the existing avro-tools utilities to get a textual representation or to project
a subset of the data.

It would also be very easy to work with the output programmatically.

It would also leverage Avro's extensive schema evolution support, so changes to the format
would be easy to make.
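
To make the suggestion concrete, here is a rough sketch of what an Avro record schema for one tablet might look like, written as a Python dict that serializes to the JSON form Avro schemas use. The record name, namespace, and field names are assumptions based on the example columns in the issue, not anything Accumulo defines. The end row is typed as `bytes` so binary rows need no escaping, and the helper adds a nullable field with a default, which is the kind of change Avro's schema resolution lets older readers and writers tolerate.

```python
import copy
import json

# Hypothetical schema: names are illustrative, taken loosely from the
# example columns in the issue description.
TABLET_INFO_SCHEMA = {
    "type": "record",
    "name": "TabletInfo",
    "namespace": "example.accumulo",
    "fields": [
        {"name": "files", "type": "int"},
        {"name": "walogs", "type": "int"},
        {"name": "entries", "type": "long"},
        {"name": "size", "type": "long"},
        {"name": "status", "type": "string"},
        {"name": "location", "type": ["null", "string"], "default": None},
        {"name": "tableId", "type": "string"},
        # bytes, not string: an end row may contain arbitrary binary data.
        {"name": "endRow", "type": ["null", "bytes"], "default": None},
    ],
}


def add_optional_field(schema, name, avro_type):
    """Return a copy of the schema with a new nullable field defaulting to null."""
    evolved = copy.deepcopy(schema)
    evolved["fields"].append(
        {"name": name, "type": ["null", avro_type], "default": None}
    )
    return evolved


if __name__ == "__main__":
    # Print the evolved schema as the JSON text an Avro library would consume.
    print(json.dumps(add_optional_field(TABLET_INFO_SCHEMA, "memEntries", "long"), indent=2))
```

The schema here is only data; actually writing and reading records would need an Avro library, which this sketch deliberately leaves out.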

> Create utility that generates single line tablet information
> ------------------------------------------------------------
>
>                 Key: ACCUMULO-2873
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2873
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>              Labels: newbie
>             Fix For: 1.7.0
>
>
> It would be very useful to have a utility that generates single-line tablet info. The
> output of this could be fed to sort, awk, grep, etc. in order to answer questions like which
> tablets have the most files.
> The output could look something like the following
> {noformat}
> $accumulo admin listTablets --table bigTable3
> #files #walogs #entries #size #status #location #tableid #endrow
> 6 2 40,001 50M ASSIGNED 10.1.9.9 4:9997[abc]  3 admin
> 3 1 50,002 40M ASSIGNED 10.1.9.9 5:9997[abc]  3 helpful
> {noformat}
> All of the information can be obtained by scanning the metadata table and looking into
> ZooKeeper. The utility could possibly contact tablet servers to get info about entries in memory.
> The order of the columns in the example above is arbitrary, except for end row. Maybe the
> end row column should come last because it can be of arbitrary length. Also, the end row could
> contain any character, so it may be worth looking into using a CSV library. It would be nice to
> design the utility so that columns can be added in future versions without impacting current
> scripts that use the utility.
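
As a rough illustration of the CSV idea in the description, the sketch below uses Python's standard `csv` module. The column names and sample values are made up for the example. It writes a header row, lets the CSV quoting rules deal with awkward end-row characters such as commas and quotes, and has the consuming script look columns up by name, so a column added in a later version does not break it.

```python
import csv
import io

# Hypothetical column names, loosely following the example header in the issue.
COLUMNS = ["files", "walogs", "entries", "size", "status", "location", "tableid", "endrow"]


def write_tablets(rows, columns=COLUMNS):
    """Serialize tablet rows (dicts) to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


def most_files(csv_text):
    """Return the row with the most files, reading columns by name, not position."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return max(reader, key=lambda r: int(r["files"]))


if __name__ == "__main__":
    rows = [
        {"files": 6, "walogs": 2, "entries": 40001, "size": "50M",
         "status": "ASSIGNED", "location": "host1:9997", "tableid": "3",
         "endrow": 'ad,min "x"'},
        {"files": 3, "walogs": 1, "entries": 50002, "size": "40M",
         "status": "ASSIGNED", "location": "host2:9997", "tableid": "3",
         "endrow": "helpful"},
    ]
    print(most_files(write_tablets(rows))["endrow"])
```

One caveat: CSV quoting keeps an embedded newline inside a quoted field, so an end row containing a newline would still span two physical lines. A strictly single-line format would need to escape or encode such rows first.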



--
This message was sent by Atlassian JIRA
(v6.2#6252)
