accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4500) Implement visibility histograms as a table feature
Date Wed, 19 Oct 2016 06:54:58 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15587868#comment-15587868 ]

Christopher Tubbs commented on ACCUMULO-4500:
---------------------------------------------

I would like to suggest we avoid thinking about this feature in terms of histograms. Histograms
are only one possible application of what was discussed on the mailing list thread. Granted,
they were what started the conversation, but the concepts, APIs, and features we'd have to
implement to support histograms should be much more generic, supporting a wider variety of
use cases.

The basic functionality needed to support this feature, as discussed on the mailing list,
could be generalized simply as "named counters", and I think, for simplicity's sake, we should
limit the feature to a mapping of names (type: String) to counts (type: signed Long).
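
As a rough sketch of what "named counters" could look like for a single file (the class
and method names below are hypothetical, for illustration only, and are not an existing
Accumulo API):

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical holder for the named counters of one file; not an Accumulo API.
public class NamedCounters {
  // name (String) -> count (signed Long)
  private final Map<String, Long> counts = new TreeMap<>();

  // Add delta to the named counter, creating it if absent.
  // delta may be negative, since counts are signed.
  public void increment(String name, long delta) {
    counts.merge(name, delta, Long::sum);
  }

  // Read-only view of the current counters.
  public Map<String, Long> asMap() {
    return Collections.unmodifiableMap(counts);
  }
}
```

A visibility histogram would then be just one use of this: the counter name is the
visibility expression, incremented once per entry written with that visibility.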

Additionally, I was thinking about this today, and I think that when this information is
exposed in the client API, it should be retrievable through a user-supplied aggregation/combiner
function. The reasoning is that client code doesn't normally deal with things at the granularity
of files, but rather at the granularity of tablets, ranges, and tables. That should probably be
true for any new API to retrieve these data as well. If that's the case, there will need to be
some mechanism to aggregate the data from multiple files for the requested range/tablet/table.
A summation function would probably be the most common, but certainly not the only useful,
aggregation function.
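
To make the aggregation idea concrete, here is a minimal sketch, assuming the per-file data
is available as one name-to-count map per file. CounterAggregator and its aggregate method
are hypothetical names, not an existing or proposed Accumulo API; the combiner is modeled
with the JDK's LongBinaryOperator:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.LongBinaryOperator;

// Hypothetical helper; illustrates combining per-file counters for a range/tablet/table.
public class CounterAggregator {
  // Merge the counters from several files into one map. When the same name
  // appears in more than one file, the user-supplied combiner decides the result.
  public static Map<String, Long> aggregate(List<Map<String, Long>> perFile,
      LongBinaryOperator combiner) {
    Map<String, Long> result = new TreeMap<>();
    for (Map<String, Long> fileCounts : perFile) {
      for (Map.Entry<String, Long> e : fileCounts.entrySet()) {
        result.merge(e.getKey(), e.getValue(), (a, b) -> combiner.applyAsLong(a, b));
      }
    }
    return result;
  }
}
```

Passing Long::sum as the combiner gives the summation case; passing Math::max would instead
report the largest per-file count for each name.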

> Implement visibility histograms as a table feature
> --------------------------------------------------
>
>                 Key: ACCUMULO-4500
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4500
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client, tserver
>            Reporter: Josh Elser
>
> Add support to quickly extract a histogram of all of the visibilities stored in an Accumulo
> table.
> DISCUSS: https://lists.apache.org/thread.html/df5e764362a95277344fd2731a432e9fafc60595e7d30015d9a56b9c@%3Cdev.accumulo.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
