accumulo-dev mailing list archives

From "John Vines (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-452) Generalize locality groups
Date Thu, 08 Mar 2012 15:23:58 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225259#comment-13225259
] 

John Vines commented on ACCUMULO-452:
-------------------------------------

This feature by itself is nice, but it doesn't really do a whole lot without supporting expressions
( https://issues.apache.org/jira/browse/ACCUMULO-164 ). But once you start allowing expressions,
you need to support an ordering among locality groups, because data can only exist in a single
locality group and overlapping expressions need a tie-breaker. This can increase complexity for
the user, or at the very least will make the API more kludgy.

I'm all for making things pluggable, but we need to design it so that things
are not easily borked by the user. This means either forcing the interface to handle only
one locality group at a time or redesigning RFile to allow writing to different locality groups.
We also need to make sure we can optimize scan queries over data that pre-dates the locality group.
Right now, it's really nice that data can only belong to one locality group. With this change,
all existing data would have to be checked to see if it belongs in the new locality group, which
will be a big performance hit. Perhaps we should encourage a major compaction (majc) after
applying a new locality group.
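
The ordering and pre-dating concerns above can be sketched in a few lines. This is illustrative Python, not Accumulo code: `assign_group`, `groups_to_read`, and the idea of a file recording a rule-set version are all hypothetical names and assumptions made for the example. It assumes each locality group is defined by a predicate and that rules are tried in order, so an entry lands in exactly one group.

```python
from typing import Callable, Dict, List, Tuple

# A rule is (group name, predicate over an entry); all names are hypothetical.
Rule = Tuple[str, Callable[[Dict], bool]]


def assign_group(entry: Dict, rules: List[Rule], default: str = "default") -> str:
    """Return the first group whose predicate matches, so each entry
    belongs to exactly one group even when predicates overlap."""
    for name, matches in rules:
        if matches(entry):
            return name
    return default


def groups_to_read(file_version: int, current_version: int,
                   file_groups: List[str], wanted: List[str]) -> List[str]:
    """Pick which groups of a file a scan must open.

    If the file was written under the current rules, groups outside
    `wanted` can be skipped. Otherwise the file's grouping cannot be
    trusted and every group has to be read and filtered entry by entry.
    """
    if file_version == current_version:
        return [g for g in file_groups if g in wanted]
    return list(file_groups)


rules: List[Rule] = [
    ("recent", lambda e: e["timestamp"] >= 1000),
    ("metadata", lambda e: e["family"] == "meta"),
]

# This entry matches both predicates; rule order decides the group.
entry = {"family": "meta", "timestamp": 1500}
print(assign_group(entry, rules))                  # recent
print(assign_group(entry, list(reversed(rules))))  # metadata

# Same rule version: only the wanted group is opened.
print(groups_to_read(2, 2, ["recent", "metadata", "default"], ["recent"]))
# Older rule version: everything in the file must be checked.
print(groups_to_read(1, 2, ["default"], ["recent"]))
```

In this sketch, a major compaction would rewrite the old file under the current rules, after which scans could go back to skipping groups.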
                
> Generalize locality groups
> --------------------------
>
>                 Key: ACCUMULO-452
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-452
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>             Fix For: 1.5.0
>
>
> Locality groups are a neat feature, but there is no reason to limit partitioning to column
> families.  Data could be partitioned based on any criteria.  For example, if a user is interested
> in querying recent data and ageing off old data, partitioning locality groups based on timestamp
> would be useful.  This could be accomplished by letting users specify a partitioner plugin
> that is used at compaction and scan time.  Scans would need an ability to pass options to
> the partitioner.
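
A minimal sketch of what such a timestamp partitioner might look like, written in Python for brevity. Nothing here is an Accumulo API: `Entry`, `TimestampPartitioner`, `group_for`, `groups_for_scan`, and the `min_timestamp` scan option are hypothetical names chosen for illustration. The sketch assumes the plugin is consulted once per entry at compaction time and once per scan to decide which groups to open.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entry:
    row: str
    timestamp: int  # e.g. milliseconds since the epoch
    value: str


class TimestampPartitioner:
    """Hypothetical plugin that splits entries into 'recent' and 'old'."""

    def __init__(self, cutoff: int) -> None:
        self.cutoff = cutoff

    def group_for(self, entry: Entry) -> str:
        # Compaction time: place each entry in exactly one group.
        return "recent" if entry.timestamp >= self.cutoff else "old"

    def groups_for_scan(self, options: Dict[str, int]) -> List[str]:
        # Scan time: `min_timestamp` is an assumed option passed by the scan.
        if options.get("min_timestamp", 0) >= self.cutoff:
            return ["recent"]
        return ["recent", "old"]


def compact(entries: List[Entry], p: TimestampPartitioner) -> Dict[str, List[Entry]]:
    """Write entries into per-group lists, sorted by row then newest first."""
    groups: Dict[str, List[Entry]] = {}
    for e in sorted(entries, key=lambda e: (e.row, -e.timestamp)):
        groups.setdefault(p.group_for(e), []).append(e)
    return groups


def scan(groups: Dict[str, List[Entry]], p: TimestampPartitioner,
         options: Dict[str, int]) -> List[Entry]:
    """Read only the groups the partitioner selects, then filter entries."""
    min_ts = options.get("min_timestamp", 0)
    out: List[Entry] = []
    for name in p.groups_for_scan(options):
        out.extend(e for e in groups.get(name, []) if e.timestamp >= min_ts)
    return out
```

With a cutoff of 200, a scan that passes `min_timestamp=250` would open only the "recent" group and never touch "old"; ageing off old data would amount to dropping the "old" group.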

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
