accumulo-dev mailing list archives

From "Keith Turner (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-452) Generalize locality groups
Date Thu, 08 Mar 2012 21:19:57 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225531#comment-13225531 ]

Keith Turner commented on ACCUMULO-452:
---------------------------------------

Users do want this capability.  They keep asking for it.  We do turn around and tell
them to sort their data differently.  They don't always like that answer.  The intent of this
is to meet a user need.

Storing temporal information in the column family is a possibility.  It would work well for
some cases, like having two locality groups, one that's the current month and another that's
everything else.  You put the month in the column family and reconfigure the locality groups
every month.
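
As a rough illustration of the monthly reconfiguration, here is a minimal Python sketch.
It assumes a hypothetical layout where the column family is the month written as YYYY-MM,
and it assumes that families not listed in any group end up in the default locality group,
so "everything else" needs no explicit entry.  The function names are made up for this
example; they are not part of Accumulo.

```python
from datetime import date
from typing import Dict, Set


def month_family(d: date) -> str:
    """Column family carrying the month, e.g. '2012-03' (hypothetical layout)."""
    return d.strftime("%Y-%m")


def locality_groups_for(today: date) -> Dict[str, Set[str]]:
    """Build the group -> column families mapping for the current month.

    Only the current month gets its own group.  Older months are left out
    on purpose, on the assumption that unlisted families land in the
    default locality group.
    """
    return {"current": {month_family(today)}}
```

A monthly job could recompute this mapping and hand it to whatever sets the table's
locality groups (in the Java client that would presumably be
TableOperations.setLocalityGroups), after which compactions regroup the data.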

However, if you would like something like LG1 = < day old, LG2 = < month old, LG3 =
< year old, this would not be possible w/ the current locality group implementation.  However,
ACCUMULO-164 may make this possible: store time to the day in the column family.  John pointed
out one problem w/ this: it's hard to automatically determine that patterns match disjoint
sets.  I need to think through ACCUMULO-164 some more and see what the possible gotchas are.
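
To make the tiering concrete, here is a small Python sketch of assigning a day-stamped
column family to LG1/LG2/LG3 by age.  It is not how ACCUMULO-164 works; it only shows the
intended grouping.  The family format (ISO date), the 30 and 365 day limits, and the
"default" fallback are assumptions for the example.  Note that the three conditions
overlap as written (anything under a day old is also under a month old), so the sketch
relies on checking them in order, which is the kind of disjointness a pattern-based
configuration would somehow have to guarantee.

```python
from datetime import date

# (group name, exclusive upper bound on age in days), checked in order.
# The 30 and 365 day limits are assumptions standing in for "month" and "year".
TIERS = [("LG1", 1), ("LG2", 30), ("LG3", 365)]


def tier_for(family_day: str, today: date) -> str:
    """Return the locality group for a column family holding an ISO day.

    The first matching tier wins, which is what keeps the groups disjoint
    even though the raw age conditions overlap.
    """
    age_days = (today - date.fromisoformat(family_day)).days
    for name, limit in TIERS:
        if age_days < limit:
            return name
    return "default"  # older than every tier
```

Because the result depends on today's date, the same family moves from LG1 to LG2 to LG3
over time, so any static configuration would need to be refreshed daily.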

If you have to duplicate the data in the timestamp into your column family to accomplish your
goals, does this indicate a problem with the model?  I do not think it's clean, but it's ok
w/ me.

  


                
> Generalize locality groups
> --------------------------
>
>                 Key: ACCUMULO-452
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-452
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>             Fix For: 1.5.0
>
>         Attachments: PartitionerDesign.txt
>
>
> Locality groups are a neat feature, but there is no reason to limit partitioning to column
> families.  Data could be partitioned based on any criteria.  For example, if a user is interested
> in querying recent data and ageing off old data, partitioning locality groups based on timestamp
> would be useful.  This could be accomplished by letting users specify a partitioner plugin
> that is used at compaction and scan time.  Scans would need an ability to pass options to
> the partitioner.
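
One way to picture the partitioner idea from the description is the Python sketch below.
It is purely hypothetical and is not taken from PartitionerDesign.txt: the class, its
method names, and the "minTimestamp" scan option are all invented for illustration.  The
partitioner is consulted once at compaction time to place each entry in a group, and once
at scan time to decide which groups need to be read at all.

```python
from typing import Dict, List, Set, Tuple

Entry = Tuple[str, int, str]  # (key, timestamp in ms, value)


class TimestampPartitioner:
    """Hypothetical plugin that splits entries into 'recent' and 'old'."""

    def __init__(self, cutoff_ms: int):
        self.cutoff_ms = cutoff_ms

    def partition(self, timestamp_ms: int) -> str:
        """Compaction time: pick the group an entry is written to."""
        return "recent" if timestamp_ms >= self.cutoff_ms else "old"

    def groups_for_scan(self, options: Dict[str, str]) -> Set[str]:
        """Scan time: use scan options to skip groups that cannot match."""
        min_ts = options.get("minTimestamp")  # hypothetical option name
        if min_ts is not None and int(min_ts) >= self.cutoff_ms:
            return {"recent"}
        return {"recent", "old"}


def compact(entries: List[Entry], p: TimestampPartitioner) -> Dict[str, List[Entry]]:
    """Write each entry into the group chosen by the partitioner."""
    groups: Dict[str, List[Entry]] = {}
    for entry in entries:
        groups.setdefault(p.partition(entry[1]), []).append(entry)
    return groups


def scan(groups: Dict[str, List[Entry]], p: TimestampPartitioner,
         options: Dict[str, str]) -> List[Entry]:
    """Read only the groups the partitioner says are needed.

    This prunes whole groups; it does not filter individual entries.
    """
    wanted = p.groups_for_scan(options)
    return [e for g in sorted(wanted) for e in groups.get(g, [])]
```

In this sketch a scan that passes a minTimestamp at or above the cutoff never touches the
"old" group, which is the benefit the description is after for queries on recent data.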

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
