accumulo-dev mailing list archives

From "Aaron Cordova (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-452) Generalize locality groups
Date Thu, 08 Mar 2012 13:45:58 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225204#comment-13225204 ]

Aaron Cordova commented on ACCUMULO-452:
----------------------------------------

I know this might sound crazy, but 'more general' doesn't always mean better. Generality has
a cost, and that cost is complexity. For example, several people have read the MapReduce paper
and said 'oh, you can easily create a more general computation framework than that, one that
includes message passing etc.', and those people miss the point of why MapReduce is so widely
adopted and why more general systems like MPI are not: its simplicity.

In this case, the cost is that the user now has to decide at what level to group data to get
the locality they desire, and pass options where they normally would not. Users can already
use the column family to store whatever data they want, knowing that the data they store in
column families can be used for physical partitioning.

So I suppose I'm looking for more justification before adding more complexity to an admittedly
already more general/complex implementation of BigTable.
                
> Generalize locality groups
> --------------------------
>
>                 Key: ACCUMULO-452
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-452
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>             Fix For: 1.5.0
>
>
> Locality groups are a neat feature, but there is no reason to limit partitioning to column
families.  Data could be partitioned based on any criteria.  For example, if a user is interested
in querying recent data and ageing off old data, partitioning locality groups based on timestamp
would be useful.  This could be accomplished by letting users specify a partitioner plugin
that is used at compaction and scan time.  Scans would need the ability to pass options to
the partitioner.
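
To make the proposal above concrete, here is a minimal Java sketch of what a partitioner plugin could look like. Everything in it is an assumption made for illustration: the Partitioner interface, the TimestampPartitioner class, the Entry type, and the "recentOnly" scan option are hypothetical names and do not exist in Accumulo. The fixed cutoff is also a simplification, since a real "recent" boundary would move over time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionerSketch {

    /** Hypothetical stand-in for a key/value entry; only the fields this sketch needs. */
    static final class Entry {
        final String row;
        final long timestamp;
        Entry(String row, long timestamp) { this.row = row; this.timestamp = timestamp; }
    }

    /** Hypothetical plugin interface; not part of any Accumulo release. */
    interface Partitioner {
        /** Compaction time: name the group an entry should be written to. */
        String groupFor(Entry e);
        /** Scan time: use the scan's options to decide which groups must be read. */
        List<String> groupsToRead(Map<String, String> scanOptions);
    }

    /** Splits entries into "recent" and "old" groups around a fixed cutoff timestamp. */
    static final class TimestampPartitioner implements Partitioner {
        private final long cutoff;
        TimestampPartitioner(long cutoff) { this.cutoff = cutoff; }

        @Override
        public String groupFor(Entry e) {
            return e.timestamp >= cutoff ? "recent" : "old";
        }

        @Override
        public List<String> groupsToRead(Map<String, String> scanOptions) {
            List<String> groups = new ArrayList<>();
            groups.add("recent");
            // Only touch the "old" group when the scan did not ask for recent data only.
            if (!"true".equals(scanOptions.get("recentOnly"))) {
                groups.add("old");
            }
            return groups;
        }
    }

    /** Simulates a compaction: each entry lands in the group the partitioner picks. */
    static Map<String, List<Entry>> compact(List<Entry> entries, Partitioner p) {
        Map<String, List<Entry>> groups = new LinkedHashMap<>();
        for (Entry e : entries) {
            groups.computeIfAbsent(p.groupFor(e), k -> new ArrayList<>()).add(e);
        }
        return groups;
    }

    /** Simulates a scan: only the groups selected by the partitioner are read. */
    static List<Entry> scan(Map<String, List<Entry>> groups, Partitioner p,
                            Map<String, String> scanOptions) {
        List<Entry> out = new ArrayList<>();
        for (String g : p.groupsToRead(scanOptions)) {
            out.addAll(groups.getOrDefault(g, new ArrayList<>()));
        }
        return out;
    }

    public static void main(String[] args) {
        Partitioner p = new TimestampPartitioner(100L);
        Map<String, List<Entry>> groups = compact(Arrays.asList(
                new Entry("a", 50L), new Entry("b", 150L), new Entry("c", 200L)), p);

        Map<String, String> opts = new HashMap<>();
        opts.put("recentOnly", "true");
        System.out.println(scan(groups, p, opts).size());            // prints 2
        System.out.println(scan(groups, p, new HashMap<>()).size()); // prints 3
    }
}
```

Note that the sketch also shows the cost raised in the comment above: the scan has to pass an option ("recentOnly") that it would not need with plain column-family locality groups.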

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
