accumulo-dev mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-452) Generalize locality groups
Date Thu, 08 Mar 2012 18:09:57 GMT


Todd Lipcon commented on ACCUMULO-452:

FWIW, in HBase we maintain the min/max timestamp per HFile and use it to cull files at query
time if the query has a timestamp range predicate. More recently we also added support for culling
these files at compaction time without having to rewrite them, if a file falls completely
outside the configured table TTL. (variously related to HBASE-5199, HBASE-5274, HBASE-5010,

I also somewhat agree with Aaron's sentiment above: these timestamp optimizations were pretty
easy to do in HBase because the timestamp is a first-class feature rather than something
implemented by a more general framework.
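
To make the file-level culling described in this comment concrete, here is a minimal sketch. It assumes each file carries a min/max timestamp summary. All names (TimestampCulling, FileMeta, filesToScan, isFullyExpired) are hypothetical and are not taken from the HBase or Accumulo code bases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not HBase's or Accumulo's actual API.
final class TimestampCulling {

    /** Per-file summary: the smallest and largest cell timestamps the file contains. */
    static final class FileMeta {
        final String name;
        final long minTs;
        final long maxTs;

        FileMeta(String name, long minTs, long maxTs) {
            this.name = name;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }
    }

    /** Query time: keep only files whose [minTs, maxTs] overlaps the query range [start, end]. */
    static List<FileMeta> filesToScan(List<FileMeta> files, long start, long end) {
        List<FileMeta> keep = new ArrayList<>();
        for (FileMeta f : files) {
            if (f.maxTs >= start && f.minTs <= end) {
                keep.add(f);
            }
        }
        return keep;
    }

    /** Compaction time: a file is wholly expired when even its newest cell is older than the TTL. */
    static boolean isFullyExpired(FileMeta f, long nowMillis, long ttlMillis) {
        return f.maxTs < nowMillis - ttlMillis;
    }
}
```

Under these assumptions, a wholly expired file can be dropped by consulting its metadata alone, which is why no rewrite is needed. A file that is only partly expired would still have to go through a normal compaction.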
> Generalize locality groups
> --------------------------
>                 Key: ACCUMULO-452
>                 URL:
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>             Fix For: 1.5.0
> Locality groups are a neat feature, but there is no reason to limit partitioning to column
families.  Data could be partitioned based on any criteria.  For example, if a user is interested
in querying recent data and aging off old data, partitioning locality groups based on timestamp
would be useful.  This could be accomplished by letting users specify a partitioner plugin
that is used at compaction and scan time.  Scans would need the ability to pass options to
the partitioner.
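
One possible shape for the partitioner plugin proposed above is sketched here. The issue does not define an interface, so everything is an assumption made for illustration: the Partitioner interface, its groupFor and mayMatch methods, the TimeWindowPartitioner example, and the "minTimestamp" scan option.

```java
import java.util.Map;

final class PartitionerSketch {

    /** Hypothetical plugin interface; the issue text does not specify one. */
    interface Partitioner {
        /** Compaction time: name of the group an entry with this timestamp is written to. */
        String groupFor(long timestamp);

        /** Scan time: whether the group could hold data matching the scan's options. */
        boolean mayMatch(String group, Map<String, String> scanOptions);
    }

    /** Example partitioner: buckets entries into fixed-width time windows. */
    static final class TimeWindowPartitioner implements Partitioner {
        private final long windowMillis;

        TimeWindowPartitioner(long windowMillis) {
            this.windowMillis = windowMillis;
        }

        @Override
        public String groupFor(long timestamp) {
            // floorDiv keeps negative timestamps in the correct window.
            return "w" + Math.floorDiv(timestamp, windowMillis);
        }

        @Override
        public boolean mayMatch(String group, Map<String, String> scanOptions) {
            long window = Long.parseLong(group.substring(1));
            // "minTimestamp" is an assumed option name; absent means no lower bound.
            long minTs = Long.parseLong(
                scanOptions.getOrDefault("minTimestamp", String.valueOf(Long.MIN_VALUE)));
            // Last timestamp covered by this window (overflow is ignored in this sketch).
            long windowEnd = window * windowMillis + windowMillis - 1;
            return windowEnd >= minTs;
        }
    }
}
```

With this shape, a scan for recent data would pass its lower bound as an option and skip every group whose window ends before it, while aging off old data would amount to dropping whole groups.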

This message is automatically generated by JIRA.

