accumulo-dev mailing list archives

From Adam Fuchs <>
Subject Re: Time based locality groups
Date Thu, 08 Mar 2012 12:01:49 GMT
Yes, yes, yes, this is going to be a very useful feature set! (I told Andie
all about it and she agreed whole-heartedly)

I think that step one needs to be figuring out how to expose this in the
API, and the iterator interface is the place to start. Once we have defined
an abstraction layer, we can experiment with lots of different
implementations at the RFile layer. If we are going to broadly extend these
locality group-type filtering optimizations, it might make sense to drop
the specialization for column family filtering that is part of the
SortedKeyValueIterator seek method. Then we could support column family
filtering, timestamp filtering, cell-level security filtering, etc. as
separate iterators. The specialization for column family filtering is our
current mechanism for optimizing that operation in the RFile, but we could
be a little smarter about how we do this.

What I'm suggesting is that when we construct an iterator tree we look for
iterators on top of the RFile reader that we can collapse and implement as
part of the RFile reader. So, if a column family filtering iterator is on
top of the RFile then we can grab its set of column families and replace it
with the filtered RFile reader. If we add a little knowledge about
commutativity of iterators then we can even collapse filters that are not
directly on top of the RFile reader (like there might be a merging iterator
between the RFile reader and the column family filtering iterator). One way
we could implement this is by changing the factory method that generates
iterators. When this method calls the init method on a newly constructed
iterator it can instead push that iterator down through the tree and return
the source iterator instead. We might be able to specialize the iterator
environment to signal the optimization and avoid any changes to the API.
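
As a rough sketch of the collapsing idea described above: the factory method
below folds a column family filter into the reader when the filter sits
directly on top of it, and falls back to a separate filter otherwise. This
does not use Accumulo's actual iterator API. Source, SketchReader,
FamilyFilter and addFamilyFilter are hypothetical names invented for the
illustration, and only the "directly on top" case is covered, not the
commutativity-based pushdown through intermediate iterators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PushdownSketch {
    /** Hypothetical stand-in for an iterator: returns {family, value} pairs. */
    interface Source {
        List<String[]> read();
    }

    /** Hypothetical stand-in for the RFile reader; families == null means "all families". */
    static final class SketchReader implements Source {
        final List<String[]> entries;
        final Set<String> families;

        SketchReader(List<String[]> entries, Set<String> families) {
            this.entries = entries;
            this.families = families;
        }

        @Override
        public List<String[]> read() {
            List<String[]> out = new ArrayList<>();
            for (String[] e : entries) {
                if (families == null || families.contains(e[0])) {
                    out.add(e);
                }
            }
            return out;
        }
    }

    /** Generic filter, used only when the source is not a reader. */
    static final class FamilyFilter implements Source {
        final Source source;
        final Set<String> families;

        FamilyFilter(Source source, Set<String> families) {
            this.source = source;
            this.families = families;
        }

        @Override
        public List<String[]> read() {
            List<String[]> out = new ArrayList<>();
            for (String[] e : source.read()) {
                if (families.contains(e[0])) {
                    out.add(e);
                }
            }
            return out;
        }
    }

    /** Factory method: collapse the filter into the reader instead of stacking it. */
    static Source addFamilyFilter(Source source, Set<String> families) {
        if (source instanceof SketchReader) {
            SketchReader reader = (SketchReader) source;
            Set<String> merged = new HashSet<>(families);
            if (reader.families != null) {
                // The reader is already filtered: keep only families allowed by both.
                merged.retainAll(reader.families);
            }
            return new SketchReader(reader.entries, merged);
        }
        return new FamilyFilter(source, families);
    }

    public static void main(String[] args) {
        List<String[]> entries = Arrays.asList(
            new String[] {"attr", "v1"},
            new String[] {"data", "v2"},
            new String[] {"attr", "v3"});
        Source top = addFamilyFilter(new SketchReader(entries, null),
            new HashSet<>(Arrays.asList("attr")));
        // The returned source is the filtered reader itself, not a wrapper.
        System.out.println(top instanceof SketchReader);
        System.out.println(top.read().size());
    }
}
```

The caller only ever sees a Source, so whether the filter was collapsed or
stacked is invisible to it, which is the property that would let this happen
without an API change.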

Once we get to the point of optimizing the RFile, I think what we might
find is that the RFile entries are naturally grouped by time into blocks in
many cases. A simple timestamp-based block filter might be optimal in these
cases. This is what I was talking about with introducing extra features
(timestamp ranges, etc) into the RFile index. I think it also makes sense
to include some aggregate cell-level security markings here.
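
A minimal sketch of such a timestamp-based block filter, assuming each index
entry records the minimum and maximum timestamp of the keys in its block.
This is not the real RFile index format. BlockIndexEntry and blocksToRead are
hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimestampBlockFilterSketch {
    /** Hypothetical index entry: one per data block, with the timestamp range of its keys. */
    static final class BlockIndexEntry {
        final int blockId;
        final long minTimestamp;
        final long maxTimestamp;

        BlockIndexEntry(int blockId, long minTimestamp, long maxTimestamp) {
            this.blockId = blockId;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** Returns the ids of blocks whose timestamp range overlaps [startTs, endTs], both inclusive. */
    static List<Integer> blocksToRead(List<BlockIndexEntry> index, long startTs, long endTs) {
        List<Integer> selected = new ArrayList<>();
        for (BlockIndexEntry e : index) {
            boolean overlaps = e.maxTimestamp >= startTs && e.minTimestamp <= endTs;
            if (overlaps) {
                selected.add(e.blockId);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<BlockIndexEntry> index = Arrays.asList(
            new BlockIndexEntry(0, 100, 199),   // entirely older than the query range
            new BlockIndexEntry(1, 200, 299),   // overlaps the query range
            new BlockIndexEntry(2, 150, 450));  // wide block, must still be read
        System.out.println(blocksToRead(index, 250, 300));
    }
}
```

The filter only pays off when blocks have narrow timestamp ranges. A block
whose keys span a wide time range, like the third entry above, is selected by
almost every query.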

One other thing to think about: I like the simpler iterator interface, but
there are some implications to modifying the column family filter set
during a query that might be tricky. Does anybody change the column family
set mid-query now, anyway? Is that something we would want to support for
timestamps or other filters?


On Wed, Mar 7, 2012 at 9:05 PM, Keith Turner <> wrote:

> I was thinking of something like the following.   A locality group in
> an rfile would be comprised of arbitrary locality group metadata and
> key value pairs.
> interface Partitioner {
>    void init(LocalityGroupConfig lgc);
>
>    // Determines which locality groups a compaction should create in the
>    // output RFile. Receives metadata about the locality groups in the
>    // files being compacted.
>    List<LocalityGroupInfo> getLocalityGroupsToCreate(List<LocalityGroupMetadata> lgml);
>
>    // The following three methods are used to write data into a new RFile
>    // locality group.
>    void startLocalityGroup(LocalityGroupInfo lgi);
>
>    // All data is passed through this method. It serves two purposes: it
>    // decides whether data goes into the locality group at all, and for
>    // the data that is accepted it builds up the metadata for the
>    // locality group being created.
>    boolean acceptKeyValue(Key k, Value v);
>
>    // Once all data is written, ask for the metadata and write it to the
>    // RFile.
>    LocalityGroupMetadata finishLocalityGroup();
>
>    // Selects which locality groups in an RFile should be read by a scan
>    // or compaction. Receives info about the existing locality groups in
>    // the RFile.
>    List<LocalityGroupInfo> getLocalityGroupsToRead(List<LocalityGroupMetadata> lgml, ScanOptions so);
> }
> Keith
> On Wed, Mar 7, 2012 at 7:39 PM, Eric Newton <> wrote:
> > Something like this:
> >
> >    partition, meta = partitioner.choose(key, value, meta)
> >
> > The partition can be a string, which is used to look up the partitions'
> > configuration.  The meta information can be used by queries to avoid
> > including files from the partition in queries.  The metadata would be
> > saved at the close of the file.
> >
> > During a query, files could be filtered based on some arbitrary query
> > data:
> >
> >    files = partitioner.selectFiles(files, query)
> >
> > I like it! It might also be nice to indicate some sort of "estimated"
> > percent of keys processed, and the type of compaction occurring (flush,
> > partial, everything):
> >
> >    partition, meta = partitioner.choose(key, value, meta, percent,
> > compactionType)
> >
> > Is there any other tablet-level information we might want to provide to a
> > partitioner?  Perhaps the source partition of the key/value?
> >
> > -Eric
> >
> > On Wed, Mar 7, 2012 at 6:54 PM, Keith Turner <> wrote:
> >
> >> Replying to myself :)
> >>
> >> The more I think about this, it seems that locality groups could be
> >> handled by plugins that can partition the data and select locality
> >> groups in any way they like. Want locality groups based on row suffix,
> >> go ahead and write the plugin.
> >>
> >> The plugin would be used for compaction time partitioning and scan
> >> time locality group selection.   Users could pass options to the
> >> locality group plugin at scan time just like options are passed to
> >> iterators.    Maybe this is an extension or further generalization of
> >> the existing iterator framework, I have not thought through that far
> >> enough.
> >>
> >> Keith
> >>
> >> On Wed, Mar 7, 2012 at 6:22 PM, Keith Turner <> wrote:
> >> > We regularly have questions from users about querying new data and
> >> > aging off old data.  I was thinking about how we could better support
> >> > this need in 1.5.  One thing that occurred to me is having locality
> >> > groups that were based on timestamp instead of column family.  For
> >> > example a locality group for each month.   Alternatively we could have
> >> > groups for < day old, < week old, < month old, < year old.  Would
> >> > need a way for users to define these.
> >> >
> >> > This would make scanning a table for recent data much faster.  Also
> >> > dropping old data could be made much faster by just dropping entire
> >> > locality groups at compaction time.
> >> >
> >> > One thing that irks me about this is : Should column family and time
> >> > based locality groups be mutually exclusive (i.e. an RFile has one or
> >> > the other, not both)?  If they are not, then the order in which the
> >> > data is partitioned is important for query performance and would
> >> > probably need to be user configurable.
> >> >
> >> > Thoughts?
> >> >
> >> > Keith
> >>
