accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: Is anyone using serialized iterators to provide provenance data?
Date Thu, 16 May 2013 20:56:11 GMT
On Wed, May 15, 2013 at 9:15 PM, Christopher <ctubbsii@apache.org> wrote:

> Seems to me this is nothing more than "clone and also add these
> per-table iterators on all scopes". Might be a neat little utility to
>

Clone has always had this.  When cloning a table, a set of props to set and
exclude (not copy from source) can be specified.  These config changes are
made before any tablet in the clone is ever brought online.
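
A rough sketch of how the "props to set" for such a clone might be built, assuming
per-table iterators are configured through table.iterator.<scope>.<name> properties.
The iterator name genomeFilter, the class com.example.GenomeFilter, the option
minQuality, and the table names are hypothetical placeholders. The commented-out call
assumes the Java client's TableOperations.clone(src, dest, flush, propertiesToSet,
propertiesToExclude) and a live connector named conn.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CloneIteratorProps {
    // Iterator scopes: scan time, minor compaction, major compaction.
    static final String[] SCOPES = {"scan", "minc", "majc"};

    // Builds the table properties that configure one iterator on all scopes.
    static Map<String, String> iteratorProps(String name, int priority,
                                             String className,
                                             Map<String, String> options) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String scope : SCOPES) {
            String base = "table.iterator." + scope + "." + name;
            // The property value is "<priority>,<iterator class name>".
            props.put(base, priority + "," + className);
            // Each iterator option gets its own ".opt.<key>" property.
            for (Map.Entry<String, String> e : options.entrySet()) {
                props.put(base + ".opt." + e.getKey(), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical iterator and option, for illustration only.
        Map<String, String> propsToSet = iteratorProps(
            "genomeFilter", 30, "com.example.GenomeFilter",
            Map.of("minQuality", "20"));
        propsToSet.forEach((k, v) -> System.out.println(k + "=" + v));

        // With a live connector (not created here), the clone would be roughly:
        // conn.tableOperations().clone("tableA", "tableB", true,
        //                              propsToSet, new java.util.HashSet<String>());
    }
}
```

Because the properties are passed in with the clone request, the iterators are in
place before any tablet of the new table comes online.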


> wrap those features into a single step from the user's perspective.
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, May 15, 2013 at 8:58 PM, Josh Elser <josh.elser@gmail.com> wrote:
> > Oh, I see what you mean. Table B was created from table A with a function F
> > (where F is some collection of iterators like you said).
> >
> > It could be a neat application of the clone command. Storing that
> > information on table B is some exercise in where to put that immutable
> > information (that's me ignoring that problem :P).
> >
> > You say git: do you actually intend to have a cheap replay ability? Or
> > merely be able to view the history and be able to work through the
> > transformations again?
> >
> > Seems reasonable for a 1.6 wish to me.
> >
> >
> > On 05/15/2013 08:44 PM, David Medinets wrote:
> >>
> >> I don't see those as covering the same ground. Let's say I have an
> >> Accumulo table for a given human's genome. As a scientist, I want to apply a
> >> set of filters to create a subset of the genome. This provides a transform
> >> from data-set A to data-set B. Since iterators were used for the transform,
> >> we could serialize the set of iterators used by the transformation. Both
> >> data-sets are immutable. Think git for data-sets.
> >>
> >>
> >> On Wed, May 15, 2013 at 4:25 PM, Christopher <ctubbsii@apache.org> wrote:
> >>
> >>     I think this might relate to ACCUMULO-1397, in the form of providing a
> >>     mechanism to specify iterator profiles, or ACCUMULO-415.
> >>
> >>     --
> >>     Christopher L Tubbs II
> >>     http://gravatar.com/ctubbsii
> >>
> >>
> >>     On Wed, May 15, 2013 at 2:51 PM, David Medinets
> >>     <david.medinets@gmail.com> wrote:
> >>     > If you apply a set of iterators to one table to produce another, it seems
> >>     > possible to serialize the iterator stack alongside the new table in some
> >>     > catalog to provide provenance. The assumption is that the tables are
> >>     > immutable, I think. Is anyone doing this or has anyone thought about doing
> >>     > so? Just curious and wanted to ask before I forgot about the idea.
> >>
> >>
> >
>
