ignite-dev mailing list archives

From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: Data Snapshots in Ignite
Date Tue, 27 Oct 2015 01:53:50 GMT
On Mon, Oct 26, 2015 at 5:31 AM, Raul Kripalani <raulk@apache.org> wrote:

> Hi,
>
> Thanks all for chiming in. It seems like this feature could be of interest
> to the user community, so I've opened a ticket to continue maturing the
> idea there:
>
> https://issues.apache.org/jira/browse/IGNITE-1789
>
> We may need to create a Wiki page later to collaborate around specifics and
> design.
>

Thanks Raul. I agree that a Wiki page may be in order. I have responded in
the ticket. Take a look and see if you agree with my thinking.


> Regards,
>
> *Raúl Kripalani*
> PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> Messaging Engineer
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> http://blog.raulkr.net | twitter: @raulvk
>
> On Wed, Oct 21, 2015 at 10:06 AM, Raul Kripalani <raulk@apache.org> wrote:
>
> > Hey guys,
> >
> > LevelDB has a feature called Snapshots, which provides a consistent
> > read-only view of the DB at a given point in time, against which queries
> > can be executed.
> >
> > To my knowledge, this functionality doesn't exist in the world of open
> > source In-Memory Computing. Ignite could be an innovator here.
> >
> > Ignite Snapshots would allow queries, distributed closures, map-reduce
> > jobs, etc. It could be useful for Spark RDDs to avoid data shift while
> > the computation is taking place (not sure if there's already some form
> > of snapshotting, though). Same for IGFS.
> >
> > Example usage:
> >
> >     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().create();
> >
> >     // all three queries are executed against a view of the cache at the
> >     // point in time where it was snapshotted
> >     snapshot.query("select ...");
> >     snapshot.query("select ...");
> >     snapshot.query("select ...");
> >
> > In fact, it would be awesome to be able to logically save this snapshot
> > with a name so that later jobs, queries, etc. can run on top of it, e.g.:
> >
> >     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().create("abc");
> >
> >     // ...
> >     // in another module of a distributed system, or in another thread in
> >     // parallel, use the saved snapshot
> >     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().get("abc");
> >     ....
> >
> > Named snapshotting can be dangerous due to data retention, e.g. imagine
> > keeping a snapshot for 2 weeks! So we should force the user to specify a
> > TTL:
> >
> >     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().create("abc", 2, TimeUnit.HOURS);
> >
> > Such functionality would allow for "reporting checkpoints" and "time
> > travel", for example, where you want users to be able to query the data
> > as it stood 1 hour ago, 2 hours ago, etc.
> >
> > What do you think?
> >
> > P.S.: We do have some form of snapshotting in the Compute checkpointing
> > functionality – but my proposal is to generalise the notion.
> >
> > Regards,
> >
> > *Raúl Kripalani*
> > PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> > Messaging Engineer
> > http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> > http://blog.raulkr.net | twitter: @raulvk
> >
>
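
For illustration, the named-snapshot-with-TTL idea quoted above can be mimicked on a plain Java map. This is only a minimal sketch under loud assumptions: SnapshotSketch, Snapshots and Snapshot are hypothetical names and not Ignite APIs (the IgniteCacheSnapshot interface in the quoted message is itself only a proposal), and each snapshot here is a full eager copy of the map, which a real distributed implementation would very likely avoid.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

/** Toy illustration only: none of these types exist in Apache Ignite. */
public class SnapshotSketch {

    /** Read-only point-in-time copy of a map, with an expiry deadline. */
    static final class Snapshot<K, V> {
        private final Map<K, V> view;
        private final long expiresAtNanos;

        Snapshot(Map<K, V> source, long ttl, TimeUnit unit) {
            // Eager copy; assumption: no writes race with the copy itself,
            // otherwise the view is not guaranteed to be atomic.
            this.view = Collections.unmodifiableMap(new HashMap<>(source));
            this.expiresAtNanos = System.nanoTime() + unit.toNanos(ttl);
        }

        V get(K key) {
            return view.get(key);
        }

        boolean isExpired() {
            // Subtraction keeps the comparison safe if nanoTime wraps around.
            return System.nanoTime() - expiresAtNanos >= 0;
        }
    }

    /** Registry of named snapshots taken from one live map. */
    static final class Snapshots<K, V> {
        private final Map<K, V> live;
        private final Map<String, Snapshot<K, V>> named = new ConcurrentHashMap<>();

        Snapshots(Map<K, V> live) {
            this.live = live;
        }

        /** Copies the live map and stores the copy under the given name. */
        Snapshot<K, V> create(String name, long ttl, TimeUnit unit) {
            Snapshot<K, V> snapshot = new Snapshot<>(live, ttl, unit);
            named.put(name, snapshot);
            return snapshot;
        }

        /** Returns the named snapshot, or null if it is missing or expired. */
        Snapshot<K, V> get(String name) {
            Snapshot<K, V> snapshot = named.get(name);
            if (snapshot != null && snapshot.isExpired()) {
                named.remove(name, snapshot); // drop it so the copy can be collected
                return null;
            }
            return snapshot;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = new ConcurrentHashMap<>();
        cache.put("a", 1);

        Snapshots<String, Integer> snapshots = new Snapshots<>(cache);
        snapshots.create("abc", 2, TimeUnit.HOURS);

        cache.put("a", 2); // later write to the live data

        System.out.println(snapshots.get("abc").get("a")); // value seen by the snapshot
        System.out.println(cache.get("a"));                // value in the live map
    }
}
```

The mandatory TTL is what bounds data retention in this sketch: an expired snapshot is dropped on the next lookup, so a forgotten name cannot pin an old copy of the data indefinitely once someone asks for it again. A real design would also need background eviction for snapshots that are never looked up.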
