From: Raul Kripalani
Date: Thu, 22 Oct 2015 13:35:01 +0100
Subject: Re: Data Snapshots in Ignite
To: dev@ignite.apache.org

Hey Sergi,

I like your idea of AutoCloseable snapshots!

Regards,

*Raúl Kripalani*
PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
Messaging Engineer
http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
http://blog.raulkr.net | twitter: @raulvk

On Wed, Oct 21, 2015 at 1:51 PM, Sergi Vladykin wrote:

> Raul,
>
> Actually SQL indexes are already snapshotable. I'm not sure if it does make
> sense to make the whole cache (with full cache API support) snapshotable,
> but I like your idea about running multiple SQL statements against the same
> snapshot.
>
> Also I don't think that it is a good idea to keep snapshots for a long
> time, so I'd prefer to have a typical AutoCloseable API like:
>
> try (Snapshot s = ...) {
>     s.query(...);
>     s.query(...);
>     s.query(...);
> }
>
> Though I'm not sure when we will be able to get down to this.
>
> Sergi
>
> 2015-10-21 12:06 GMT+03:00 Raul Kripalani :
>
> > Hey guys,
> >
> > LevelDB has a functionality called Snapshots which provides a consistent
> > read-only view of the DB at a given point in time, against which queries
> > can be executed.
> >
> > To my knowledge, this functionality doesn't exist in the world of open
> > source In-Memory Computing. Ignite could be an innovator here.
> >
> > Ignite Snapshots would allow queries, distributed closures, map-reduce
> > jobs, etc. It could be useful for Spark RDDs to avoid data shift while the
> > computation is taking place (not sure if there's already some form of
> > snapshotting, though). Same for IGFS.
> >
> > Example usage:
> >
> > IgniteCacheSnapshot snapshot =
> >     ignite.cache("mycache").snapshots().create();
> >
> > // all three queries are executed against a view of the cache at the
> > // point in time where it was snapshotted
> > snapshot.query("select ...");
> > snapshot.query("select ...");
> > snapshot.query("select ...");
> >
> > In fact, it would be awesome to be able to logically save this snapshot
> > with a name so that later jobs, queries, etc.
> > can run on top of it, e.g.:
> >
> > IgniteCacheSnapshot snapshot =
> >     ignite.cache("mycache").snapshots().create("abc");
> >
> > // ...
> > // in another module of a distributed system, or in another thread in
> > // parallel, use the saved snapshot
> > IgniteCacheSnapshot snapshot =
> >     ignite.cache("mycache").snapshots().get("abc");
> > ....
> >
> > Named snapshotting can be dangerous due to data retention, e.g. imagine
> > keeping a snapshot for 2 weeks! So we should force the user to specify a
> > TTL:
> >
> > IgniteCacheSnapshot snapshot =
> >     ignite.cache("mycache").snapshots().create("abc", 2, TimeUnit.HOURS);
> >
> > Such functionality would allow for "reporting checkpoints" and "time
> > travel", for example, where you want users to be able to query the data as
> > it stood 1 hour ago, 2 hours ago, etc.
> >
> > What do you think?
> >
> > P.S.: We do have some form of snapshotting in the Compute checkpointing
> > functionality – but my proposal is to generalise the notion.
> >
> > Regards,
> >
> > *Raúl Kripalani*
> > PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> > Messaging Engineer
> > http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> > http://blog.raulkr.net | twitter: @raulvk
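
For reference, the try-with-resources shape and the mandatory TTL discussed in
this thread can be sketched in plain Java. This is only an illustration under
stated assumptions: CacheSnapshot and SnapshotDemo are hypothetical names, not
Ignite APIs, and the snapshot is modelled as an eager copy of a HashMap, whereas
a real implementation would more likely rely on versioned or copy-on-write
storage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical read-only, point-in-time view of a cache; NOT an Ignite API. */
final class CacheSnapshot<K, V> implements AutoCloseable {
    private final Map<K, V> frozen;     // private copy taken at creation time
    private final long expiresAtNanos;  // deadline derived from the mandatory TTL
    private boolean closed;

    CacheSnapshot(Map<K, V> live, long ttl, TimeUnit unit) {
        this.frozen = new HashMap<>(live);
        this.expiresAtNanos = System.nanoTime() + unit.toNanos(ttl);
    }

    /** Returns the values matching the filter, as they stood at snapshot time. */
    List<V> query(Predicate<V> filter) {
        if (closed) {
            throw new IllegalStateException("snapshot is closed");
        }
        if (System.nanoTime() - expiresAtNanos >= 0) {
            throw new IllegalStateException("snapshot TTL expired");
        }
        List<V> out = new ArrayList<>();
        for (V value : frozen.values()) {
            if (filter.test(value)) {
                out.add(value);
            }
        }
        return out;
    }

    /** Releases the copied data so the snapshot cannot be retained by accident. */
    @Override
    public void close() {
        closed = true;
        frozen.clear();
    }
}

class SnapshotDemo {
    public static void main(String[] args) {
        Map<String, Integer> cache = new HashMap<>();
        cache.put("a", 1);
        cache.put("b", 2);

        // The snapshot is released automatically when the try block exits.
        try (CacheSnapshot<String, Integer> s =
                 new CacheSnapshot<>(cache, 2, TimeUnit.HOURS)) {
            cache.put("c", 3); // later write, invisible to the snapshot
            System.out.println(s.query(v -> v > 0).size()); // counts "a" and "b" only
            System.out.println(s.query(v -> v > 1).size()); // counts "b" only
        }
    }
}
```

Both queries see the same two entries even though the cache is modified in
between, and any query after close() or after the TTL has elapsed fails fast
instead of silently pinning old data.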