apex-dev mailing list archives

From Gaurav Gupta <gau...@datatorrent.com>
Subject Re: Possibility of saving checkpoints on other distributed filesystems
Date Thu, 21 Jan 2016 17:49:42 GMT
Aniruddha,

Currently we don't have any support for that.

Thanks
Gaurav


On Thu, Jan 21, 2016 at 12:24 AM, Tushar Gosavi <tushar@datatorrent.com>
wrote:

> The default FSStorageAgent can be used, as it can work with a local filesystem,
> but as far as I know there is no support for specifying the directory
> through an XML file. By default it uses the application directory on HDFS.
>
> Not sure if we could specify a storage agent with its properties through the
> configuration at the DAG level.
>
> - Tushar.
>
>
> On Thu, Jan 21, 2016 at 12:14 PM, Aniruddha Thombare <
> aniruddha@datatorrent.com> wrote:
>
> > Hi,
> >
> > Do we have any storage agent which I can use readily, configurable through
> > dt-site.xml?
> >
> > I am looking for something which would save checkpoints in a mounted file
> > system [e.g. HA-NAS], which is basically just another directory for Apex.
> >
> >
> >
> >
> > Thanks,
> >
> >
> > Aniruddha
> >
> > On Wed, Jan 20, 2016 at 8:33 PM, Sandesh Hegde <sandesh@datatorrent.com>
> > wrote:
> >
> > > It is already supported; refer to the following JIRA for more information:
> > >
> > > https://issues.apache.org/jira/browse/APEXCORE-283
> > >
> > >
> > >
> > > On Tue, Jan 19, 2016 at 10:43 PM Aniruddha Thombare <
> > > aniruddha@datatorrent.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > Is it possible to save checkpoints in any other highly available
> > > > distributed file systems (which may be mounted directories across the
> > > > cluster) other than HDFS?
> > > > If yes, is it configurable?
> > > >
> > > > AFAIK, there is no configurable option available to achieve that.
> > > > If that's the case, can we have that feature?
> > > >
> > > > This is with the intention to recover the applications faster and do away
> > > > with HDFS's small files problem as described here:
> > > >
> > > > http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
> > > >
> > > >
> > > > http://snowplowanalytics.com/blog/2013/05/30/dealing-with-hadoops-small-files-problem/
> > > > http://inquidia.com/news-and-info/working-small-files-hadoop-part-1
> > > >
> > > > If we could save checkpoints in some other distributed file system (or even
> > > > a HA NAS box) geared for small files, we could achieve -
> > > >
> > > >    - Better performance of NN & HDFS for the production usage (read:
> > > >    production data I/O & not temp files)
> > > >    - Faster application recovery in case of planned shutdown / unplanned
> > > >    restarts
> > > >
> > > > Please send your comments, suggestions, or ideas.
> > > >
> > > > Thanks,
> > > >
> > > >
> > > > Aniruddha
> > > >
> > >
> >
>
