apex-dev mailing list archives

From Tushar Gosavi <tus...@datatorrent.com>
Subject Re: Possibility of saving checkpoints on other distributed filesystems
Date Thu, 21 Jan 2016 08:24:50 GMT
The default FSStorageAgent can be used, since it works with the local filesystem,
but as far as I know there is no support for specifying the directory
through the XML file. By default it uses the application directory on HDFS.

I am not sure whether we can specify a storage agent and its properties through
the configuration at the DAG level.

- Tushar.


On Thu, Jan 21, 2016 at 12:14 PM, Aniruddha Thombare <
aniruddha@datatorrent.com> wrote:

> Hi,
>
> Do we have any storage agent which I can use readily, configurable through
> dt-site.xml?
>
> I am looking for something which would save checkpoints in a mounted file
> system [e.g. HA-NAS], which is basically just another directory for Apex.
>
>
>
>
> Thanks,
>
>
> Aniruddha
>
> On Wed, Jan 20, 2016 at 8:33 PM, Sandesh Hegde <sandesh@datatorrent.com>
> wrote:
>
> > It is already supported; refer to the following JIRA for more information:
> >
> > https://issues.apache.org/jira/browse/APEXCORE-283
> >
> >
> >
> > On Tue, Jan 19, 2016 at 10:43 PM Aniruddha Thombare <
> > aniruddha@datatorrent.com> wrote:
> >
> > > Hi,
> > >
> > > Is it possible to save checkpoints in any other highly available
> > > distributed file systems (which may be mounted directories across the
> > > cluster) other than HDFS?
> > > If yes, is it configurable?
> > >
> > > AFAIK, there is no configurable option available to achieve that.
> > > If that's the case, can we have that feature?
> > >
> > > This is with the intention to recover the applications faster and do away
> > > with HDFS's small files problem as described here:
> > >
> > > http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
> > >
> > > http://snowplowanalytics.com/blog/2013/05/30/dealing-with-hadoops-small-files-problem/
> > > http://inquidia.com/news-and-info/working-small-files-hadoop-part-1
> > >
> > > If we could save checkpoints in some other distributed file system (or even
> > > a HA NAS box) geared for small files, we could achieve -
> > >
> > >    - Better performance of NN & HDFS for the production usage (read:
> > >    production data I/O & not temp files)
> > >    - Faster application recovery in case of planned shutdown / unplanned
> > >    restarts
> > >
> > > Please send your comments, suggestions, or ideas.
> > >
> > > Thanks,
> > >
> > >
> > > Aniruddha
> > >
> >
>
