apex-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chandni Singh <chan...@datatorrent.com>
Subject Re: Vagrant folders under Malhar/lib
Date Thu, 03 Sep 2015 18:05:02 GMT
The local mode was so far using FSStorageAgent which was used in
production.
In production using Async is needed because hdfs writes are slow but is
that the case with LocalMode?

In local mode if we use Async we are creating checkpoints under one local
directory and then copying it to another local directory which will not
improve any performance.

In my opinion StramLocalCluster use synchronous checkpointing as default.

Chandni



On Thu, Sep 3, 2015 at 10:09 AM, Chetan Narsude <chetan@datatorrent.com>
wrote:

> That sounds a lot like self contradicting reason; Let's make a change
> because we don't want to make change. :-)
>
> The code is in certain state. This certain state is consistent with how
> things run in production. In test environment there is a problem that stray
> files are created. It's a small fix to relocate these files elsewhere. What
> I am trying to understand is that is not being done?
>
> --
> Chetan
>
> On Thu, Sep 3, 2015 at 9:41 AM, Thomas Weise <thomas@datatorrent.com>
> wrote:
>
>> There is no need to configure anything extra with the proposed change, it
>> just brings back LM to how it worked before.
>>
>> There is no point modifying n tests for extra setup with no gain.
>>
>> Thomas
>>
>> On Thu, Sep 3, 2015 at 9:14 AM, Chetan Narsude <chetan@datatorrent.com>
>> wrote:
>>
>> > Why does it matter that AsyncFSStorageAgent is being used with
>> > LocalCluster? It using the localfs and hence no gain is the
>> implementation
>> > detail that's abstracted out by FileSystem already.
>> >
>> > If there is a problem of random artifacts left behind after the test,
>> there
>> > is a reason and most likely it's misconfiguration of the StorageAgent.
>> Why
>> > wouldn't that be fixed.
>> >
>> > --
>> > Chetan
>> >
>> >
>> > On Thu, Sep 3, 2015 at 8:59 AM, Amol Kekre <amol@datatorrent.com>
>> wrote:
>> >
>> > > Clean up container files left over should be a distributed OS task.
>> Clean
>> > > up, back up, archive, ... all is for the OS (aka YARN). We must assume
>> > kill
>> > > -9.
>> > >
>> > > The only thing where the operator comes into play is "teardown()",
>> which
>> > is
>> > > business logic (not Apex engine) issue. This could be db connection
>> etc.
>> > >
>> > > Thks,
>> > > Amol
>> > >
>> > > On Thu, Sep 3, 2015 at 8:52 AM, Thomas Weise <thomas@datatorrent.com>
>> > > wrote:
>> > >
>> > > > When the container gets killed, we should not assume anything about
>> > > > cleanup. It can be a kill -9. Any related "cleanup" falls under
>> nice to
>> > > > have, no guarantees.
>> > > >
>> > > > On Thu, Sep 3, 2015 at 8:49 AM, Chandni Singh <
>> chandni@datatorrent.com
>> > >
>> > > > wrote:
>> > > >
>> > > > > I have a question regarding what Gaurav mentioned
>> > > > > ----
>> > > > > When container runs in cluster, "." specifies the containers
local
>> > path
>> > > > on
>> > > > > the node where container specific jars and other resources
>> resides.
>> > It
>> > > > > creates a folder under that which is live as long as container
>> lives.
>> > > So
>> > > > > there are no vagrant folders anywhere
>> > > > > ---
>> > > > >
>> > > > > When the container gets killed, do we cleanup the folders created
>> by
>> > > > Async
>> > > > > under the containers working dir?
>> > > > >
>> > > > > On Thu, Sep 3, 2015 at 8:42 AM, Thomas Weise <
>> thomas@datatorrent.com
>> > >
>> > > > > wrote:
>> > > > >
>> > > > >> It makes sense to use the synchronous checkpointing for the
local
>> > > mode.
>> > > > >> LM is meant to simplify dependencies and setup. The default
for
>> > > > execution
>> > > > >> on YARN remains async.
>> > > > >>
>> > > > >> Thomas
>> > > > >>
>> > > > >>
>> > > > >> On Thu, Sep 3, 2015 at 8:34 AM, Chandni Singh <
>> > > chandni@datatorrent.com>
>> > > > >> wrote:
>> > > > >>
>> > > > >>> APPLICATION_PATH isn't related to local base dir of Async
as far
>> > as I
>> > > > >>> know. StramLocalCluster sets the APP_PATH to "target/...".
>> > > > >>> StramLocalCluster should use FSStorageAgent.
>> > > > >>>
>> > > > >>> - Chandni
>> > > > >>>
>> > > > >>> On Thu, Sep 3, 2015 at 8:20 AM, Gaurav Gupta <
>> > gaurav@datatorrent.com
>> > > >
>> > > > >>> wrote:
>> > > > >>>
>> > > > >>>> As Thomas mentioned as default remains to be async.
You can
>> either
>> > > > >>>> change the storage agent or set the APPLICATION_PATH.
>> > > > >>>>
>> > > > >>>> When container runs in cluster, "." specifies the
containers
>> local
>> > > > path
>> > > > >>>> on the node where container specific jars and other
resources
>> > > > resides. It
>> > > > >>>> creates a folder under that which is live as long
as container
>> > > lives.
>> > > > So
>> > > > >>>> there are no vagrant folders anywhere
>> > > > >>>>
>> > > > >>>> Thanks
>> > > > >>>> -Gaurav
>> > > > >>>>
>> > > > >>>> On Wed, Sep 2, 2015 at 11:33 PM, Chandni Singh <
>> > > > chandni@datatorrent.com
>> > > > >>>> > wrote:
>> > > > >>>>
>> > > > >>>>> I think there is a problem in the default Async
as well. It
>> also
>> > > uses
>> > > > >>>>> the working directory as its local base path.
>> > > > >>>>>
>> > > > >>>>> In the Async -> copyToHdfs()  method, we delete
the window
>> files
>> > > but
>> > > > >>>>> the folder with the operator name never gets
deleted.
>> > > > >>>>> So on the cluster there  will be such vagrant
folders in the
>> > > working
>> > > > >>>>> directory?
>> > > > >>>>>
>> > > > >>>>> On Wed, Sep 2, 2015 at 11:17 PM, Thomas Weise
<
>> > > > thomas@datatorrent.com>
>> > > > >>>>> wrote:
>> > > > >>>>>
>> > > > >>>>>> Chandni,
>> > > > >>>>>>
>> > > > >>>>>> Agreed. See whether the tests work with the
synchronous
>> storage
>> > > > >>>>>> agent. If yes, change them. The default needs
to remain
>> async.
>> > > > >>>>>>
>> > > > >>>>>> Thomas
>> > > > >>>>>>
>> > > > >>>>>>
>> > > > >>>>>> On Wed, Sep 2, 2015 at 11:05 PM, Chandni
Singh <
>> > > > >>>>>> chandni@datatorrent.com> wrote:
>> > > > >>>>>>
>> > > > >>>>>>> Hi,
>> > > > >>>>>>>
>> > > > >>>>>>> I would like to know what was the reason
to use
>> > > AsyncFSStorageAgent
>> > > > >>>>>>> with StramLocalCluster?
>> > > > >>>>>>> StramLocalCluster is mainly for testing
in a non-distributed
>> > mode
>> > > > >>>>>>> and I am unclear how AsyncFSStorageAgent
is helpful in this
>> > mode.
>> > > > >>>>>>>
>> > > > >>>>>>> Thanks,
>> > > > >>>>>>> Chandni
>> > > > >>>>>>>
>> > > > >>>>>>> On Wed, Sep 2, 2015 at 10:45 PM, Chandni
Singh <
>> > > > >>>>>>> chandni@datatorrent.com> wrote:
>> > > > >>>>>>>
>> > > > >>>>>>>> This is because of recent changes
to StramLocalCluster
>> where
>> > > > >>>>>>>> AsyncFSStorageAgent is used for checkpointing
>> > > > >>>>>>>>
>> > > > >>>>>>>> dag.setAttribute(OperatorContext.STORAGE_AGENT,
new
>> > > > AsyncFSStorageAgent(new Path(pathUri,
>> > > > LogicalPlan.SUBDIR_CHECKPOINTS).toString(), null));
>> > > > >>>>>>>>
>> > > > >>>>>>>> The AsyncFSStorageAgent(String path,
Configuration conf)
>> uses
>> > > "."
>> > > > as localBasePath and therefore creates sub-directories per operator
>> in
>> > > the
>> > > > current working directory.
>> > > > >>>>>>>>
>> > > > >>>>>>>> I am going to create a ticket to
address this and will fix
>> it.
>> > > > >>>>>>>>
>> > > > >>>>>>>> -Chandni
>> > > > >>>>>>>>
>> > > > >>>>>>>>
>> > > > >>>>>>>> On Wed, Sep 2, 2015 at 7:13 PM, Chandni
Singh <
>> > > > >>>>>>>> chandni@datatorrent.com> wrote:
>> > > > >>>>>>>>
>> > > > >>>>>>>>> Hi,
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> I can see empty folders getting
created under Malhar/lib
>> > called
>> > > > >>>>>>>>> '1' and '2'.
>> > > > >>>>>>>>> I think this is because of using
LocalMode to run a test
>> > > > >>>>>>>>> application.
>> > > > >>>>>>>>>
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> If anyone has checked in such
cases please do check and
>> let
>> > us
>> > > > >>>>>>>>> know.
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> Thanks,
>> > > > >>>>>>>>> Chandni
>> > > > >>>>>>>>>
>> > > > >>>>>>>>
>> > > > >>>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>
>> > > > >>>>>
>> > > > >>>>
>> > > > >>>
>> > > > >>
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message