apex-dev mailing list archives

From Thomas Weise <tho...@datatorrent.com>
Subject Re: Vagrant folders under Malhar/lib
Date Thu, 03 Sep 2015 20:43:44 GMT
Good point regarding the coverage. These JUnit tests are supposed to test
individual components, and all the tests collectively should strive to
achieve high coverage. There are tests in Apex to cover storage agents,
recovery semantics, etc. Components that fall outside of the test scope are
reduced as much as possible through mocks (even though there is room for
improvement).

The tests in Malhar are for operators and applications, not for the engine.
In those cases where local mode (LM) is used, the intention is to test the
application functionality. It is expected that certain configurations are
adjusted for the test and that dependencies are mocked.

For local mode, it should not be an issue to use a different storage
agent when it simplifies test execution. Specifically, in this case, we
don't want to go and change many tests to make something work that isn't
needed. LM is not "production": it does not use HDFS, and there are a number
of other important differences that make it possible to run within the IDE.

Instead, the focus should be on those things that help with app coverage. For
example, in the past we had seen issues with serialization of operators
that went undetected in LM until we made serialization part of the
execution, even though it was not needed for execution.
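
As a rough illustration of the serialization point, a test can round-trip an
operator through serialization before running it, so that fields which do not
survive are noticed early. This is only a sketch under stated assumptions:
CounterOperator and roundTrip are hypothetical names, and plain java.io
serialization is used as a stand-in rather than whatever mechanism the engine
applies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationRoundTrip {

    // Hypothetical stand-in for an operator; not an Apex class.
    static class CounterOperator implements Serializable {
        private static final long serialVersionUID = 1L;
        int count = 5;
        // Transient state is skipped by serialization and comes back as null.
        transient StringBuilder buffer = new StringBuilder("unsaved");
    }

    // Serialize the object to bytes and read it back, as a test harness
    // might do before handing the operator to the execution layer.
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        CounterOperator original = new CounterOperator();
        CounterOperator copy = roundTrip(original);
        System.out.println("count preserved: " + (copy.count == original.count));
        System.out.println("transient buffer restored: " + (copy.buffer != null));
    }
}
```

Running the copy instead of the original in a local test makes a missing or
transient field show up as a failure in the IDE rather than on the cluster.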

Thomas




On Thu, Sep 3, 2015 at 11:21 AM, Chetan Narsude <chetan@datatorrent.com>
wrote:

> I think Ram explained in a little more detail what I am thinking.
>
> Tests are supposed to provide code coverage. Having the local cluster is
> already a variable: it's not what runs in production. Having a different
> storage agent is another variable, and it misses out on testing the
> asynchronous flow. The gap keeps increasing if we continue to do that.
> AsyncFSSA is our default because it's supposed to do everything that
> FSStorageAgent does and some more. So it is not clear to me why the test
> that creates stray folders does not configure the storage agent properly
> instead of swapping it out completely, which brings in other problems, as
> I just explained.
>
> If changing the storage agent is the only way to fix the problem with
> reasonable effort, then I would concede. I highly doubt that.
>
> --
> Chetan
>
> On Thu, Sep 3, 2015 at 11:05 AM, Chandni Singh <chandni@datatorrent.com>
> wrote:
>
> > Local mode was so far using FSStorageAgent, which was used in
> > production.
> > In production, using Async is needed because HDFS writes are slow, but is
> > that the case with LocalMode?
> >
> > In local mode, if we use Async, we create checkpoints under one local
> > directory and then copy them to another local directory, which will not
> > improve performance.
> >
> > In my opinion, StramLocalCluster should use synchronous checkpointing by
> > default.
> >
> > Chandni
> >
> >
> >
> > On Thu, Sep 3, 2015 at 10:09 AM, Chetan Narsude <chetan@datatorrent.com>
> > wrote:
> >
> > >> That sounds a lot like a self-contradicting reason: let's make a change
> > >> because we don't want to make a change. :-)
> > >>
> > >> The code is in a certain state. This state is consistent with how
> > >> things run in production. In the test environment there is a problem that
> > >> stray files are created. It's a small fix to relocate these files
> > >> elsewhere. What I am trying to understand is why that is not being done.
> >>
> >> --
> >> Chetan
> >>
> >> On Thu, Sep 3, 2015 at 9:41 AM, Thomas Weise <thomas@datatorrent.com>
> >> wrote:
> >>
> > >>> There is no need to configure anything extra with the proposed change;
> > >>> it just brings back LM to how it worked before.
> >>>
> >>> There is no point modifying n tests for extra setup with no gain.
> >>>
> >>> Thomas
> >>>
> > >>> On Thu, Sep 3, 2015 at 9:14 AM, Chetan Narsude <chetan@datatorrent.com>
> > >>> wrote:
> >>>
> >>> > Why does it matter that AsyncFSStorageAgent is being used with
> >>> > LocalCluster? That it uses the local FS, and hence brings no gain, is an
> >>> > implementation detail that FileSystem already abstracts away.
> >>> >
> >>> > If random artifacts are left behind after the test, there is a reason,
> >>> > and most likely it is a misconfiguration of the StorageAgent. Why
> >>> > wouldn't that be fixed?
> >>> >
> >>> > --
> >>> > Chetan
> >>> >
> >>> >
> >>> > On Thu, Sep 3, 2015 at 8:59 AM, Amol Kekre <amol@datatorrent.com>
> >>> > wrote:
> >>> >
> >>> > > Cleaning up leftover container files should be a distributed OS task.
> >>> > > Cleanup, backup, archive, ... all of that is for the OS (aka YARN). We
> >>> > > must assume kill -9.
> >>> > >
> >>> > > The only place where the operator comes into play is "teardown()",
> >>> > > which is a business logic (not Apex engine) concern. This could be a DB
> >>> > > connection, etc.
> >>> > > Thks,
> >>> > > Amol
> >>> > >
> >>> > > On Thu, Sep 3, 2015 at 8:52 AM, Thomas Weise <thomas@datatorrent.com>
> >>> > > wrote:
> >>> > >
> >>> > > > When the container gets killed, we should not assume anything about
> >>> > > > cleanup. It can be a kill -9. Any related "cleanup" falls under nice
> >>> > > > to have, no guarantees.
> >>> > > >
> >>> > > > On Thu, Sep 3, 2015 at 8:49 AM, Chandni Singh <chandni@datatorrent.com>
> >>> > > > wrote:
> >>> > > >
> >>> > > > > I have a question regarding what Gaurav mentioned
> >>> > > > > ----
> >>> > > > > When a container runs in the cluster, "." specifies the container's
> >>> > > > > local path on the node, where container-specific jars and other
> >>> > > > > resources reside. It creates a folder under that path which lives as
> >>> > > > > long as the container lives. So there are no vagrant folders anywhere
> >>> > > > > ---
> >>> > > > >
> >>> > > > > When the container gets killed, do we clean up the folders created by
> >>> > > > > Async under the container's working dir?
> >>> > > > >
> >>> > > > > On Thu, Sep 3, 2015 at 8:42 AM, Thomas Weise <thomas@datatorrent.com>
> >>> > > > > wrote:
> >>> > > > >
> >>> > > > >> It makes sense to use the synchronous checkpointing for the local mode.
> >>> > > > >> LM is meant to simplify dependencies and setup. The default for
> >>> > > > >> execution on YARN remains async.
> >>> > > > >>
> >>> > > > >> Thomas
> >>> > > > >>
> >>> > > > >>
> >>> > > > >> On Thu, Sep 3, 2015 at 8:34 AM, Chandni Singh <chandni@datatorrent.com>
> >>> > > > >> wrote:
> >>> > > > >>
> >>> > > > >>> APPLICATION_PATH isn't related to the local base dir of Async as far
> >>> > > > >>> as I know. StramLocalCluster sets the APP_PATH to "target/...".
> >>> > > > >>> StramLocalCluster should use FSStorageAgent.
> >>> > > > >>>
> >>> > > > >>> - Chandni
> >>> > > > >>>
> >>> > > > >>> On Thu, Sep 3, 2015 at 8:20 AM, Gaurav Gupta <gaurav@datatorrent.com>
> >>> > > > >>> wrote:
> >>> > > > >>>
> >>> > > > >>>> As Thomas mentioned, the default remains async. You can either
> >>> > > > >>>> change the storage agent or set the APPLICATION_PATH.
> >>> > > > >>>>
> >>> > > > >>>> When a container runs in the cluster, "." specifies the container's
> >>> > > > >>>> local path on the node, where container-specific jars and other
> >>> > > > >>>> resources reside. It creates a folder under that path which lives as
> >>> > > > >>>> long as the container lives. So there are no vagrant folders anywhere
> >>> > > > >>>>
> >>> > > > >>>> Thanks
> >>> > > > >>>> -Gaurav
> >>> > > > >>>>
> >>> > > > >>>> On Wed, Sep 2, 2015 at 11:33 PM, Chandni Singh <chandni@datatorrent.com>
> >>> > > > >>>> wrote:
> >>> > > > >>>>
> >>> > > > >>>>> I think there is a problem in the default Async as well. It also uses
> >>> > > > >>>>> the working directory as its local base path.
> >>> > > > >>>>>
> >>> > > > >>>>> In the Async -> copyToHdfs() method, we delete the window files, but
> >>> > > > >>>>> the folder with the operator name never gets deleted.
> >>> > > > >>>>> So on the cluster, will there be such vagrant folders in the working
> >>> > > > >>>>> directory?
> >>> > > > >>>>>
> >>> > > > >>>>> On Wed, Sep 2, 2015 at 11:17 PM, Thomas Weise <thomas@datatorrent.com>
> >>> > > > >>>>> wrote:
> >>> > > > >>>>>
> >>> > > > >>>>>> Chandni,
> >>> > > > >>>>>>
> >>> > > > >>>>>> Agreed. See whether the tests work with the synchronous storage
> >>> > > > >>>>>> agent. If yes, change them. The default needs to remain async.
> >>> > > > >>>>>>
> >>> > > > >>>>>> Thomas
> >>> > > > >>>>>>
> >>> > > > >>>>>>
> >>> > > > >>>>>> On Wed, Sep 2, 2015 at 11:05 PM, Chandni Singh <chandni@datatorrent.com>
> >>> > > > >>>>>> wrote:
> >>> > > > >>>>>>
> >>> > > > >>>>>>> Hi,
> >>> > > > >>>>>>>
> >>> > > > >>>>>>> I would like to know what was the reason to use AsyncFSStorageAgent
> >>> > > > >>>>>>> with StramLocalCluster?
> >>> > > > >>>>>>> StramLocalCluster is mainly for testing in a non-distributed mode
> >>> > > > >>>>>>> and I am unclear how AsyncFSStorageAgent is helpful in this mode.
> >>> > > > >>>>>>>
> >>> > > > >>>>>>> Thanks,
> >>> > > > >>>>>>> Chandni
> >>> > > > >>>>>>>
> >>> > > > >>>>>>> On Wed, Sep 2, 2015 at 10:45 PM, Chandni Singh <chandni@datatorrent.com>
> >>> > > > >>>>>>> wrote:
> >>> > > > >>>>>>>
> >>> > > > >>>>>>>> This is because of recent changes to StramLocalCluster where
> >>> > > > >>>>>>>> AsyncFSStorageAgent is used for checkpointing
> >>> > > > >>>>>>>>
> >>> > > > >>>>>>>> dag.setAttribute(OperatorContext.STORAGE_AGENT,
> >>> > > > >>>>>>>>     new AsyncFSStorageAgent(new Path(pathUri, LogicalPlan.SUBDIR_CHECKPOINTS).toString(), null));
> >>> > > > >>>>>>>>
> >>> > > > >>>>>>>> The AsyncFSStorageAgent(String path, Configuration conf) uses "."
> >>> > > > >>>>>>>> as localBasePath and therefore creates sub-directories per operator
> >>> > > > >>>>>>>> in the current working directory.
> >>> > > > >>>>>>>>
> >>> > > > >>>>>>>> I am going to create a ticket to address this and will fix it.
> >>> > > > >>>>>>>>
> >>> > > > >>>>>>>> -Chandni
> >>> > > > >>>>>>>>
> >>> > > > >>>>>>>>
> >>> > > > >>>>>>>> On Wed, Sep 2, 2015 at 7:13 PM, Chandni Singh <chandni@datatorrent.com>
> >>> > > > >>>>>>>> wrote:
> >>> > > > >>>>>>>>
> >>> > > > >>>>>>>>> Hi,
> >>> > > > >>>>>>>>>
> >>> > > > >>>>>>>>> I can see empty folders getting created under Malhar/lib called
> >>> > > > >>>>>>>>> '1' and '2'.
> >>> > > > >>>>>>>>> I think this is because of using LocalMode to run a test
> >>> > > > >>>>>>>>> application.
> >>> > > > >>>>>>>>>
> >>> > > > >>>>>>>>>
> >>> > > > >>>>>>>>> If anyone has checked in such cases please do check and let us
> >>> > > > >>>>>>>>> know.
> >>> > > > >>>>>>>>>
> >>> > > > >>>>>>>>> Thanks,
> >>> > > > >>>>>>>>> Chandni
> >>> > > > >>>>>>>>>
>
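
The stray '1' and '2' folders described in the oldest messages of this thread
can be reproduced in miniature. The following is a minimal sketch, not the Apex
implementation: checkpointAsync and checkpointSync are hypothetical helpers
that only mimic the behavior as described above (write under a local base
path, copy to the checkpoint directory, delete the window file but not the
per-operator folder). The temporary directories stand in for "." and for the
application path.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AsyncCheckpointSketch {

    // Mimics the asynchronous flow from the thread: write locally first,
    // then copy to the target and delete only the local window file.
    static void checkpointAsync(Path localBase, Path target, int operatorId,
                                long windowId, byte[] state) throws IOException {
        Path localDir = Files.createDirectories(
                localBase.resolve(Integer.toString(operatorId)));
        Path localFile = localDir.resolve(Long.toHexString(windowId));
        Files.write(localFile, state);

        Path targetDir = Files.createDirectories(
                target.resolve(Integer.toString(operatorId)));
        Files.copy(localFile, targetDir.resolve(localFile.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);

        // The window file is removed, but the operator folder stays behind.
        Files.delete(localFile);
    }

    // Mimics the synchronous flow: write straight to the target directory.
    static void checkpointSync(Path target, int operatorId,
                               long windowId, byte[] state) throws IOException {
        Path targetDir = Files.createDirectories(
                target.resolve(Integer.toString(operatorId)));
        Files.write(targetDir.resolve(Long.toHexString(windowId)), state);
    }

    public static void main(String[] args) throws IOException {
        Path workDir = Files.createTempDirectory("workdir");        // stands in for "."
        Path checkpoints = Files.createTempDirectory("checkpoints"); // stands in for the app path

        checkpointAsync(workDir, checkpoints, 1, 10L, new byte[] {1});
        checkpointAsync(workDir, checkpoints, 2, 10L, new byte[] {2});

        // Lists the empty per-operator folders left in the working directory.
        try (DirectoryStream<Path> leftovers = Files.newDirectoryStream(workDir)) {
            for (Path p : leftovers) {
                System.out.println("left behind: " + p.getFileName());
            }
        }
    }
}
```

With checkpointSync nothing is ever written under the working directory, which
is the practical difference the thread is weighing for local mode.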
