asterixdb-dev mailing list archives

From: Eldon Carman <ecarm...@ucr.edu>
Subject: Re: Tasks remaining for release
Date: Wed, 08 Jul 2015 00:44:08 GMT
In my branch ("ecarm002/introspection_alternate"), I have adapted some code
I received from Ildar to repeatedly run a set of runtime tests. I am not
sure whether this testing process is related to your issue, but I found
this class very helpful in tracking down the error that was causing my
introspection problem. You could add the feeds tests to
repeatedtestsuite.xml and try running it; the process might help you
reproduce the error locally.

https://github.com/ecarm002/incubator-asterixdb/tree/ecarm002/introspection_alternate

edu.uci.ics.asterix.test.runtime.RepeatedTest
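
For illustration only, here is a minimal sketch of the idea behind that
class (the class name, iteration count, and the single-pass call below are
placeholders, not the actual code in the branch): a JUnit test that loops
over a pass of the runtime tests so an intermittent failure has more
chances to show up locally.

    import org.junit.Test;

    // Hypothetical sketch, not the RepeatedTest class from the branch:
    // run one pass of the runtime tests many times so an intermittent
    // failure (like the feeds OOM) has a chance to appear locally.
    public class RepeatedRuntimeTestSketch {

        private static final int REPETITIONS = 50; // tune as needed

        @Test
        public void repeatRuntimeTests() throws Exception {
            for (int i = 0; i < REPETITIONS; i++) {
                System.out.println("=== iteration " + i + " ===");
                // Invoke the single-pass test execution here, e.g. whatever
                // consumes repeatedtestsuite.xml in the branch.
            }
        }
    }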




On Mon, Jul 6, 2015 at 8:25 PM, Ian Maxon <imaxon@uci.edu> wrote:

> Raman and I spent a while today trying to get to the root of what is
> causing the build instability. The investigation is still ongoing, but
> so far we've discovered the following:
>
> - The OOM error is specifically about running out of threads to create
> on the machine, which is odd. We aren't creating more than 500 threads
> per JVM during testing, so this is especially puzzling. Neither the heap
> size nor the permgen size is the issue.
>
> - The OOM error can be observed at the point in the history where only
> feeds had been merged (and not YARN or the managix scripting fix).
>
> - Neither of us can reproduce this locally on our development machines.
> It seems that the environment (hitting the thread limit on the machine)
> is somehow a variable in this issue.
>
> - Whether, and where, the tests run out of threads is not deterministic.
> The failure tends to occur around the feeds portion of the execution
> tests, but that is only a loose pattern: all the tests can pass, or the
> OOM can be hit during the integration tests or other totally unrelated
> execution tests.
>
> - There are a few feeds tests that sometimes fail (namely issue_711 and
> feeds_10), but this is unrelated to the larger problem of running out of
> threads on the build machine.
>
> Given all the above, it looks like there is at least a degree of
> configuration/environmental influence on this issue.
>
> - Ian
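
(Inline note on the thread-exhaustion point above: one cheap way to narrow
down where the thread count explodes is to log JVM thread statistics while
the tests run. A minimal sketch, assuming it is started from a test setup
hook inside the JVM being watched; the class name and logging interval are
placeholders.)

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    // Hypothetical helper: periodically print live/peak/daemon thread
    // counts so the test log shows roughly where thread creation spikes.
    public final class ThreadCountLogger {

        public static void start(final long intervalMillis) {
            final ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            Thread logger = new Thread(new Runnable() {
                public void run() {
                    while (true) {
                        System.err.println("[threads] live=" + mx.getThreadCount()
                                + " peak=" + mx.getPeakThreadCount()
                                + " daemon=" + mx.getDaemonThreadCount());
                        try {
                            Thread.sleep(intervalMillis);
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            });
            logger.setDaemon(true);
            logger.start();
        }
    }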
>
>
>
> On Mon, Jul 6, 2015 at 2:14 PM, Raman Grover <ramangrover29@gmail.com>
> wrote:
> > Hi
> >
> > a) The two big commits to master (YARN integration and feeds) landed
> > as atomic units, which makes it easy to reset master to the version
> > prior to each feature and verify whether the build began showing the
> > OOM after each of the suspected commits. That gives us a pretty
> > deterministic way of nailing down the commit that introduced the
> > problem. Instead of disabling the feeds tests, I would suggest we
> > revert to the earlier commit, confirm whether the feeds commit
> > introduced the behavior, and then repeat the test with the YARN commit
> > that followed. We should be able to see a sudden increase/drop in
> > build stability by running a sufficient number of iterations.
> >
> > b) I have not been able to reproduce the OOM on my setup, where I have
> > been running the build repeatedly.
> > @Ian, are you able to reproduce it on your system? Maybe I am not
> > running the build a sufficient number of times?
> > I am also still unable to understand how removing the test cases still
> > causes the OOM. I can go back and look at the precise changes made in
> > the feeds commit that could introduce an OOM even if feeds are not
> > involved at all, but as I see it, those changes do not play a role if
> > feeds are not being ingested.
> >
> >
> > Regards,
> > Raman
> >
> >
> > On Thu, Jul 2, 2015 at 6:42 PM, Ian Maxon <imaxon@uci.edu> wrote:
> >
> >> Hi all,
> >>
> >> We are close to having a release ready, but there are a few things
> >> left on the checklist before we can cut the first Apache release. I
> >> think most things on this list are underway, but I'll put them here
> >> just for reference/visibility. Comments and thoughts are welcome.
> >>
> >> - Build stability after merging YARN and Feeds seems to have
> >> seriously declined. Honestly, it's hard to get a build to go through
> >> to the end without hitting an OOM now, so this is a problem. I think
> >> it may be related to Feeds, but I still see it even after disabling
> >> those tests (https://asterix-gerrit.ics.uci.edu/#/c/312/). So I am
> >> not precisely sure what is going on, only that it started after we
> >> merged those two features. It's not obvious to me where the memory
> >> leak is coming from. @Raman, it would be great to get your
> >> advice/thoughts on this.
> >>
> >> - Metadata name changes and Metadata caching consistency fixes are
> >> being worked on by Ildar.
> >>
> >> - The repackaging and license checker patches still need to be merged
> >> in, but this should happen after the above two features are merged.
> >> They are otherwise ready for review though.
> >>
> >> - Now that Feeds is merged, the Apache website should be switched to
> >> the new version that has been in draft form for a few weeks now.
> >> Before, that may have been a little premature, but now it should be
> >> accurate. The documentation site should also be reverted to its prior
> >> state, from before it was quickly patched to serve as an interim
> >> website.
> >>
> >>
> >> If there's anything else I'm missing that should be on this list,
> >> please feel free to add it to this thread.
> >>
> >> Thanks,
> >> -Ian
> >>
> >
> >
> >
> > --
> > Raman
>
