asterixdb-dev mailing list archives

From Ian Maxon <>
Subject Re: Tasks remaining for release
Date Tue, 07 Jul 2015 03:25:38 GMT
Raman and I worked on getting to the root of what is causing the build
instability for a while today. The investigation is still ongoing but
so far we've discovered the following things:

- The OOM error is specifically the machine running out of threads to
create, which is odd. We aren't creating more than 500 threads per JVM
during testing, so this is especially puzzling. Neither the heap size
nor the permgen size is the issue.

- The OOM error can be observed at the point where only feeds had been
merged (and not YARN or the managix scripting fix).

- Neither of us can reproduce this locally on our development
machines. It seems that the environment is a variable in this issue
(hitting the thread limit on the machine), somehow.

- Whether and where the tests run out of threads is not deterministic.
The failure tends to occur around the feeds portion of the execution
tests, but this is only a loose pattern. The tests can all pass, or the
OOM can be hit during integration tests or other, totally unrelated
execution tests.

- There are a few feeds tests which sometimes fail (namely issue_711
and feeds_10), but these failures are unrelated to the larger issue
of running out of threads on the build machine.

Given all the above, it looks like there is at least a degree of
configuration/environmental influence on this issue.
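
One way to check the "no more than 500 threads per JVM" figure on the build
machine itself is to log the live and peak thread counts from inside the
test JVM. The following is only a minimal sketch using the standard
java.lang.management API; the class name ThreadCountProbe, its methods, and
the idea of calling it from the test harness are illustrative assumptions,
not something that exists in the code base. Note that it reports one JVM
only, while a machine-level thread limit would presumably be shared by
every process running under the build user.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper (not part of AsterixDB): reports thread usage of the current JVM.
public class ThreadCountProbe {

    /** Number of live threads (daemon and non-daemon) in this JVM right now. */
    public static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Highest live-thread count seen since JVM start (or since the peak was reset). */
    public static int peakThreads() {
        return ManagementFactory.getThreadMXBean().getPeakThreadCount();
    }

    /** One-line summary suitable for a test log; the label is free-form. */
    public static String report(String label) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return String.format("%s: live=%d peak=%d totalStarted=%d",
                label, mx.getThreadCount(), mx.getPeakThreadCount(),
                mx.getTotalStartedThreadCount());
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(report("before"));

        // Park a handful of threads so the change in the counters is visible.
        CountDownLatch release = new CountDownLatch(1);
        Thread[] workers = new Thread[20];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        System.out.println(report("with parked workers"));

        // Let the workers finish and wait for them to exit.
        release.countDown();
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(report("after"));
    }
}
```

Printing report(...) before and after each test group would show whether any
single JVM comes close to the expected ceiling, or whether the limit is being
reached across several processes on the machine.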

- Ian

On Mon, Jul 6, 2015 at 2:14 PM, Raman Grover <> wrote:
> Hi
> a) The two big commits to the master (YARN integration and feeds) happened
> as atomic units, which makes it easier to
> reset the master to the version prior to each feature and verify whether the
> build began showing OOM after each of the suspected commits. We have a
> pretty deterministic way of nailing down the commit that introduced the
> problem. Instead of disabling the feeds tests, I would suggest that we
> revert to the earlier commit, confirm whether the feeds commit introduced
> the behavior, and repeat the test with the YARN commit that followed. We
> should be able to see a sudden increase/drop in build stability by running
> a sufficient number of iterations.
> b) I have not been able to reproduce the OOM at my setup where I have been
> running the build repeatedly.
> @Ian are you able to reproduce it on your system? Maybe I am not running
> the build a sufficient number of times?
> I am still not able to understand how the OOM still occurs after removal
> of the test cases. I can go back and look at the precise changes made during the
> feeds commit that could introduce OOM even if feeds are not involved at
> all, but as I see it, the changes made do not play a role if feeds are not
> being ingested.
> Regards,
> Raman
> On Thu, Jul 2, 2015 at 6:42 PM, Ian Maxon <> wrote:
>> Hi all,
>> We are close to having a release ready, but there are a few things left
>> on the checklist before we can cut the first Apache release. I think
>> most things on this list are underway, but I'll put them here just for
>> reference/visibility. Comments and thoughts are welcomed.
>> - Build stability after merging YARN and Feeds seems to have seriously
>> declined. It's hard to get a build to go through to the end without
>> going OOM at all now honestly, so this is a Problem. I think it may be
>> related to Feeds, but even after disabling the tests
>> (, I still see it.
>> Therefore I am not precisely sure what is going on, but it only
>> started to happen after we merged those two features. It's not exactly
>> obvious to me where the memory leak is coming from. @Raman, it would
>> be great to get your advice/thoughts on this.
>> - Metadata name changes and Metadata caching consistency fixes are
>> underway by Ildar.
>> - The repackaging and license checker patches still need to be merged
>> in, but this should happen after the above two features are merged.
>> They are otherwise ready for review though.
>> - Now that Feeds is merged, the Apache website should be changed to
>> the new version that has been in draft form for a few weeks now.
>> Before it may have been a little premature, but now it should be
>> accurate. The documentation site should also be reverted to its prior
>> state, before it was quickly patched to serve as an interim website.
>> If there's anything else I am missing that should be in this list,
>> please feel free to add it into this thread.
>> Thanks,
>> -Ian
> --
> Raman
