reef-dev mailing list archives

From Sergiy Matusevych <sergiy.matusev...@gmail.com>
Subject Re: 0.16 release plan
Date Thu, 11 May 2017 03:56:17 GMT
Hi Taegeon,

I am afraid I won't be able to look at either of these issues this week. I
am very busy working on my slides on Distributed Factorization Machines,
and I have a presentation on Friday.

We can release 0.16-preview (or beta, or rc1, or whatever you call it) - I
think we are good in terms of features, and there is a finite number of
bugs that should not stop early adopters from using 0.16-preview.

What do you guys think?

Cheers,
Sergiy.




On Wed, May 10, 2017 at 7:23 PM, Tae-Geon Um <taegeonum@gmail.com> wrote:

> Hi,
>
> Thanks Sergiy for taking a look at them!
> As far as I know, ApacheCon opens next week (May 16-18), so I think we
> need to resolve the issues by the end of this week.
>
> I think I can help you in investigating REEF-1770. However, I’m not sure I
> can fix it by the end of this week.
> REEF-1770 is a transient failure, but there has been no transient failure in
> the past 30 days in our Java-side Travis CI (0 transient failures out of 69
> builds).
> So it may be hard to reproduce.
>
> Sergiy, do you think you can resolve REEF-1796 by the end of this week?
> If not, we have two options.
>
> 1) release 0.16 during the week of ApacheCon without fixing them
> (Actually, the .NET side CI is unstable, but just release 0.16)
> 2) do not release 0.16 until the bugs (in addition to the .NET side CI
> failures) are resolved
>
> What do you guys think?
>
> Thanks,
> Taegeon
>
> > On May 11, 2017, at 6:55 AM, Sergiy Matusevych <sergiy.matusevych@gmail.com> wrote:
> >
> > Hi guys,
> >
> > It surely would be great to announce 0.16 at the conference, and we have
> > some awesome features to brag about - most notably, REEF-on-Spark. Still,
> > there are a few bugs that we need to fix before the release. I am mostly
> > concerned with https://issues.apache.org/jira/browse/REEF-1770 and
> > especially https://issues.apache.org/jira/browse/REEF-1796. I am looking
> > at them now, but any help would be greatly appreciated!
> >
> > Cheers,
> > Sergiy.
> >
> > On Tue, May 9, 2017 at 11:55 PM, Byung-Gon Chun <bgchun@gmail.com> wrote:
> >
> >> Any update?
> >> It'd be great if we can release 0.16 during the week of ApacheCon.
> >>
> >> -Gon
> >>
> >> On Tue, Apr 11, 2017 at 10:27 AM, Tae-Geon Um <taegeonum@gmail.com> wrote:
> >>
> >>> Unfortunately, we’ve also got a recent build failure on the Java side [1],
> >>> which was not reported previously.
> >>> I’ve created an issue [2] to track this failure, and am going to
> >>> investigate it.
> >>>
> >>> Thanks,
> >>> Taegeon
> >>>
> >>> [1]: https://travis-ci.org/apache/reef/builds/220731026
> >>> [2]: https://issues.apache.org/jira/browse/REEF-1770
> >>>> On Apr 6, 2017, at 3:13 AM, Mariia Mykhailova <mamykhai@microsoft.com.INVALID> wrote:
> >>>>
> >>>> At least 3 of the issues previously reported under REEF-1462 have
> >>>> re-occurred in the past two days (I've reopened the corresponding JIRAs and
> >>>> attached links to the failures). Unfortunately, with transient failures
> >>>> like these, one good build is insufficient.
> >>>>
> >>>> It is a known issue: since we're using free access to AppVeyor, our
> >>>> builds are sequential and low-priority, so sometimes, when a lot of pull
> >>>> requests have to be built, the build queue takes a while to drain.
> >>>>
> >>>> -Mariia
> >>>>
> >>>> -----Original Message-----
> >>>> From: Byung-Gon Chun [mailto:bgchun@gmail.com]
> >>>> Sent: Wednesday, April 5, 2017 12:11 AM
> >>>> To: dev@reef.apache.org
> >>>> Subject: Re: 0.16 release plan
> >>>>
> >>>> Awesome!
> >>>>
> >>>> There is no build failure in the .Net side with the latest build [1].
> >>>>
> >>>> It looks like Appveyor's quite slow. Regarding PR1284 [2], Travis CI's
> >>>> already done. We're still waiting for Appveyor. :(
> >>>>
> >>>> [1] https://ci.appveyor.com/project/ApacheSoftwareFoundation/reef/build/1455-master
> >>>> [2] https://github.com/apache/reef/pull/1284
> >>>>
> >>>>
> >>>> On Tue, Apr 4, 2017 at 10:29 AM, Tae-Geon Um <taegeonum@gmail.com> wrote:
> >>>>
> >>>>> Thanks Julia for the work!
> >>>>>
> >>>>> It looks like the Java and .NET builds are almost stable, except for the
> >>>>> recent build failure on the .NET side [1].
> >>>>> As Julia said in REEF-1406 [2], we need to wait some time to see whether
> >>>>> this failure is reproduced.
> >>>>>
> >>>>> I will wait for a week and call a release vote if there are no build
> >>>>> failures during that time.
> >>>>> Thanks!
> >>>>>
> >>>>> Taegeon
> >>>>>
> >>>>> [1]: https://ci.appveyor.com/project/ApacheSoftwareFoundation/reef/build/1453-master
> >>>>> [2]: https://issues.apache.org/jira/browse/REEF-1406
> >>>>>
> >>>>>> On Mar 30, 2017, at 10:28 AM, Julia Wang (QIUHE) <
> >>>>> Qiuhe.Wang@microsoft.com.INVALID> wrote:
> >>>>>>
> >>>>>> I have resolved all the .Net test issues for now. The fixes cover
> >>>>>> what I have identified so far based on the failures.
> >>>>>>
> >>>>>> I agree with Mariia: as they are transient failures, and they sometimes
> >>>>>> failed for multiple reasons, we need to continue to observe whether the
> >>>>>> issues come back again.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Julia
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Tae-Geon Um [mailto:taegeonum@gmail.com]
> >>>>>> Sent: Thursday, March 23, 2017 5:20 PM
> >>>>>> To: dev@reef.apache.org
> >>>>>> Subject: Re: 0.16 release plan
> >>>>>>
> >>>>>> Thanks Mariia for pointing it out to me.
> >>>>>> Yes. I agree that we need more time to fix all of the transient failures.
> >>>>>> After they are resolved, I will wait for some time to ensure that
> >>>>>> they do not reoccur.
> >>>>>>
> >>>>>> Thanks!
> >>>>>> Taegeon
> >>>>>>
> >>>>>>> On Mar 24, 2017, at 2:54 AM, Mariia Mykhailova <mamykhai@microsoft.com.INVALID> wrote:
> >>>>>>>
> >>>>>>> Please note that due to the transient nature of the .NET failures, it
> >>>>>>> makes sense to wait for some time and observe whether they are actually
> >>>>>>> fixed or just lying low until the next reoccurrence. We had to reopen
> >>>>>>> some bugs which looked resolved in the past but then reoccurred.
> >>>>>>>
> >>>>>>>
> >>>>>>> -Mariia
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> ________________________________
> >>>>>>> From: Tae-Geon Um <taegeonum@gmail.com>
> >>>>>>> Sent: Thursday, March 23, 2017 6:51:24 AM
> >>>>>>> To: dev@reef.apache.org
> >>>>>>> Subject: Re: 0.16 release plan
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Julia has been doing great work to resolve the .NET side issues.
> >>>>>>> It looks like she has resolved 3 issues recently (and now 3 issues
> >>>>>>> remain on the .NET side with 1 pending PR).
> >>>>>>>
> >>>>>>> Sergiy and I have also worked on the Java side issues, and we've
> >>>>>>> resolved 1 issue (and 1 issue still remains with 1 pending PR).
> >>>>>>>
> >>>>>>> Because of the unresolved issues (3 on the .NET side and 1 on the Java
> >>>>>>> side), I think it would be good to delay the release vote.
> >>>>>>> However, judging from the progress we have made, I think all of the
> >>>>>>> issues could be resolved by the end of this week or the beginning of
> >>>>>>> next week.
> >>>>>>>
> >>>>>>> I will call a vote as soon as possible after they are resolved.
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>> Taegeon
> >>>>>>>
> >>>>>>>> On Mar 23, 2017, at 5:47 PM, Byung-Gon Chun <bgchun@gmail.com>
> >>> wrote:
> >>>>>>>>
> >>>>>>>> Thank you for all the efforts to make release 0.16 happen!
> >>>>>>>>
> >>>>>>>> Taegeon, could you give us a status update? Thanks.
> >>>>>>>>
> >>>>>>>> On Sat, Mar 18, 2017 at 10:01 AM, Byung-Gon Chun <bgchun@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for looking at .Net CI failures, Julia!
> >>>>>>>>>
> >>>>>>>>> Thanks for handling Java CI failures, Sergiy and Taegeon!
> >>>>>>>>>
> >>>>>>>>> On Fri, Mar 17, 2017 at 11:20 AM, Julia Wang (QIUHE) <Qiuhe.Wang@microsoft.com.invalid> wrote:
> >>>>>>>>>
> >>>>>>>>>> I am working on some of the .Net AppVeyor test failures now.
> >>>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Sergiy Matusevych [mailto:sergiy.matusevych@gmail.com]
> >>>>>>>>>> Sent: Wednesday, March 15, 2017 5:12 PM
> >>>>>>>>>> To: dev@reef.apache.org
> >>>>>>>>>> Subject: Re: 0.16 release plan
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Mar 15, 2017 at 4:21 PM, Tae-Geon Um <taegeonum@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Oh, never mind.
> >>>>>>>>>>> I thought that we still need some time to make sure that
> >>>>>>>>>>> Unmanaged AM works properly on Hadoop 2.7.3 :)
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Great, then we have only two items left! :-) I can confirm that
> >>>>>>>>>> Unmanaged AM works on the proper version of YARN; still, we need to
> >>>>>>>>>> address the Java issues (that is, item #1) as they are related
> >>>>>>>>>> to the Unmanaged AM mode. For example, we must make sure to close
> >>>>>>>>>> all threads before exiting the REEF Driver - otherwise, the
> >>>>>>>>>> HelloREEFYarnUnmanagedAM example can hang, as it does not
> >>>>>>>>>> currently have a System.exit() call at the end.
> >>>>>>>>>>
> >>>>>>>>>> Thanks for help!
> >>>>>>>>>> Sergiy.
> >>>>>>>>>>
> >>>>>>>>>>> On Mar 16, 2017, at 6:07 AM, Sergiy Matusevych <sergiy.matusevych@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Taegeon,
> >>>>>>>>>>>>
> >>>>>>>>>>>> What exactly do you mean by #3? We have a HelloREEF example
> >>>>>>>>>>>> running in Unmanaged AM mode (see the HelloREEFYarnUnmanagedAM
> >>>>>>>>>>>> class), and it works fine on YARN 2.7.3. We also have several
> >>>>>>>>>>>> examples and unit tests that check the Unmanaged AM and
> >>>>>>>>>>>> REEF-as-a-library functionality, e.g. HelloREEFEnvironment,
> >>>>>>>>>>>> ReefOnReefDriver, REEFEnvironmentDriverTest, and such. What else
> >>>>>>>>>>>> do you think we should unit test? (I am not saying that our unit
> >>>>>>>>>>>> tests are comprehensive (they are not!), but I would love to know
> >>>>>>>>>>>> what area you think we should focus on for the 0.16 release.)
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> Sergiy.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Mar 15, 2017 at 7:10 AM, Tae-Geon Um <taegeonum@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> It's been about 10 months since we released the latest
> >>>>>>>>>>>>> version (0.15).
> >>>>>>>>>>>>> In order not to delay the release any longer, I want to call
> >>>>>>>>>>>>> a release vote as soon as possible.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Do you think it is OK for me to call a 0.16 release vote
> >>>>>>>>>>>>> next Thursday (23rd)?
> >>>>>>>>>>>>> I know several blocking issues still remain:
> >>>>>>>>>>>>> 1) Java side CI failures
> >>>>>>>>>>>>> 2) .NET side CI failures
> >>>>>>>>>>>>> 3) Unmanaged AM test
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I want to know whether they can be resolved by next Thursday.
> >>>>>>>>>>>>> I'm currently taking a look at 1) (with Sergiy's help), and
> >>>>>>>>>>>>> the due date is OK for me.
> >>>>>>>>>>>>> How about 2) and 3)? As far as I know, 2) is on Julia and
> >>>>>>>>>>>>> Sergiy is working on 3).
> >>>>>>>>>>>>> If the plan does not seem OK, could you please share the ETA
> >>>>>>>>>>>>> for them?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Taegeon
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Byung-Gon Chun
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Byung-Gon Chun
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Byung-Gon Chun
> >>>
> >>>
> >>
> >>
> >> --
> >> Byung-Gon Chun
> >>
>
>
