hbase-dev mailing list archives

From Sean Busbey <bus...@apache.org>
Subject Re: [DISCUSS] we need to take action if we want asf jenkins managed tests after Aug 15 2020.
Date Mon, 27 Jul 2020 16:16:05 GMT
This is an excellent start. Thanks for doing this work, Duo!

On Sun, Jul 26, 2020 at 7:23 PM 张铎(Duo Zhang) <palomino219@gmail.com> wrote:
>
> The pre-commit job has been migrated to ci-hadoop.a.o.
>
> I have disabled the periodic scan for the old job on builds.a.o. We still
> need to view the pre-commit results there, so I have not deleted it for now.
> I will delete it later, maybe after several weeks.
>
> The new job is here
>
> https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/
>
> Thanks.
>
> 张铎(Duo Zhang) <palomino219@gmail.com> wrote on Sat, Jul 25, 2020 at 9:44 PM:
>
> >
> > https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/5/console
> >
> >
> > We successfully finished a nightly build.
> >
> > But it seems the jiraComment did not work. I haven't seen the comment
> > on HBASE-24757...
> >
> > 张铎(Duo Zhang) <palomino219@gmail.com> wrote on Sat, Jul 25, 2020 at 4:51 PM:
> >
> >> After installing two new jenkins plugins, the pre commit job seems fine
> >> now.
> >>
> >> The last failure was caused by a timeout. I assume the problem is that we
> >> do not have enough executors, so all the jobs are executed sequentially.
> >>
> >> Maybe we could move the pre commit job to the new env first? The nightly
> >> job and flaky job require more resources, and we need the output of these
> >> Jenkins jobs (the flaky test list).
> >>
> >> Thanks.
> >>
> >>
> >>
> >> 张铎(Duo Zhang) <palomino219@gmail.com> wrote on Fri, Jul 24, 2020 at 4:36 PM:
> >>
> >>> The problem seems to be caused by this:
> >>>
> >>> https://issues.jenkins-ci.org/browse/JENKINS-48556
> >>>
> >>> I triggered the job again and it passed the timestamps call. I will keep
> >>> an eye on it.
> >>>
> >>> 张铎(Duo Zhang) <palomino219@gmail.com> wrote on Tue, Jul 21, 2020 at 11:18 AM:
> >>>
> >>>> On the sponsors, we could give it a try.
> >>>>
> >>>> The problem here is the process of the donation? IIRC there is a thread
> >>>> on the infra mailing list about how to donate machines to a specific
> >>>> project and the discussion did not go well...
> >>>>
> >>>> Sean Busbey <busbey@apache.org> wrote on Tue, Jul 21, 2020 at 11:13 AM:
> >>>>
> >>>>> We could check with ASF infra for the current state of things wrt GitHub
> >>>>> Actions. I believe there is a queue set up across ASF projects.
> >>>>>
> >>>>> It has the same resource issue Travis had; things are fine until some
> >>>>> critical mass of projects seeking better perf realize some new option is
> >>>>> available and then quickly all available resources are consumed.
> >>>>>
> >>>>> AFAICT the only option that gets us the same or better as the H* nodes
> >>>>> will be finding sponsors and running our own.
> >>>>>
> >>>>> On Mon, Jul 20, 2020, 21:55 张铎(Duo Zhang) <palomino219@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> > I think our nightly, flaky, and pre-commit jobs should be transferred as a
> >>>>> > whole? They depend on each other.
> >>>>> >
> >>>>> > I offer my help on the transition.
> >>>>> >
> >>>>> > And on GitHub CI, does ASF have a special deal with GitHub? If not, I do
> >>>>> > not think the default resources can fit our requirements...
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > Sean Busbey <busbey@apache.org> wrote on Tue, Jul 21, 2020 at 1:49 AM:
> >>>>> >
> >>>>> > > Hi folks!
> >>>>> > >
> >>>>> > > Back in April there was a brief discussion[1] about ASF Infra's
> >>>>> > > notification that builds.a.o is going away and we are currently slated
> >>>>> > > to migrate to a set of CI servers for "Hadoop and related projects".
> >>>>> > > This is the ci farm that will contain the bulk of the H* worker nodes
> >>>>> > > that are donated by Yahoo!, which are the nodes we've been running on
> >>>>> > > for ages[2].
> >>>>> > >
> >>>>> > > Migration discussion still happens on the hadoop-migrations@i.a.o
> >>>>> > > list[3] and recently ASF Infra set a target date of August 15th for
> >>>>> > > turning off the existing builds.a.o server[4].
> >>>>> > >
> >>>>> > > That gives us a little under 4 weeks to have things up and working on
> >>>>> > > the new ci-hadoop.a.o jenkins coordinator[5]. It’s not clear to me
> >>>>> > > that the level of effort we’ll need to spend is worth what we get out
> >>>>> > > of a continuation of the status quo on builds.a.o. I did a quick test
> >>>>> > > by updating the nightly job on ci-hadoop.a.o to run just branch-2,
> >>>>> > > since that has been stable on builds.a.o. It failed with a Jenkins
> >>>>> > > pipeline DSL syntax error[6] so I'm assuming migrating will be a slog.
> >>>>> > >
> >>>>> > > As far as I can see our options are:
> >>>>> > >
> >>>>> > > * Do nothing. Have no testing or automated website publication in mid
> >>>>> > > August.
> >>>>> > > * Transition website publication and nothing else (probably can be
> >>>>> > > done in a day)
> >>>>> > > * Transition just precommit testing for various repos (probably can be
> >>>>> > > done in a few days)
> >>>>> > > * Transition everything (no idea how long it takes due to nightly,
> >>>>> > > flaky stuff, etc)
> >>>>> > >
> >>>>> > > The alternatives if we do not transition any given job to ci-hadoop:
> >>>>> > >
> >>>>> > > * Try to move to GitHub Actions
> >>>>> > > * Try to move to Travis CI
> >>>>> > > * Try to move to Jenkins infra we maintain ourselves (presumably by
> >>>>> > > soliciting project specific donations for worker nodes on cloud
> >>>>> > > vendors)
> >>>>> > >
> >>>>> > > It's important to remember that as a project we have a heavy footprint
> >>>>> > > wherever our nightly tests run. For context, a given branch's nightly
> >>>>> > > can keep 3-4 executors busy for 6+ hours on the current builds.a.o
> >>>>> > > setup. There's been a bunch of great work lately on bringing down what
> >>>>> > > it takes to run the full test suite, but applying that work to nightly
> >>>>> > > is itself a significant undertaking.
> >>>>> > >
> >>>>> > > What are folks thinking? Most importantly who is ready to work towards
> >>>>> > > any given approach?
> >>>>> > >
> >>>>> > > [1] [DISCUSS] Migrating HBase to new CI Master
> >>>>> > > https://s.apache.org/fux1o
> >>>>> > >
> >>>>> > > [2] https://builds.apache.org/view/H-L/view/HBase/
> >>>>> > >
> >>>>> > > [3] https://lists.apache.org/list.html?hadoop-migrations@infra.apache.org
> >>>>> > >
> >>>>> > > [4] [IMPORTANT] - 2 more HADOOP nodes migrated over to ci-hadoop
> >>>>> > > https://s.apache.org/7e1nq
> >>>>> > >
> >>>>> > > [5] https://ci-hadoop.apache.org/job/HBase/
> >>>>> > >
> >>>>> > > [6] https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/2/console
> >>>>> > >
> >>>>> >
> >>>>>
> >>>>



-- 
Sean
