mxnet-dev mailing list archives

From Marco de Abreu <marco.g.ab...@googlemail.com>
Subject Re: CI failure due to offline llvm.org
Date Fri, 12 Jan 2018 13:30:28 GMT
Seems right to me, but I will have to investigate. I noted it down.

-Marco

On 12.01.2018 at 1:21 PM, "Pedro Larroy" <pedro.larroy.lists@gmail.com> wrote:

> I think Chris is right, git clean with the right options plus proper
> initialization of the submodules should not make any difference versus
> deleting the entire workspace. Right?
>
> On Fri, Jan 12, 2018 at 8:56 AM, kellen sunderland
> <kellen.sunderland@gmail.com> wrote:
> > Doing a few searches I see that llvm.org <http://apt.llvm.org> doesn't
> > appear to be stable enough for CI.  I'm going to write something to
> > hopefully make it a little more stable today, while still allowing those
> > at home to have easily reproducible build steps through docker.  What I'd
> > propose is we cache the 15 or so deb packages that get installed with
> > clang in s3 in the CI env.  For home users who can't reach the cached s3
> > bucket we fall back to apt.llvm.org installation.  Sound like a reasonable
> > plan, Marco?
> >
> > On Fri, Jan 12, 2018 at 8:21 AM, Marco de Abreu <
> > marco.g.abreu@googlemail.com> wrote:
> >
> >> Aah I understand, you're right, we should revisit our decisions. I'll put
> >> it into the backlog so I don't forget it.
> >>
> >> -Marco
> >>
> >> On 12.01.2018 at 2:48 AM, "Chris Olivier" <cjolivier01@gmail.com> wrote:
> >>
> >> Yeah, I'm just saying the whole delete was done as a drastic measure at
> >> the time. It may not be necessary to re-pull everything. Instead of
> >> deleting everything, you could delete everything *except* the .git dir
> >> and then check out the commit you want, and it'll regenerate the sources
> >> from the .git database.
> >>
> >> This, of course, assumes the .git database is never wrong...  If
> >> something goes wrong, you can nuke the whole dir.
> >>
> >>
> >> On Thu, Jan 11, 2018 at 5:42 PM, Marco de Abreu <
> >> marco.g.abreu@googlemail.com> wrote:
> >>
> >> > Exactly
> >> >
> >> > -Marco
> >> >
> >> > On Fri, Jan 12, 2018 at 2:40 AM, Chris Olivier <cjolivier01@gmail.com>
> >> > wrote:
> >> >
> >> > > Actually, this is the commit related to it:
> >> > > https://github.com/cjolivier01/mxnet/commit/573a010879583885a0193e30dc0b8c848d80869b
> >> > >
> >> > > Before, the workspace directory wasn't being deleted.  Now it is,
> >> > > correct? Everything under the top directory, right?
> >> > >
> >> > > So a git clone re-pulls everything?
> >> > >
> >> > > On Thu, Jan 11, 2018 at 4:51 PM, Marco de Abreu <
> >> > > marco.g.abreu@googlemail.com> wrote:
> >> > >
> >> > > > deleteDir() deletes the content of the current workspace
> >> > > >
> >> > > > Okay, I haven't seen any errors related to lua-package not being
> >> > > > deleted. Do you have a CI link by any chance?
> >> > > >
> >> > > > -Marco
> >> > > >
> >> > > > On Fri, Jan 12, 2018 at 1:49 AM, Chris Olivier <cjolivier01@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > What is the deleteDir() call doing in the Jenkinsfile?
> >> > > > > Yes, I mentioned the case where it wasn't getting cleaned.
> >> > > > >
> >> > > > > On Thu, Jan 11, 2018 at 4:41 PM, Marco de Abreu <
> >> > > > > marco.g.abreu@googlemail.com> wrote:
> >> > > > >
> >> > > > > > During git_init: First we're just using git clean; if checkout
> >> > > > > > fails, we're deleting the entire workspace and retrying.
> >> > > > > >
> >> > > > > > During build: First we're using regular make. If the build
> >> > > > > > fails, we're using make clean before executing make again.
> >> > > > > >
> >> > > > > > During test: No cleanup happening in case of failure.
> >> > > > > >
> >> > > > > > So far, I haven't noticed any files not being deleted in the
> >> > > > > > workspace. Do you know an example?
> >> > > > > >
> >> > > > > > -Marco
> >> > > > > >
> >> > > > > > On Fri, Jan 12, 2018 at 1:34 AM, Chris Olivier <cjolivier01@gmail.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > What approach is used now?  I see in the Jenkinsfile that
> >> > > > > > > deleteDir() is called at the top of init_git() and
> >> > > > > > > init_git_win().  That deletes the whole directory, correct?
> >> > > > > > >
> >> > > > > > > Before, there were problems with 'git clean -d -f' *not*
> >> > > > > > > deleting some directories which were tracked on one branch and
> >> > > > > > > not on another, which I believe is why deleteDir() was put
> >> > > > > > > there. The directory I recall was something like lua-package
> >> > > > > > > or something that was in someone's private repo or something
> >> > > > > > > like that...
> >> > > > > > >
> >> > > > > > > On Thu, Jan 11, 2018 at 4:02 PM, Marco de Abreu <
> >> > > > > > > marco.g.abreu@googlemail.com> wrote:
> >> > > > > > >
> >> > > > > > > > While it's quite a harsh solution to delete the entire
> >> > > > > > > > workspace, I think that it's a good way. Git checkout takes
> >> > > > > > > > between 2 and 10 seconds, so I don't think we need to
> >> > > > > > > > optimize in that regard.
> >> > > > > > > >
> >> > > > > > > > git clean is our 'soft' approach to clean up. Deleting the
> >> > > > > > > > workspace is the 'hard' approach, so this shouldn't be an
> >> > > > > > > > issue.
> >> > > > > > > >
> >> > > > > > > > But there is one catch: Windows builds are not containerized,
> >> > > > > > > > and while we delete the workspace, there could still be a lot
> >> > > > > > > > of files which are not being tracked. In the future I'd like
> >> > > > > > > > to have at least a file-system layer in between our tests and
> >> > > > > > > > the host, but we will have to analyze if something like this
> >> > > > > > > > exists. At the moment, we even have tests writing to system32.
> >> > > > > > > >
> >> > > > > > > > -Marco
> >> > > > > > > >
> >> > > > > > > > On Fri, Jan 12, 2018 at 12:44 AM, Chris Olivier <
> >> > > > > > > > cjolivier01@gmail.com> wrote:
> >> > > > > > > >
> >> > > > > > > > > Ok, but still on that note. I remember that before, when
> >> > > > > > > > > some problems were being fixed in CI (before your time),
> >> > > > > > > > > they switched to deleting the entire source directory,
> >> > > > > > > > > ".git" subdirectory and all.  At the time, the CI was in
> >> > > > > > > > > such a chaotic state that I didn't make an issue of it,
> >> > > > > > > > > but now that it has stabilized (for the most part, today's
> >> > > > > > > > > incident notwithstanding), I think that we may want to
> >> > > > > > > > > revisit it if it is still doing that.  You could, for
> >> > > > > > > > > example, just delete everything except the .git directory
> >> > > > > > > > > and then do a 'git reset --hard' to get back a baseline
> >> > > > > > > > > before having to re-download everything every time (also
> >> > > > > > > > > should speed up the builds).
> >> > > > > > > > >
> >> > > > > > > > > Note that 'git clean' was not working as it doesn't delete
> >> > > > > > > > > 'unknown' directories, which was the problem.
> >> > > > > > > > >
> >> > > > > > > > > WDYT?
> >> > > > > > > > >
> >> > > > > > > > > On Thu, Jan 11, 2018 at 3:26 PM, Marco de Abreu <
> >> > > > > > > > > marco.g.abreu@googlemail.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > > This happens because we just merged the clang compilation:
> >> > > > > > > > > > https://github.com/apache/incubator-mxnet/commit/2b73aac527a3439ec0dc9b1e76c6df09ea347eb1
> >> > > > > > > > > > This means that clang has to get installed on all slaves,
> >> > > > > > > > > > and after some time, the docker images will be cached. The
> >> > > > > > > > > > problem right now is that their apt server is unavailable,
> >> > > > > > > > > > which means the initial installation to create the docker
> >> > > > > > > > > > cache doesn't succeed. In the future, this will be cached.
> >> > > > > > > > > >
> >> > > > > > > > > > -Marco
> >> > > > > > > > > >
> >> > > > > > > > > > On Thu, Jan 11, 2018 at 11:45 PM, Chris Olivier <
> >> > > > > > > > > > cjolivier01@gmail.com> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Do we download all submodules from scratch every build?
> >> > > > > > > > > > > If we do, then we should probably find a way not to.
> >> > > > > > > > > > > I'd suggest just doing git reset or something like that.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Thu, Jan 11, 2018 at 1:47 PM Marco de Abreu <
> >> > > > > > > > > > > marco.g.abreu@googlemail.com> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Hello,
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > we're currently experiencing a CI outage caused by
> >> > > > > > > > > > > > http://apt.llvm.org not being reachable.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Best regards,
> >> > > > > > > > > > > > Marco
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
>
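
The lighter-weight cleanup discussed in this thread (remove everything except the .git directory, then restore the working tree and submodules from the local repository) could look roughly like the Python sketch below. The function names wipe_except_git, restore_commands and reset_workspace are hypothetical and are not taken from the MXNet Jenkinsfile. The git commands themselves are standard, but whether this sequence is sufficient for the CI workspace is an assumption that would need checking, as noted in the thread.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def wipe_except_git(workspace: Path) -> List[str]:
    """Delete every top-level entry in the workspace except .git.

    Returns the names that were removed, so a CI log can show what was cleaned.
    """
    removed = []
    for entry in sorted(workspace.iterdir()):
        if entry.name == ".git":
            continue  # keep the local object database
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return removed


def restore_commands(commit: str) -> List[List[str]]:
    """Git commands that rebuild the working tree from the local .git data."""
    return [
        # Re-create tracked files for the requested commit.
        ["git", "checkout", "--force", commit],
        # Re-populate submodule working trees.
        ["git", "submodule", "update", "--init", "--recursive"],
    ]


def reset_workspace(workspace: Path, commit: str) -> None:
    """Soft reset: wipe the working tree, then restore it without re-cloning."""
    wipe_except_git(workspace)
    for cmd in restore_commands(commit):
        # check=True raises CalledProcessError, so a caller can fall back
        # to deleting the whole workspace (the 'hard' approach).
        subprocess.run(cmd, cwd=workspace, check=True)
```

A caller would wrap reset_workspace in a try/except and only delete the whole directory, .git included, if the restore fails, which mirrors the soft-then-hard ordering described above.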

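
The caching proposal in this thread (try a CI-side cache of the clang deb packages first, and fall back to apt.llvm.org when the cache cannot be reached) amounts to an ordered fallback over download sources. The following is a minimal Python sketch of that idea only. The function fetch_with_fallback and the injected fetch callable are hypothetical, no real bucket name or package URL from the project is assumed, and the actual installation in CI is done through docker and apt rather than Python.

```python
from typing import Callable, Iterable, Tuple


def fetch_with_fallback(
    sources: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Try each source in order and return (source, payload) for the first success.

    `fetch` is any callable that downloads one URL and raises OSError on
    network failure (urllib.error.URLError is a subclass of OSError).
    """
    errors = []
    for source in sources:
        try:
            return source, fetch(source)
        except OSError as exc:
            # Remember why this source failed and move on to the next one.
            errors.append(f"{source}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

In the scenario from the thread, the cache location would be listed first and the upstream apt.llvm.org location second, so that home users without access to the cache still end up with the same packages.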