mxnet-dev mailing list archives

From Marco de Abreu <marco.g.ab...@googlemail.com>
Subject Re: CI failure due to offline llvm.org
Date Fri, 12 Jan 2018 00:41:22 GMT
During git_init: First we just run git clean. If the checkout fails, we
delete the entire workspace and retry.
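
A rough sketch of that soft-then-hard checkout logic, written in Python purely for illustration (the real logic lives in the Jenkinsfile). The function names and the caller-supplied `checkout` hook are hypothetical; the `git clean -d -f` flags are the ones quoted further down in this thread.

```python
import shutil
import subprocess
from pathlib import Path


def git_soft_clean(workspace):
    """'Soft' cleanup: remove untracked files and directories."""
    subprocess.run(["git", "clean", "-d", "-f"], cwd=workspace, check=True)


def checkout_with_fallback(workspace, soft_clean, checkout):
    """Try the soft cleanup first; wipe the workspace and retry on failure.

    soft_clean and checkout are callables that take the workspace path.
    Returns "soft" or "hard" depending on which path succeeded.
    """
    workspace = Path(workspace)
    try:
        soft_clean(workspace)
        checkout(workspace)
        return "soft"
    except (subprocess.CalledProcessError, OSError):
        # 'Hard' cleanup: delete everything, including .git, and start over.
        shutil.rmtree(workspace, ignore_errors=True)
        workspace.mkdir(parents=True, exist_ok=True)
        checkout(workspace)  # a second failure propagates to the caller
        return "hard"
```

Passing `git_soft_clean` as `soft_clean` gives the behaviour described above; the retry happens exactly once, so a persistent failure still fails the job.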

During build: First we run a regular make. If the build fails, we run
make clean before executing make again.
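
The build step can be sketched the same way. This is only an illustration in Python, not the actual CI code; `build_with_clean_retry` and its `run` parameter are made-up names, with `run` injectable so the flow can be exercised without a real Makefile.

```python
import subprocess


def build_with_clean_retry(run=None, cwd="."):
    """Run make; if it fails, run make clean and then make once more.

    run is a callable taking a command list and returning its exit code.
    Returns "incremental" or "clean-rebuild"; raises if both builds fail.
    """
    if run is None:
        def run(cmd):
            return subprocess.run(cmd, cwd=cwd).returncode

    if run(["make"]) == 0:
        return "incremental"

    # The incremental build failed, so drop stale artifacts and rebuild.
    run(["make", "clean"])
    if run(["make"]) == 0:
        return "clean-rebuild"
    raise RuntimeError("build failed even after 'make clean'")
```

The point of the two-stage approach is that the common case stays fast, while a failure caused by stale build artifacts gets one chance to recover.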

During test: No cleanup happens in case of failure.

So far, I haven't noticed any files not being deleted in the workspace. Do
you know an example?

-Marco

On Fri, Jan 12, 2018 at 1:34 AM, Chris Olivier <cjolivier01@gmail.com>
wrote:

> What approach is used now?  I see in the Jenkinsfile that deleteDir() is
> called at the top of init_git() and init_git_win().  That deletes the
> whole directory, correct?
>
> Before, there were problems with 'git clean -d -f' *not* deleting some
> directories which were tracked on one branch and not on another, which I
> believe is why deleteDir() was put there. The directory I recall was
> something like lua-package, which was in someone's private repo or
> something like that...
>
> On Thu, Jan 11, 2018 at 4:02 PM, Marco de Abreu <
> marco.g.abreu@googlemail.com> wrote:
>
> > While it's a quite harsh solution to delete the entire workspace, I think
> > that it's a good way. Git checkout takes between 2 and 10 seconds, so I
> > don't think we need to optimize in that regard.
> >
> > git clean is our 'soft' approach to clean up. Deleting the workspace is the
> > 'hard' approach, so this shouldn't be an issue.
> >
> > But there is one catch: Windows builds are not containerized, and while we
> > delete the workspace, there could still be a lot of files which are not
> > being tracked. In future I'd like to have at least a file-system layer in
> > between our tests and the host, but we will have to analyze whether
> > something like this exists. At the moment, we even have tests writing to
> > system32.
> >
> > -Marco
> >
> > On Fri, Jan 12, 2018 at 12:44 AM, Chris Olivier <cjolivier01@gmail.com>
> > wrote:
> >
> > > Ok, but still on that note. I remember that when some problems were
> > > being fixed in CI (before your time), they switched to deleting the entire
> > > source directory, ".git" subdirectory and all.  At the time, the CI was in
> > > such a chaotic state that I didn't make an issue of it, but now that it
> > > has stabilized (for the most part, today's incident notwithstanding), I
> > > think that we may want to revisit it if it is still doing that.  You could,
> > > for example, just delete everything except the .git directory and then do a
> > > 'git reset --hard' to get back to a baseline instead of having to re-download
> > > everything every time (this should also speed up the builds).
> > >
> > > Note that 'git clean' was not working as it doesn't delete 'unknown'
> > > directories, which was the problem.
> > >
> > > WDYT?
> > >
> > > On Thu, Jan 11, 2018 at 3:26 PM, Marco de Abreu <
> > > marco.g.abreu@googlemail.com> wrote:
> > >
> > > > This happens because we just merged the clang compilation
> > > > https://github.com/apache/incubator-mxnet/commit/2b73aac527a3439ec0dc9b1e76c6df09ea347eb1.
> > > > This means that clang has to get installed on all slaves and, after some
> > > > time, the docker images will be cached. The problem right now is that their
> > > > apt server is unavailable, which means the initial installation to create
> > > > the docker cache doesn't succeed. In future, this will be cached.
> > > >
> > > > -Marco
> > > >
> > > > On Thu, Jan 11, 2018 at 11:45 PM, Chris Olivier <cjolivier01@gmail.com>
> > > > wrote:
> > > >
> > > > > Do we download all submodules from scratch every build?  If we do, then
> > > > > we should probably find a way not to. I suggest just doing a git reset
> > > > > or something like that.
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jan 11, 2018 at 1:47 PM Marco de Abreu <
> > > > > marco.g.abreu@googlemail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > we're currently experiencing a CI outage caused by http://apt.llvm.org
> > > > > > not being reachable.
> > > > > >
> > > > > > Best regards,
> > > > > > Marco
> > > > > >
> > > > >
> > > >
> > >
> >
>
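
For reference, the alternative suggested in the quoted messages (delete everything except the .git directory, then run 'git reset --hard') could look roughly like the sketch below. It is an illustration only, not something taken from the Jenkinsfile; the function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def wipe_except_git(workspace):
    """Delete every top-level entry in workspace except '.git'.

    Returns the sorted names of the entries that were removed.
    """
    removed = []
    for entry in Path(workspace).iterdir():
        if entry.name == ".git":
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)


def reset_to_baseline(workspace):
    """Wipe the working tree, then restore tracked files from the local .git."""
    wipe_except_git(workspace)
    subprocess.run(["git", "reset", "--hard"], cwd=workspace, check=True)
```

Because the object store in .git survives, nothing has to be fetched again for the main repository. Submodule working trees would presumably still need a `git submodule update --init --recursive` afterwards, which is not shown here.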
