mxnet-dev mailing list archives

From Pedro Larroy <pedro.larroy.li...@gmail.com>
Subject Re: CI failure due to offline llvm.org
Date Fri, 12 Jan 2018 12:21:03 GMT
I think Chris is right: git clean with the right options, plus proper
initialization of the submodules, should give the same result as
deleting the entire workspace. Right?
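
A minimal sketch of what "the right options" could look like, written in
Python. This is an assumption, not what the Jenkinsfile currently does: the
helper names workspace_reset_commands and reset_workspace are hypothetical,
and the sketch assumes the target commit has already been fetched. The flags
are standard git options: -d covers untracked directories, -x also removes
ignored files, and the doubled -f lets git clean remove untracked nested git
repositories.

```python
import subprocess
from typing import Callable, List


def workspace_reset_commands(commit: str) -> List[List[str]]:
    """Return git commands that bring an existing clone back to a clean
    checkout of `commit` without deleting the .git database."""
    clean = ["git", "clean", "-ffdx"]
    return [
        # Move to the requested commit, discarding local modifications.
        ["git", "checkout", "--force", commit],
        # Remove untracked and ignored files and directories; the doubled
        # -f also removes untracked nested git repositories.
        clean,
        # Register and check out every submodule at its recorded commit.
        ["git", "submodule", "update", "--init", "--recursive", "--force"],
        # Apply the same cleanup inside each submodule.
        ["git", "submodule", "foreach", "--recursive"] + clean,
    ]


def reset_workspace(workdir: str, commit: str,
                    run: Callable[..., object] = subprocess.run) -> None:
    """Run the commands in `workdir`, raising on the first failure."""
    for cmd in workspace_reset_commands(commit):
        run(cmd, cwd=workdir, check=True)
```

If any step raises, the caller can still fall back to deleting the whole
workspace and cloning again, which keeps the 'hard' approach as a last resort.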

On Fri, Jan 12, 2018 at 8:56 AM, kellen sunderland
<kellen.sunderland@gmail.com> wrote:
> Doing a few searches I see that llvm.org <http://apt.llvm.org> doesn't
> appear to be stable enough for CI.  I'm going to write something to
> hopefully make it a little more stable today, while still allowing those at
> home to have easily reproducible build steps through docker.  What I'd
> propose is we cache the 15 or so deb packages that get installed with clang
> in s3 in the CI env.  For home users who can't reach the cached s3 bucket
> we fall back to apt.llvm.org installation.  Does that sound like a
> reasonable plan, Marco?
>
> On Fri, Jan 12, 2018 at 8:21 AM, Marco de Abreu <
> marco.g.abreu@googlemail.com> wrote:
>
>> Aah I understand, you're right, we should revisit our decisions. I'll put
>> it into the backlog so I don't forget it.
>>
>> -Marco
>>
>> On 12.01.2018 at 2:48 AM, "Chris Olivier" <cjolivier01@gmail.com> wrote:
>>
>> Yeah, I'm just saying the whole delete was done as a drastic measure at the
>> time. It may not be necessary to re-pull everything. Instead of deleting
>> everything, you could delete everything *except* the .git dir, and then
>> check out the commit you want and it'll regenerate the sources from the .git
>> database.
>>
>> This, of course, assumes the .git database is never wrong...  If something
>> goes wrong, you can nuke the whole dir.
>>
>>
>> On Thu, Jan 11, 2018 at 5:42 PM, Marco de Abreu <
>> marco.g.abreu@googlemail.com> wrote:
>>
>> > Exactly
>> >
>> > -Marco
>> >
>> > On Fri, Jan 12, 2018 at 2:40 AM, Chris Olivier <cjolivier01@gmail.com>
>> > wrote:
>> >
>> > > Actually, this is the commit related to it.
>> > > https://github.com/cjolivier01/mxnet/commit/573a010879583885a0193e30dc0b8c848d80869b
>> > >
>> > > Before, the workspace directory wasn't being deleted.  Now it is, correct?
>> > > Everything under the top directory, right?
>> > >
>> > > So a git clone re-pulls everything?
>> > >
>> > > On Thu, Jan 11, 2018 at 4:51 PM, Marco de Abreu <
>> > > marco.g.abreu@googlemail.com> wrote:
>> > >
>> > > > deleteDir() deletes the content of the current workspace
>> > > >
>> > > > Okay, I haven't seen any errors related to lua-package not being
>> > > > deleted. Do you have a CI link by any chance?
>> > > >
>> > > > -Marco
>> > > >
>> > > > On Fri, Jan 12, 2018 at 1:49 AM, Chris Olivier <cjolivier01@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > What is the deleteDir() call doing in the Jenkinsfile?
>> > > > > Yes, I mentioned the case where it wasn't getting cleaned.
>> > > > >
>> > > > > On Thu, Jan 11, 2018 at 4:41 PM, Marco de Abreu <
>> > > > > marco.g.abreu@googlemail.com> wrote:
>> > > > >
>> > > > > > During git_init: First we're just using git clean. If checkout
>> > > > > > fails, we're deleting the entire workspace and retrying.
>> > > > > >
>> > > > > > During build: First we're using regular make. If the build fails,
>> > > > > > we're using make clean before executing make again.
>> > > > > >
>> > > > > > During test: No cleanup happening in case of failure.
>> > > > > >
>> > > > > > So far, I haven't noticed any files not being deleted in the
>> > > > > > workspace. Do you know an example?
>> > > > > >
>> > > > > > -Marco
>> > > > > >
>> > > > > > On Fri, Jan 12, 2018 at 1:34 AM, Chris Olivier <cjolivier01@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > What approach is used now?  I see in the Jenkinsfile that
>> > > > > > > deleteDir() is called at the top of init_git() and
>> > > > > > > init_git_win().  That deletes the whole directory, correct?
>> > > > > > >
>> > > > > > > Before, there were problems with 'git clean -d -f' *not*
>> > > > > > > deleting some directories which were tracked on one branch and
>> > > > > > > not on another, which I believe is why deleteDir() was put
>> > > > > > > there. The directory I recall was something like lua-package,
>> > > > > > > which was in someone's private repo or something like that...
>> > > > > > >
>> > > > > > > On Thu, Jan 11, 2018 at 4:02 PM, Marco de Abreu <
>> > > > > > > marco.g.abreu@googlemail.com> wrote:
>> > > > > > >
>> > > > > > > > While it's quite a harsh solution to delete the entire
>> > > > > > > > workspace, I think that it's a good way. Git checkout takes
>> > > > > > > > between 2 and 10 seconds, so I don't think we need to
>> > > > > > > > optimize in that regard.
>> > > > > > > >
>> > > > > > > > git clean is our 'soft' approach to clean up. Deleting the
>> > > > > > > > workspace is the 'hard' approach, so this shouldn't be an
>> > > > > > > > issue.
>> > > > > > > >
>> > > > > > > > But there is one catch: Windows builds are not containerized,
>> > > > > > > > and while we delete the workspace, there could still be a lot
>> > > > > > > > of files which are not being tracked. In future I'd like to
>> > > > > > > > have at least a file-system layer in between our tests and
>> > > > > > > > the host, but we will have to analyze if something like this
>> > > > > > > > exists. At the moment, we even have tests writing to
>> > > > > > > > system32.
>> > > > > > > > -Marco
>> > > > > > > >
>> > > > > > > > On Fri, Jan 12, 2018 at 12:44 AM, Chris Olivier <cjolivier01@gmail.com>
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Ok, but still on that note. I remember that when some
>> > > > > > > > > problems were being fixed in CI (before your time), they
>> > > > > > > > > switched to deleting the entire source directory, ".git"
>> > > > > > > > > subdirectory and all.  At the time, the CI was in such a
>> > > > > > > > > chaotic state that I didn't make an issue of it, but now
>> > > > > > > > > that it has stabilized (for the most part, today's incident
>> > > > > > > > > notwithstanding), I think that we may want to revisit it if
>> > > > > > > > > it is still doing that.  You could, for example, just delete
>> > > > > > > > > everything except the .git directory and then do a
>> > > > > > > > > 'git reset --hard' to get back a baseline instead of having
>> > > > > > > > > to re-download everything every time (which should also
>> > > > > > > > > speed up the builds).
>> > > > > > > > >
>> > > > > > > > > Note that 'git clean' was not working as it doesn't delete
>> > > > > > > > > 'unknown' directories, which was the problem.
>> > > > > > > > >
>> > > > > > > > > WDYT?
>> > > > > > > > >
>> > > > > > > > > On Thu, Jan 11, 2018 at 3:26 PM, Marco de Abreu <
>> > > > > > > > > marco.g.abreu@googlemail.com> wrote:
>> > > > > > > > >
>> > > > > > > > > > This happens because we just merged the clang compilation:
>> > > > > > > > > > https://github.com/apache/incubator-mxnet/commit/2b73aac527a3439ec0dc9b1e76c6df09ea347eb1
>> > > > > > > > > > This means that clang has to get installed on all slaves
>> > > > > > > > > > and, after some time, the docker images will be cached.
>> > > > > > > > > > The problem right now is that their apt server is
>> > > > > > > > > > unavailable, which means the initial installation to
>> > > > > > > > > > create the docker cache doesn't succeed. In future, this
>> > > > > > > > > > will be cached.
>> > > > > > > > > > -Marco
>> > > > > > > > > >
>> > > > > > > > > > On Thu, Jan 11, 2018 at 11:45 PM, Chris Olivier <cjolivier01@gmail.com>
>> > > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Do we download all submodules from scratch every build?
>> > > > > > > > > > > If we do, then we should probably find a way not to. I
>> > > > > > > > > > > suggest just doing git reset or something like that.
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Thu, Jan 11, 2018 at 1:47 PM Marco de Abreu <
>> > > > > > > > > > > marco.g.abreu@googlemail.com> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hello,
>> > > > > > > > > > > >
>> > > > > > > > > > > > we're currently experiencing a CI outage caused by
>> > > > > > > > > > > > http://apt.llvm.org not being reachable.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best regards,
>> > > > > > > > > > > > Marco
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
