mxnet-dev mailing list archives

From Da Zheng <zhengda1...@gmail.com>
Subject Re: segmentation fault in master using mkdlnn
Date Thu, 03 May 2018 03:55:05 GMT
It might also be possible that this isn't an MKLDNN bug.
I just saw a similar memory error in a build without MKLDNN:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-10783/1/pipeline

Best,
Da

On Wed, May 2, 2018 at 2:14 PM, Zheng, Da <dzzhen@amazon.com> wrote:
> There might be a race condition that causes the memory error.
> It might be caused by this PR:
> https://github.com/apache/incubator-mxnet/pull/10706/files
> This PR removes MKLDNN memory from NDArray.
> However, I don't know why this causes a memory error. If someone is using the memory, it
> should still hold the memory with a shared pointer.
> But I do see the memory error increase after this PR is merged.
>
> Best,
> Da
>
> On 5/2/18, 12:26 PM, "Pedro Larroy" <pedro.larroy.lists@gmail.com> wrote:
>
>     I couldn't reproduce locally with:
>
>     ci/build.py -p ubuntu_cpu /work/runtime_functions.sh build_ubuntu_cpu_mkldnn &&
>     ci/build.py --platform ubuntu_cpu /work/runtime_functions.sh unittest_ubuntu_python2_cpu
>
>
>     On Wed, May 2, 2018 at 8:50 PM, Pedro Larroy <pedro.larroy.lists@gmail.com>
>     wrote:
>
>     > Hi
>     >
>     > Seems master is not running anymore; there's a segmentation fault using
>     > MKLDNN-CPU.
>     >
>     > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/801/pipeline/662
>     >
>     >
>     > I see my PRs failing with a similar error.
>     >
>     > Pedro
>     >
>
>
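
The shared-pointer reasoning in the thread (a user of the memory should keep it alive by holding a shared pointer) can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `Array`, `Consumer`, and both helper functions are hypothetical names, not MXNet's NDArray or MKLDNN memory types, and the sketch does not reproduce the reported segmentation fault. It shows only that a buffer stays valid while some co-owner holds a `std::shared_ptr` to it, and is freed once the last owner lets go.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an array type that owns its buffer through a
// shared_ptr. This is NOT MXNet's NDArray; the names are illustrative only.
struct Array {
  std::shared_ptr<std::vector<float>> buf;

  explicit Array(std::size_t n)
      : buf(std::make_shared<std::vector<float>>(n, 1.0f)) {}

  // Drop this array's reference to the buffer.
  void Release() { buf.reset(); }
};

// Hypothetical consumer that copies the shared_ptr and so co-owns the buffer.
struct Consumer {
  std::shared_ptr<std::vector<float>> held;

  explicit Consumer(const Array& a) : held(a.buf) {}

  float Sum() const {
    float s = 0.0f;
    for (float v : *held) s += v;
    return s;
  }
};

// True if the buffer is still alive and readable after the array released it,
// because the consumer still holds a reference.
bool BufferSurvivesRelease() {
  Array a(4);
  Consumer c(a);  // two owners now
  std::weak_ptr<std::vector<float>> watch = a.buf;
  a.Release();    // one owner left: the consumer
  return !watch.expired() && c.Sum() == 4.0f;
}

// True if the buffer is freed on release when nobody else co-owns it.
bool BufferFreedWithoutOwner() {
  Array a(4);
  std::weak_ptr<std::vector<float>> watch = a.buf;
  a.Release();    // last owner gone, buffer destroyed
  return watch.expired();
}
```

If a consumer kept only a raw pointer taken from `buf.get()` instead of a `shared_ptr` copy, the second case would leave it dangling, which is one way a memory error of this general kind can arise; whether that applies to the PR discussed above is not established in this thread.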
