mxnet-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] KellenSunderland opened a new issue #8741: Inference with openblas and Ubuntu 14.04 hangs on C5 instances.
Date Thu, 01 Jan 1970 00:00:00 GMT
KellenSunderland opened a new issue #8741: Inference with openblas and Ubuntu 14.04 hangs on C5 instances.
URL: https://github.com/apache/incubator-mxnet/issues/8741
 
 
   ## Description
   The default install of MXNet currently hangs when running most types of inference on
   C5 instances with a particular setup.  Any setup that uses the openblas library packaged
   with, for example, Ubuntu 14.04 has this issue.  It may affect all hardware with the
   Skylake architecture, but it fails deterministically on C5s.
   
   ## Environment info (Required)
   Ubuntu 14.04
   
   ## Steps to Reproduce:
   Launch a C5 instance running Ubuntu 14.04 (running in a 14.04 Docker container should also work).
   Build the following Dockerfile:
   ```Dockerfile
   # -*- mode: dockerfile -*-
   FROM ubuntu:14.04
   RUN apt-get update && apt-get install -y build-essential git libopenblas-dev liblapack-dev \
       libopencv-dev libcurl4-openssl-dev libgtest-dev cmake wget unzip
   RUN cd /usr/src/gtest && cmake CMakeLists.txt && make && cp *.a /usr/lib
   RUN git clone --recursive https://github.com/dmlc/mxnet
   RUN ln -s /usr/lib/libopenblas.so /usr/lib/libcblas.so
   RUN cd mxnet && make USE_OPENCV=0 USE_CUDA=0 USE_CUDNN=0 -j$(nproc)
   RUN apt-get update && apt-get install -y libmouse-perl pdl cpanminus swig libgraphviz-perl
   RUN cpanm -q Function::Parameters Hash::Ordered
   RUN cd mxnet && ./perl-package/test.sh
   ```
   
   Package used (Python/R/Scala/Julia):
   Perl for the test.  Also verified it fails with the same stack in Python.
   
   MXNet commit hash:
   (Paste the output of `git rev-parse HEAD` here.)
   
   ## Error Message:
   Only one frame with symbols on the stack:
   ```
   #0  0x00002b260530cc93 in sgemm_kernel_PRESCOTT () from /usr/lib/libopenblas.so.0
   ```
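
When comparing stacks collected from several hosts, a frame like the one above can be split into its parts mechanically. The following is a minimal sketch, assuming gdb's usual `#N  ADDR in FUNC () from LIB` layout for frames without source information; the `parse_frame` helper and its regular expression are illustrative and not part of MXNet or gdb.

```python
import re

# Matches a gdb backtrace frame without source info, for example:
#   #0  0x00002b260530cc93 in sgemm_kernel_PRESCOTT () from /usr/lib/libopenblas.so.0
# The pattern is an assumption derived from that single frame.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+(?P<addr>0x[0-9a-fA-F]+)\s+in\s+(?P<func>\S+)"
    r"\s+\(.*?\)\s+from\s+(?P<lib>\S+)"
)


def parse_frame(line):
    """Return a dict with index, addr, func and lib, or None if the line does not match."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    frame = match.groupdict()
    frame["index"] = int(frame["index"])
    return frame


if __name__ == "__main__":
    frame = parse_frame(
        "#0  0x00002b260530cc93 in sgemm_kernel_PRESCOTT () from /usr/lib/libopenblas.so.0"
    )
    print(frame["func"], frame["lib"])
```

Grouping hung processes by `func` and `lib` would show whether every affected host stops in the same openblas kernel.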
   
   ## What have you tried to solve it?
   
   1.  New builds of openblas fix the issue.
   2.  Copying the openblas build that ships with Ubuntu 16.04 fixes the issue.
   
   ## Follow ups:
   *  Test to see if this affects all Skylake and newer CPUs.
   *  Contact the openblas maintainers and attempt to get a fix.
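
Because the failure is a hang rather than a crash, checking other CPU types needs a time limit around the inference run. Below is a minimal sketch of such a watchdog using only the Python standard library. The `classify_run` name, the 600-second limit, and the idea of passing the inference command on the command line are assumptions for illustration, not part of the MXNet test suite.

```python
import subprocess
import sys


def classify_run(cmd, timeout_s):
    """Run cmd and report 'ok', 'failed', or 'hung' (no exit within timeout_s seconds)."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Hypothetical usage: pass the inference command to probe after the script name.
    # With no arguments, a trivial Python command is run instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    # The 600-second limit is an arbitrary assumption; pick one well above a normal run.
    print(classify_run(cmd, timeout_s=600))
```

Running the same command under this wrapper on each instance type gives a simple table of `ok`, `failed`, and `hung` results to attach to the issue.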

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
