spark-issues mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] [Assigned] (SPARK-21305) The BKM (best known methods) of using native BLAS to improvement ML/MLLIB performance
Date Wed, 12 Jul 2017 10:11:00 GMT


Sean Owen reassigned SPARK-21305:

             Assignee: Peng Meng
                Flags:   (was: Important)
    Affects Version/s:     (was: 2.3.0)
             Priority: Minor  (was: Critical)
          Component/s:     (was: MLlib)
           Issue Type: Improvement  (was: Umbrella)

> The BKM (best known methods) of using native BLAS to improvement ML/MLLIB performance
> -------------------------------------------------------------------------------------
>                 Key: SPARK-21305
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, ML
>    Affects Versions: 2.2.0
>            Reporter: Peng Meng
>            Assignee: Peng Meng
>            Priority: Minor
>             Fix For: 2.3.0
>   Original Estimate: 504h
>  Remaining Estimate: 504h
> Many ML/MLlib algorithms use native BLAS (such as Intel MKL, ATLAS, or OpenBLAS) to
> improve performance.
> How native BLAS is used matters a great deal for performance; quite often, native BLAS
> even makes performance worse.
> For example, before Spark 2.2 the ALS recommendForAll method used BLAS gemm for matrix
> multiplication.
> If you test only the matrix multiplication itself, native BLAS gemm (Intel MKL or
> OpenBLAS) is about 10x faster than the netlib-java F2J BLAS gemm. But if you test the
> end-to-end performance of the Spark job, F2J is much faster than native BLAS, which is
> an interesting result.
> I spent a lot of time on this problem and found that we should not use a multi-threaded
> native BLAS (such as OpenBLAS or Intel MKL) without any configuration. By default, these
> libraries enable multi-threading, which conflicts with the threads the Spark executor
> already runs. You can still use a multi-threaded native BLAS, but it is better to
> disable its multi-threading first.
> I think we should first add some notes about this to docs/.
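
One possible way to apply the advice in the description is sketched below. It is not taken from the issue itself. It assumes the native library reads the usual thread-count environment variables (OPENBLAS_NUM_THREADS for OpenBLAS, MKL_NUM_THREADS for Intel MKL) and passes them to executors through Spark's spark.executorEnv.<NAME> settings. The helper names single_thread_blas_conf and to_submit_args are hypothetical and exist only for illustration.

```python
# Sketch: build Spark settings that pin a native BLAS to one thread.
# Assumptions: OpenBLAS honours OPENBLAS_NUM_THREADS and Intel MKL honours
# MKL_NUM_THREADS; spark.executorEnv.<NAME> forwards an environment variable
# to the executor processes. The helper names below are hypothetical.

BLAS_THREAD_VARS = ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def single_thread_blas_conf(threads=1):
    """Return spark.executorEnv.* entries that limit native BLAS threads."""
    return {f"spark.executorEnv.{name}": str(threads) for name in BLAS_THREAD_VARS}


def to_submit_args(conf):
    """Render the settings as a list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments so they can be pasted into a spark-submit command.
    print(" ".join(to_submit_args(single_thread_blas_conf())))
```

The same key/value pairs could instead be set on a SparkConf before the session is created. Code that calls BLAS on the driver would need the variables set in the driver's own environment as well.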
