spark-dev mailing list archives

From "Driesprong, Fokko" <fo...@driesprong.frl>
Subject Re: How Spark utilize low-level architecture features?
Date Thu, 21 Jan 2016 12:07:21 GMT
Hi Boric,

The Spark MLlib package is built on top of Breeze
<https://github.com/scalanlp/breeze>, which in turn uses netlib-java
<https://github.com/fommil/netlib-java>. The netlib-java library can be
optimized for each system by compiling it for the specific architecture:

*To get optimal performance for a specific machine, it is best to compile
locally by grabbing the latest ATLAS or the latest OpenBLAS and following
the compilation instructions.*

Beyond that, Spark focuses on scaling out by adding more machines rather
than applying machine-specific optimizations. Optimizing your jobs (e.g.
reducing communication between workers) might also do the trick.
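As a quick sanity check of the netlib-java setup described above, you can
ask netlib-java which BLAS implementation it actually loaded at runtime.
This is a minimal sketch, assuming netlib-java is on the classpath (it is
pulled in transitively by spark-mllib); the printed class name is only
illustrative:

```scala
import com.github.fommil.netlib.BLAS

object BlasCheck {
  def main(args: Array[String]): Unit = {
    // If a native library (ATLAS, OpenBLAS, ...) was found, this prints a
    // Native*BLAS implementation class; otherwise it falls back to the
    // pure-Java F2J implementation.
    println(BLAS.getInstance().getClass.getName)
  }
}
```

If you see the F2J fallback rather than a native implementation, the
locally compiled ATLAS/OpenBLAS was not picked up and the compilation
instructions linked above are worth revisiting.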

Cheers, Fokko.

2016-01-21 6:55 GMT+01:00 Boric Tan <it.news.trends@gmail.com>:

> Could anyone shed some light on this?
>
> Thanks,
> Boric
>
> On Tue, Jan 19, 2016 at 4:12 PM, Boric Tan <it.news.trends@gmail.com>
> wrote:
>
>> Hi there,
>>
>> I am new to Spark, and would like to get some help to understand if Spark
>> can utilize the underlying architectures for better performance. If so, how
>> does it do it?
>>
>> For example, assume there is a cluster built with machines of different
>> CPUs. Will Spark check the individual CPU information and use some
>> machine-specific setting for the tasks assigned to that machine? Or is it
>> totally dependent on the underlying JVM implementation to run the JAR file,
>> and therefore the JVM is the place to check whether certain CPU features
>> can be used?
>>
>> Thanks,
>> Boric
>>
>
>
