spark-dev mailing list archives

From Michael Allman <mich...@videoamp.com>
Subject Re: Implementation of RNN/LSTM in Spark
Date Tue, 28 Feb 2017 17:11:23 GMT
Hi Yuhao,

BigDL looks very promising and it's a framework we're considering using. It seems the general
approach to high performance DL is via GPUs. Your project mentions performance on a Xeon comparable
to that of a GPU, but where does this claim come from? Can you provide benchmarks?

Thanks,

Michael

> On Feb 27, 2017, at 11:11 PM, Yuhao Yang <hhbyyh@gmail.com> wrote:
> 
> You are welcome to try out and contribute to our BigDL: https://github.com/intel-analytics/BigDL
> 
> It runs natively on Spark and is fast because it leverages Intel MKL.
> 
> 2017-02-23 4:51 GMT-08:00 Joeri Hermans <joeri.raymond.e.hermans@cern.ch>:
> Hi Nikita,
> 
> We are actively working on this: https://github.com/cerndb/dist-keras
> This will allow you to run Keras on Spark (with distributed optimization algorithms) through
> pyspark. I recommend checking the examples: https://github.com/cerndb/dist-keras/tree/master/examples
> However, be aware that distributed optimization is a research topic with several approaches
> and caveats. I wrote a blog post on this if you would like some additional information:
> https://db-blog.web.cern.ch/blog/joeri-hermans/2017-01-distributed-deep-learning-apache-spark-and-keras
> 
> However, if you don't want to use a distributed optimization algorithm, we also support
> a "sequential trainer" which allows you to train a model on Spark DataFrames.
> 
> Kind regards,
> 
> Joeri
> ________________________________________.
> From: Nick Pentreath [nick.pentreath@gmail.com]
> Sent: 23 February 2017 13:39
> To: dev@spark.apache.org
> Subject: Re: Implementation of RNN/LSTM in Spark
> 
> The short answer is that there is none, and there is highly unlikely to be one inside
> Spark MLlib any time in the near future.
> 
> The best bet is to look at other DL libraries that run on Spark - for the JVM there are
> Deeplearning4J and BigDL (there are others, but these seem to be the most comprehensive I
> have come across). There are also various flavours of TensorFlow / Caffe on Spark. And of
> course there are libs such as Torch, Keras, TensorFlow, MXNet, Caffe etc. Some of them have
> Java or Scala APIs and some form of Spark integration out there in the community (in varying
> states of development).
> 
> Integrations with Spark are a bit patchy currently but include the "XOnSpark" flavours
> mentioned above and TensorFrames (again, there may be others).
> 
> On Thu, 23 Feb 2017 at 14:23 n1kt0 <nikita.balyschew@googlemail.com> wrote:
> Hi,
> can anyone tell me what the current status of RNNs in Spark is?
> 
> 
> 
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Implementation-of-RNN-LSTM-in-Spark-tp14866p21060.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
> 

