predictionio-user mailing list archives

From Bruno LEBON <b.le...@redfakir.fr>
Subject Re: Can I train and deploy on different machine
Date Thu, 30 Mar 2017 07:47:40 GMT
"Spark local setup is done in the Spark conf, it has nothing to do with PIO
setup.  "

Hi Pat,

So when you say the above, which files are you referring to? The "masters"
and "slaves" files? So I should put localhost in those files instead of the
DNS names I configured in /etc/hosts?
Once this is done, I'll be able to launch
"nohup pio deploy --ip 0.0.0.0 --port 8001 --event-server-port 7070
--feedback --accesskey
4o4Te0AzGMYsc1m0nCgaGckl0vLHfQfYIALPleFKDXoQxKpUji2RF3LlpDc7rsVd --
--driver-memory 1G > /dev/null 2>&1 &"
with my Spark cluster off?
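
For what it's worth, here is my guess at the local-only Spark setting on the
PredictionServer (the file path and value below are my assumption, not
something you confirmed):

    # $SPARK_HOME/conf/spark-defaults.conf on the PredictionServer
    # run Spark in local mode so `pio deploy` launches in-process
    # instead of contacting the cluster master
    spark.master    local[*]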

Also, I have the feeling that once training is done, the new model is
automatically deployed, is that so? In the E-Commerce Recommendation
template, the log explicitly said that the model was being deployed, whereas
in the Universal Recommender the log doesn't mention any automatic deploy
right after the train is done.

2017-03-29 21:25 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:

> The machine running the PredictionServer should not be configured to
> connect to the Spark cluster.
>
> This is why I explained that we use a machine for training that is a Spark
> cluster “driver” machine. The driver machine connects to the Spark cluster
> but the PredictionServer should not.
>
> The PredictionServer should have default config that does not know how to
> connect to the Spark cluster. In this case it will default to running
> spark-submit to launch with MASTER=local, which puts Spark in the same
> process as the PredictionServer, and you will not get the cluster error.
> Note that the PredictionServer should be configured to know how to connect
> to Elasticsearch and HBase, and optionally HDFS; only Spark needs to be
> local. Note also that no config in pio-env.sh needs to change; Spark local
> setup is done in the Spark conf and has nothing to do with PIO setup.
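>
> For illustration only (the hostnames and paths below are assumptions, not
> from your setup), the pio-env.sh on the PredictionServer simply keeps
> pointing at the shared services:
>
>     SPARK_HOME=/usr/local/spark       # local Spark, used only for spark-submit
>     PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
>     PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es-host     # shared Elasticsearch
>     PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
>     PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
>     PIO_STORAGE_SOURCES_HBASE_HOME=/usr/local/hbase     # conf points at shared HBase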
>
> After running `pio build` and `pio train`, copy the UR directory to *the
> same location* on the PredictionServer. Then, with Spark set up to be local,
> on the PredictionServer machine run `pio deploy`. From then on, if you do not
> change `engine.json`, you will have newly trained models hot-swapped into
> all PredictionServers running the UR.
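>
> A minimal sketch of that sequence (the hostname "pserver" and the directory
> path are made up for the example):
>
>     # on the Spark driver machine, in the UR directory
>     pio build
>     pio train     # runs the training job on the Spark cluster
>
>     # copy the engine directory to the same path on the PredictionServer
>     rsync -a ~/universal-recommender/ pserver:~/universal-recommender/
>
>     # on the PredictionServer, where the Spark conf is left local
>     cd ~/universal-recommender && pio deploy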
>
>
> On Mar 29, 2017, at 11:57 AM, Marius Rabenarivo <
> mariusrabenarivo@gmail.com> wrote:
>
> Let me be more explicit.
>
> What I want is to avoid using the host where the PredictionServer will run
> as a slave in the Spark cluster.
>
> When I do this, I get the "Initial job has not accepted any resources" error
> message.
>
> 2017-03-29 22:18 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
>
>> yes
>>
>> My answer below was needlessly verbose.
>>
>>
>> On Mar 28, 2017, at 8:41 AM, Marius Rabenarivo <
>> mariusrabenarivo@gmail.com> wrote:
>>
>> But I want to run the driver outside the server where I'll run the
>> PredictionServer, since Spark will only be used for launching there.
>>
>> Is it possible to run the driver outside the host where I'll deploy the
>> engine? I mean, for deploying.
>>
>> I'm reading the Spark documentation right now to get insight into how I
>> can do it, but I want to know if someone has tried something similar.
>>
>> 2017-03-28 19:34 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
>>
>>> Spark must be installed locally (so spark-submit will work) but Spark is
>>> only used to launch the PredictionServer. No Spark job runs for the UR
>>> during query serving.
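>>>
>>> To make that concrete (host, port, and user id here are assumptions), a
>>> query goes straight to the PredictionServer over HTTP and never touches
>>> Spark:
>>>
>>>     curl -H "Content-Type: application/json" \
>>>       -d '{ "user": "u1" }' http://localhost:8000/queries.json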
>>>
>>> We typically train on a Spark driver machine that is effectively part of
>>> the Spark cluster, and deploy on a server separate from the cluster. This
>>> way the cluster can be stopped when not training and no AWS charges are
>>> incurred.
>>>
>>> So yes you can and often there are good reasons to do so.
>>>
>>> See the Spark overview here: http://actionml.com/docs/intro_to_spark
>>>
>>>
>>> On Mar 27, 2017, at 11:48 PM, Marius Rabenarivo <
>>> mariusrabenarivo@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> For the pio train command, I understand that I can use another machine
>>> with PIO, the Spark driver, master, and worker.
>>>
>>> But is it possible to deploy on a machine without Spark locally
>>> installed, given that spark-submit is used during deployment and
>>>
>>> org.apache.predictionio.workflow.CreateServer
>>>
>>> references a SparkContext?
>>>
>>> I'm using UR v0.4.2 and PredictionIO 0.10.0
>>>
>>> Regards,
>>>
>>> Marius
>>>
>>> P.S. I also posted in the ActionML Google group forum :
>>> https://groups.google.com/forum/#!topic/actionml-user/9yNQgVIODvI
>>>
