predictionio-user mailing list archives

From Bruno LEBON <b.le...@redfakir.fr>
Subject Re: Can I train and deploy on different machine
Date Thu, 30 Mar 2017 14:39:39 GMT
From further testing, I can confirm that the deploy is made automatically
once the train is done.

2017-03-30 9:47 GMT+02:00 Bruno LEBON <b.lebon@redfakir.fr>:

> "Spark local setup is done in the Spark conf, it has nothing to do with
> PIO setup.  "
>
> Hi Pat,
>
> So when you say the above, which files are you referring to? The "masters"
> and "slaves" files? Should I put localhost in those files instead of the
> DNS names I configured in /etc/hosts?
> Once this is done, I'll be able to launch
> "nohup pio deploy --ip 0.0.0.0 --port 8001 --event-server-port 7070
> --feedback --accesskey 4o4Te0AzGMYsc1m0nCgaGckl0vLHfQfYIALPleFKDXoQxKpUji2RF3LlpDc7rsVd
> -- --driver-memory 1G > /dev/null 2>&1 &"
> with my Spark cluster off?
>
> Also, I have the feeling that once the train is done, the new model is
> automatically deployed. Is that so? In the E-Commerce Recommendation
> template, the log explicitly said that the model was being deployed,
> whereas in the Universal Recommender the log doesn't mention any automatic
> deploy right after the train is done.
>
>
>
>
> 2017-03-29 21:25 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
>
>> The machine running the PredictionServer should not be configured to
>> connect to the Spark cluster.
>>
>> This is why I explained that we use a machine for training that is a
>> Spark cluster “driver” machine. The driver machine connects to the Spark
>> cluster but the PredictionServer should not.
>>
>> The PredictionServer should have default config that does not know how to
>> connect to the Spark cluster. In this case it will default to running
>> spark-submit to launch with MASTER=local, which puts Spark in the same
>> process with the PredictionServer and you will not get the cluster error.
>> Note that the PredictionServer should be configured to know how to connect
>> to Elasticsearch, HBase, and optionally HDFS; only Spark needs to be
>> local. Note also that no config in pio-env.sh needs to change: Spark local
>> setup is done in the Spark conf and has nothing to do with PIO setup.
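>>
>> For example, here is a minimal sketch of $SPARK_HOME/conf/spark-defaults.conf
>> on the PredictionServer (the memory value is only a placeholder):
>>
>>     # run Spark in-process with the PredictionServer, no cluster master
>>     spark.master         local[*]
>>     spark.driver.memory  1g
>>
>> If no master is configured anywhere, spark-submit will default to local
>> mode anyway.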
>>
>> After running `pio build` and `pio train`, copy the UR directory to *the
>> same location* on the PredictionServer. Then, with Spark set up to be
>> local on the PredictionServer machine, run `pio deploy`. From then on, if
>> you do not change `engine.json`, you will have newly trained models
>> hot-swapped into all PredictionServers running the UR.
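>>
>> As a sketch of the whole flow (hostnames and paths are only examples, use
>> whatever matches your setup):
>>
>>     # on the Spark driver machine
>>     cd /usr/local/universal-recommender
>>     pio build
>>     pio train
>>
>>     # copy the engine directory to the same path on the PredictionServer
>>     rsync -a /usr/local/universal-recommender/ pserver:/usr/local/universal-recommender/
>>
>>     # on the PredictionServer, where Spark is left in local mode
>>     cd /usr/local/universal-recommender
>>     pio deploy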
>>
>>
>> On Mar 29, 2017, at 11:57 AM, Marius Rabenarivo <
>> mariusrabenarivo@gmail.com> wrote:
>>
>> Let me be more explicit.
>>
>> What I want is to not use the host where the PredictionServer will run as
>> a slave in the Spark cluster.
>>
>> When I do this I get an "Initial job has not accepted any resources" error
>> message.
>>
>> 2017-03-29 22:18 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
>>
>>> yes
>>>
>>> My answer below was needlessly verbose.
>>>
>>>
>>> On Mar 28, 2017, at 8:41 AM, Marius Rabenarivo <
>>> mariusrabenarivo@gmail.com> wrote:
>>>
>>> But I want to run the driver outside the server where I'll run the
>>> PredictionServer, since Spark will only be used for launching there.
>>>
>>> Is it possible to run the driver outside the host where I'll deploy the
>>> engine? I mean for the deployment step.
>>>
>>> I'm reading the Spark documentation right now to get insight into how I
>>> can do it, but I want to know if someone has tried something similar.
>>>
>>> 2017-03-28 19:34 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
>>>
>>>> Spark must be installed locally (so spark-submit will work) but Spark
>>>> is only used to launch the PredictionServer. No job is run on Spark for the
>>>> UR during query serving.
>>>>
>>>> We typically train on a Spark driver machine that is effectively part of
>>>> the Spark cluster, and deploy on a server separate from the Spark
>>>> cluster. This is so the cluster can be stopped when not training and no
>>>> AWS charges are incurred.
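>>>>
>>>> For example (the master URL and memory settings are only placeholders),
>>>> on the driver machine the train can be pointed at the cluster explicitly:
>>>>
>>>>     pio train -- --master spark://spark-master:7077 --driver-memory 4g
>>>>
>>>> while the PredictionServer keeps its default local Spark config, so the
>>>> cluster can be shut down between trainings.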
>>>>
>>>> So yes, you can, and there are often good reasons to do so.
>>>>
>>>> See the Spark overview here: http://actionml.com/docs/intro_to_spark
>>>>
>>>>
>>>> On Mar 27, 2017, at 11:48 PM, Marius Rabenarivo <
>>>> mariusrabenarivo@gmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> For the pio train command, I understand that I can use another machine
>>>> with PIO, the Spark driver, master, and worker.
>>>>
>>>> But is it possible to deploy on a machine without Spark installed
>>>> locally, given that spark-submit is used during deployment and
>>>>
>>>> org.apache.predictionio.workflow.CreateServer
>>>>
>>>> references a SparkContext?
>>>>
>>>> I'm using UR v0.4.2 and PredictionIO 0.10.0
>>>>
>>>> Regards,
>>>>
>>>> Marius
>>>>
>>>> P.S. I also posted in the ActionML Google group forum :
>>>> https://groups.google.com/forum/#!topic/actionml-user/9yNQgVIODvI
>>>>
>>>>
>>>
>>>
>>
>>
>
