From: Pat Ferrel
Subject: Re: Can I train and deploy on different machine
Date: Thu, 30 Mar 2017 11:52:13 -0700
To: Marius Rabenarivo
Cc: user@predictionio.incubator.apache.org, actionml-user

In the thread below I answered this.

"Note that the PredictionServer should be configured to know how to connect to Elasticsearch and HBase and optionally HDFS, only Spark needs to be local. Note also that no config in pio-env.sh needs to change, Spark local setup is done in the Spark conf, it has nothing to do with PIO setup."

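As a rough illustration of the quoted note, a pio-env.sh on the PredictionServer host might carry entries like the following. This is only a sketch: the host names and paths are hypothetical placeholders, and the variable names are the ones I recall from the PredictionIO 0.10 pio-env.sh template, so check them against your own copy. The same settings would apply unchanged on the training machine.

```shell
# pio-env.sh fragment (sketch). Host names and paths are hypothetical.

# Spark only needs to be installed locally so that spark-submit works.
SPARK_HOME=/opt/spark

# Remote Elasticsearch: point PIO at the cluster's hosts and transport ports.
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es-1.example.internal,es-2.example.internal
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,9300

# Remote HBase: the client finds the cluster through the hbase-site.xml
# located in this directory (assumed path).
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
HBASE_CONF_DIR=/etc/hbase/conf
```

Nothing here mentions the Spark cluster, which matches the point that the Spark setup lives in the Spark conf rather than in PIO's.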


On Mar 30, 2017, at 11:14 AM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

For the host where we run the training, do we have to put the path to ES_CONF_DIR and HADOOP_CONF_DIR in pio-env.sh even if we use remote ES and Hadoop clusters?

2017-03-30 22:09 GMT+04:00 Marius Rabenarivo <mariusrabenarivo@gmail.com>:
Replace Haddop by Hadoop in the previous mail

2017-03-30 22:08 GMT+04:00 Marius Rabenarivo <mariusrabenarivo@gmail.com>:
For the host where we run the training, do we have to put the path to ES_CONF_DIR and HADOOP_CONF_DIR in pio-env.sh even if we use remote ES and Haddop clusters?

2017-03-30 21:58 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
To run locally in the same process as pio, delete those files and do not launch Spark as a daemon; only use PIO commands.

We do not “re-deploy”; we hot-swap the model that predictions are made from, so the existing deployment works with the new data automatically and without any downtime.

Re-deploying means stopping the deployed process and restarting it. This is never necessary with the UR unless the engine.json config is changed.

On Mar 30, 2017, at 12:47 AM, Bruno LEBON <b.lebon@redfakir.fr> wrote:

"Spark local setup is done in the Spark conf, it has nothing to do with PIO setup."

Hi Pat,

So when you say the above, which files do you refer to? The "masters" and "slaves" files? Should I put localhost in those files instead of the DNS names I configured in /etc/hosts?
Once this is done, will I be able to launch
"nohup pio deploy --ip 0.0.0.0 --port 8001 --event-server-port 7070 --feedback --accesskey 4o4Te0AzGMYsc1m0nCgaGckl0vLHfQfYIALPleFKDXoQxKpUji2RF3LlpDc7rsVd -- --driver-memory 1G > /dev/null 2>&1 &"
with my Spark cluster off?

Also, I have the feeling that once training is done, the new model is automatically deployed. Is that so? In the E-Commerce Recommendation template, the log explicitly said that the model was being deployed, whereas in the Universal Recommender the log doesn't mention any automatic deploy right after training is done.

 


2017-03-29 21:25 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
The machine running the PredictionServer should not be configured to connect to the Spark cluster.

This is why I explained that we use a machine for training that is a Spark cluster “driver” machine. The driver machine connects to the Spark cluster but the PredictionServer should not.

The PredictionServer should have default config that does not know how to connect to the Spark cluster. In this case it will default to running spark-submit to launch with MASTER=local, which puts Spark in the same process with the PredictionServer and you will not get the cluster error. Note that the PredictionServer should be configured to know how to connect to Elasticsearch and HBase and optionally HDFS, only Spark needs to be local. Note also that no config in pio-env.sh needs to change, Spark local setup is done in the Spark conf, it has nothing to do with PIO setup.
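One way to sanity-check that a host has no cluster master configured is sketched below. It assumes the master would be set through a `spark.master` line in `$SPARK_HOME/conf/spark-defaults.conf`; that file is one standard place for it, but this message does not say which file is meant, and the `/opt/spark` fallback path is a hypothetical placeholder. When no master is configured anywhere, spark-submit falls back to a local master.

```shell
#!/bin/sh
# Sketch: report which Spark master a spark-defaults.conf file selects.
# Only whitespace-separated "key value" lines are handled (assumption);
# "key=value" lines are ignored by this simple parser.

# Print the last configured spark.master value, or nothing if unset.
configured_master() {
  [ -f "$1" ] || return 0
  awk '$1 == "spark.master" { v = $2 } END { if (v != "") print v }' "$1"
}

# Print "local" when no cluster master is configured, else "cluster: <url>".
spark_mode() {
  master="$(configured_master "$1")"
  case "$master" in
    ""|local*) echo "local" ;;
    *)         echo "cluster: $master" ;;
  esac
}

# /opt/spark is a hypothetical default install location.
spark_mode "${SPARK_HOME:-/opt/spark}/conf/spark-defaults.conf"
```

On the PredictionServer the expected answer is `local`; on the training (driver) machine it would name the cluster master.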

After running `pio build` and `pio train`, copy the UR directory to *the same location* on the PredictionServer. Then, with Spark set up to be local, run `pio deploy` on the PredictionServer machine. From then on, if you do not change `engine.json`, you will have newly trained models hot-swapped into all PredictionServers running the UR.
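The steps in this message could be scripted roughly as follows. This is a sketch under assumptions: `UR_DIR` and `PS_HOST` are hypothetical placeholders, copying with rsync over ssh is just one option, and the script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the train-on-driver, deploy-elsewhere flow.
# UR_DIR and PS_HOST are hypothetical placeholders. With DRY_RUN=1
# (the default here) each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
UR_DIR="${UR_DIR:-/opt/universal-recommender}"
PS_HOST="${PS_HOST:-prediction-server.example.internal}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1, on the Spark driver machine: build and train.
train_on_driver() {
  run cd "$UR_DIR" && run pio build && run pio train
}

# Step 2: copy the engine directory to the same path on the PredictionServer.
copy_engine() {
  run rsync -a "$UR_DIR/" "$PS_HOST:$UR_DIR/"
}

# Step 3, on the PredictionServer (Spark local): deploy once.
# pio deploy keeps running in the foreground, so a real setup would
# background it, for example with nohup.
deploy_on_prediction_server() {
  run ssh "$PS_HOST" "cd $UR_DIR && pio deploy --ip 0.0.0.0 --port 8001"
}

train_on_driver
copy_engine
deploy_on_prediction_server
```

After the first deploy, a retrain would repeat only step 1, since the running PredictionServers pick up the new model as long as `engine.json` is unchanged.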


On Mar 29, 2017, at 11:57 AM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

Let me be more explicit.

What I want to do is not use the host where the PredictionServer will run as a slave in the Spark cluster.

When I do this I get an "Initial job has not accepted any resources" error message.

2017-03-29 22:18 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
yes

My answer below was needlessly verbose.


On Mar 28, 2017, at 8:41 AM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

But I want to run the driver outside the server where I'll run the PredictionServer.

As Spark will be used only for launching there.

Is it possible to run the driver outside the host where I'll deploy the engine? I mean for deploying.

I'm reading the Spark documentation right now to get some insight into how I can do it, but I want to know if someone has tried to do something similar.

2017-03-28 19:34 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
Spark must be installed locally (so spark-submit will work) but Spark is only used to launch the PredictionServer. No job is run on Spark for the UR during query serving.

We typically train on a Spark driver machine that is effectively part of the Spark cluster and deploy on a server separate from the Spark cluster. This is so that the cluster can be stopped when not training and no AWS charges are incurred.

So yes, you can, and often there are good reasons to do so.

See the Spark overview here: http://actionml.com/docs/intro_to_spark


On Mar 27, 2017, at 11:48 PM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

Hello,

For the pio train command, I understand that I can use another machine with PIO, Spark Driver, Master and Worker.

But is it possible to deploy on a machine without Spark installed locally, given that deployment uses spark-submit and org.apache.predictionio.workflow.CreateServer references sparkContext?

I'm using UR v0.4.2 and PredictionIO 0.10.0

Regards,

Marius

P.S. I also posted in the ActionML Google group forum: https://groups.google.com/forum/#!topic/actionml-user/9yNQgVIODvI








--
You received this message because you are subscribed to a topic in the Google Groups "actionml-user" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/actionml-user/9yNQgVIODvI/unsubscribe.
To unsubscribe from this group and all its topics, send an email to actionml-user+unsubscribe@googlegroups.com.
To post to this group, send email to actionml-user@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/actionml-user/504FC93C-1A95-4673-BCB2-1C1E8CA0D487%40occamsmachete.com.

For more options, visit https://groups.google.com/d/optout.




