From: Kenneth Chan
Date: Thu, 13 Jul 2017 02:43:33 +0000
Subject: Re: Eventserver API in an Engine?
To: user@predictionio.incubator.apache.org

Mars, I totally understand and agree that we should make developers successful, but I would like to understand your problem better before jumping to conclusions.

First, a complete PIO setup has the following:
1. PIO framework layer
2. PIO administration (e.g. PIO app)
3. PIO event server
4. one or more PIO engines

The storage and setup config is applied to 1 globally, and the rest (2, 3, 4) run on top of 1.
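To make the shared layer-1 config concrete, here is a minimal Scala sketch (not actual PIO code; the key names follow the usual pio-env.sh convention and the values are only placeholders) of the one set of storage settings that the event server (3) and every engine (4) would read:

    object SharedStorageConfig extends App {
      // One set of storage settings, defined once at layer 1 and shared by
      // components 2, 3 and 4. Values here are illustrative only.
      val settings: Map[String, String] = Map(
        "PIO_STORAGE_REPOSITORIES_METADATA_SOURCE"  -> "PGSQL",
        "PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE" -> "PGSQL",
        "PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE" -> "PGSQL",
        "PIO_STORAGE_SOURCES_PGSQL_TYPE"            -> "jdbc",
        "PIO_STORAGE_SOURCES_PGSQL_URL"             -> "jdbc:postgresql://localhost/pio"
      )

      settings.foreach { case (k, v) => println(s"$k=$v") }
    }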

My understanding is that the Buildpack takes engine code and then builds, releases, and deploys it, and the result can then serve queries.

When a Heroku user uses the buildpack:
- Where is the event server in the picture?
- How does the user set up the storage config for 1?
- If I use the buildpack to deploy another engine, does it share 1 and 2 above?






On Wed, Jul 12, 2017 at 3:21 PM, Mars Hall <mars@heroku.com> wrote:
The key motivation behind this idea/request is to:

    Simplify baseline PredictionIO deployment, both conceptually & technically.

My vision with this thread is to:

    Enable single-process, single network-listener PredictionIO app deployment
    (i.e. Queries & Events APIs in the same process.)
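For illustration only, here is a rough Scala sketch of what that could look like: one process, one listener, with the /queries.json and /events.json paths of the existing APIs served side by side. akka-http is used purely as an example HTTP layer (it is an assumption here, not a statement about how PIO's servers are built), and the handler bodies are stand-ins, not real engine or event server logic:

    import akka.actor.ActorSystem
    import akka.http.scaladsl.Http
    import akka.http.scaladsl.server.Directives._
    import akka.stream.ActorMaterializer

    object SingleListenerSketch extends App {
      implicit val system = ActorSystem("pio-single-process")
      implicit val materializer = ActorMaterializer()

      // Both APIs behind one port; the JSON responses are placeholders.
      val route =
        path("queries.json") {
          post { complete("""{"result": "prediction would go here"}""") }
        } ~
        path("events.json") {
          post { complete("""{"eventId": "ingestion would go here"}""") }
        }

      // On Heroku, the single routable port arrives as $PORT.
      val port = sys.env.getOrElse("PORT", "8000").toInt
      Http().bindAndHandle(route, "0.0.0.0", port)
    }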


Attempting to address some previous questions & statements…


From Pat Ferrel on Tue, 11 Jul 2017 10:53:48 -0700 (PDT):
> how much of your problem is workflow vs installation vs bundling of APIs? Can you explain it more?

I am focused on deploying PredictionIO on Heroku via this buildpack:
  https://github.com/heroku/predictionio-buildpack

Heroku is an app-centric platform, where each app gets a single routable network port. By default apps get a URL like:
  https://tdx-classi.herokuapp.com (an example PIO Classification engine)

Deploying a separate Eventserver app that must be configured to share storage config & backends leads to all kinds of complexity, especially when a developer unsuspectingly deploys a new engine with a different storage config and doesn't realize that the Eventserver is not simply shareable. Despite a lot of docs & discussion suggesting its share-ability, there is precious little documentation that presents how the multi-backend Storage really works in PIO. (I didn't understand it until I read a bunch of Storage source code.)
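To spell out the mental model I eventually pieced together, here is a sketch in Scala (emphatically not PIO's actual Storage code; the names mirror the conventional PIO_STORAGE_* variables and the backend assignments are just one possible combination): each repository names a source, and each source names a backend type plus its connection settings.

    object MultiBackendSketch extends App {
      final case class Source(backendType: String, settings: Map[String, String])

      // PIO_STORAGE_SOURCES_<NAME>_* : each named source is one backend.
      val sources = Map(
        "ELASTICSEARCH" -> Source("elasticsearch", Map("HOSTS" -> "localhost")),
        "PGSQL"         -> Source("jdbc", Map("URL" -> "jdbc:postgresql://localhost/pio")),
        "LOCALFS"       -> Source("localfs", Map("PATH" -> "/tmp/pio/models"))
      )

      // PIO_STORAGE_REPOSITORIES_<REPO>_SOURCE : each repository picks a source.
      val repositories = Map(
        "METADATA"  -> "ELASTICSEARCH",
        "EVENTDATA" -> "PGSQL",
        "MODELDATA" -> "LOCALFS"
      )

      repositories.foreach { case (repo, src) =>
        println(s"$repo -> $src (${sources(src).backendType})")
      }
    }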


From Kenneth Chan on Tue, 11 Jul 2017 12:49:58 -0700 (PDT):
> For example, one can modify the classification to train a classifier on the same set of data used by recommendation.
…and later on Wed, 12 Jul 2017 13:44:01 -0700:
> My concern of embedding event server in engine is
> - what problem are we solving by providing an illusion that events are only limited for one engine?

This is a great ideal target, but the reality is that it takes some significant design & engineering to reach that level of data share-ability. I'm not suggesting that we do anything to undercut the possibilities of such a distributed architecture. I suggest that we streamline PIO for everyone who is not at that level of distributed architecture. Make PIO not *require* it.

The best example I have is that you can run Spark in local mode, without worrying about any aspect of its ideal distributed purpose. (In fact PredictionIO is built on this feature of Spark!) I don't know the history there, but would imagine Spark was not always so friendly for small or embedded tasks like this.
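For reference, this is the Spark behaviour being pointed at (standard Spark API, nothing PIO-specific): the same program runs entirely in-process on a laptop or on a cluster, purely by changing the master setting.

    import org.apache.spark.{SparkConf, SparkContext}

    object LocalModeExample extends App {
      // local[*] keeps everything in one process; swap in a cluster master
      // URL only when a real cluster is actually needed.
      val conf = new SparkConf()
        .setAppName("local-mode-example")
        .setMaster("local[*]")
      val sc = new SparkContext(conf)

      val total = sc.parallelize(1 to 100).sum()
      println(s"sum = $total")

      sc.stop()
    }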


A huge part of my reality is seeing how many newcomers fumble around and get frustrated. I'm looking at PredictionIO from a very Heroku-style perspective of "how do we help [new] developers be successful", which is probably going to seem like I want to take away capabilities. I just want to make the onramp more graceful!

*Mars

( <> .. <> )
