From: Kenneth Chan
Date: Thu, 13 Jul 2017 02:43:33 +0000
Subject: Re: Eventserver API in an Engine?
To: user@predictionio.incubator.apache.org

Mars, I totally understand and agree that we should make developers successful, but I would like to understand your problem better before jumping to conclusions.

First, a complete PIO setup has the following:
1. PIO framework layer
2. PIO administration (e.g. PIO app)
3. PIO event server
4. one or more PIO engines

The storage and setup config is applied to 1 globally, and the rest (2, 3, 4) run on top of 1.
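To make the shared layer-1 config concrete, here is a minimal Scala sketch (not actual PIO code; the key names follow the usual pio-env.sh convention and the values are only placeholders) of the one set of storage settings that the event server (3) and every engine (4) would read:

    object SharedStorageConfig extends App {
      // One set of storage settings, defined once at layer 1 and shared by
      // components 2, 3 and 4. Values here are illustrative only.
      val settings: Map[String, String] = Map(
        "PIO_STORAGE_REPOSITORIES_METADATA_SOURCE"  -> "PGSQL",
        "PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE" -> "PGSQL",
        "PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE" -> "PGSQL",
        "PIO_STORAGE_SOURCES_PGSQL_TYPE"            -> "jdbc",
        "PIO_STORAGE_SOURCES_PGSQL_URL"             -> "jdbc:postgresql://localhost/pio"
      )

      settings.foreach { case (k, v) => println(s"$k=$v") }
    }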

My understanding is that the Buildpack takes engine code and then builds, releases, and deploys it, and the result can then serve queries.

When a Heroku user uses the buildpack:
- Where is the event server in the picture?
- How does the user set up the storage config for 1?
- If I use the buildpack to deploy another engine, does it share 1 and 2 above?






On Wed, Jul 12, 2017 at 3:21 PM, Mars Hall <mars@heroku.com> wrote:
The key motivation behind this idea/request is to:

    Simplify baseline PredictionIO deployment, both conceptually & technically.

My vision with this thread is to:

    Enable single-process, single network-listener PredictionIO app deployment
    (i.e. Queries & Events APIs in the same process.)
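For illustration only, here is a rough Scala sketch of what that could look like: one process, one listener, with the /queries.json and /events.json paths of the existing APIs served side by side. akka-http is used purely as an example HTTP layer (it is an assumption here, not a statement about how PIO's servers are built), and the handler bodies are stand-ins, not real engine or event server logic:

    import akka.actor.ActorSystem
    import akka.http.scaladsl.Http
    import akka.http.scaladsl.server.Directives._
    import akka.stream.ActorMaterializer

    object SingleListenerSketch extends App {
      implicit val system = ActorSystem("pio-single-process")
      implicit val materializer = ActorMaterializer()

      // Both APIs behind one port; the JSON responses are placeholders.
      val route =
        path("queries.json") {
          post { complete("""{"result": "prediction would go here"}""") }
        } ~
        path("events.json") {
          post { complete("""{"eventId": "ingestion would go here"}""") }
        }

      // On Heroku, the single routable port arrives as $PORT.
      val port = sys.env.getOrElse("PORT", "8000").toInt
      Http().bindAndHandle(route, "0.0.0.0", port)
    }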


Attempting to address some previous questions & statements…


From Pat Ferrel on Tue, 11 Jul 2017 10:53:48 -0700 (PDT):
> how much of your problem is workflow vs installation vs bundling of APIs? Can you explain it more?

I am focused on deploying PredictionIO on Heroku via this buildpack:
  https://github.com/heroku/predictionio-buildpack

Heroku is an app-centric platform, where each app gets a single routable network port. By default apps get a URL like:
  https://tdx-classi.herokuapp.com (an example PIO Classification engine)

Deploying a separate Eventserver app that must be configured to share storage config & backends leads to all kinds of complexity, especially when a developer unsuspectingly deploys a new engine with a different storage config and doesn't realize that the Eventserver is not simply shareable. Despite a lot of docs & discussion suggesting its share-ability, there is precious little documentation that presents how the multi-backend Storage really works in PIO. (I didn't understand it until I read a bunch of Storage source code.)
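To spell out the mental model I eventually pieced together, here is a sketch in Scala (emphatically not PIO's actual Storage code; the names mirror the conventional PIO_STORAGE_* variables and the backend assignments are just one possible combination): each repository names a source, and each source names a backend type plus its connection settings.

    object MultiBackendSketch extends App {
      final case class Source(backendType: String, settings: Map[String, String])

      // PIO_STORAGE_SOURCES_<NAME>_* : each named source is one backend.
      val sources = Map(
        "ELASTICSEARCH" -> Source("elasticsearch", Map("HOSTS" -> "localhost")),
        "PGSQL"         -> Source("jdbc", Map("URL" -> "jdbc:postgresql://localhost/pio")),
        "LOCALFS"       -> Source("localfs", Map("PATH" -> "/tmp/pio/models"))
      )

      // PIO_STORAGE_REPOSITORIES_<REPO>_SOURCE : each repository picks a source.
      val repositories = Map(
        "METADATA"  -> "ELASTICSEARCH",
        "EVENTDATA" -> "PGSQL",
        "MODELDATA" -> "LOCALFS"
      )

      repositories.foreach { case (repo, src) =>
        println(s"$repo -> $src (${sources(src).backendType})")
      }
    }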


From Kenneth Chan on Tue, 11 Jul 2017 12:49:58 -0700 (PDT):
> For example, one can modify the classification to train a classifier on the same set of data used by recommendation.
…and later on Wed, 12 Jul 2017 13:44:01 -0700:
> My concern of embedding event server in engine is
> - what problem are we solving by providing an illusion that events are only limited for one engine?

This is a great ideal target, but the reality is that it takes some significant design & engineering to reach that level of data share-ability. I'm not suggesting that we do anything to undercut the possibilities of such a distributed architecture. I suggest that we streamline PIO for everyone who is not at that level of distributed architecture. Make PIO not *require* it.

The best example I have is that you can run Spark in local mode, without worrying about any aspect of its ideal distributed purpose. (In fact PredictionIO is built on this feature of Spark!) I don't know the history there, but would imagine Spark was not always so friendly for small or embedded tasks like this.
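For reference, this is the Spark behaviour being pointed at (standard Spark API, nothing PIO-specific): the same program runs entirely in-process on a laptop or on a cluster, purely by changing the master setting.

    import org.apache.spark.{SparkConf, SparkContext}

    object LocalModeExample extends App {
      // local[*] keeps everything in one process; swap in a cluster master
      // URL only when a real cluster is actually needed.
      val conf = new SparkConf()
        .setAppName("local-mode-example")
        .setMaster("local[*]")
      val sc = new SparkContext(conf)

      val total = sc.parallelize(1 to 100).sum()
      println(s"sum = $total")

      sc.stop()
    }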


A huge part of my reality is seeing how many newcomers fumble around and get frustrated. I'm looking at PredictionIO from a very Heroku-style perspective of "how do we help [new] developers be successful", which is probably going to seem like I want to take away capabilities. I just want to make the onramp more graceful!

*Mars

( <> .. <> )
