Date: Mon, 17 Oct 2016 10:36:37 -0500
From: sblackmon
To: dev@streams.incubator.apache.org
Cc: Matt Franklin
Subject: Distribution / Docker next steps
X-Mailer: Airmail (382)
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="5804f005_2d05add2_d81"
--5804f005_2d05add2_d81
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
On October 11, 2016 at 11:01:18 AM, Matt Franklin (m.ben.franklin@gmail.com) wrote:
On Mon, Oct 10, 2016 at 11:30 AM sblackmon wrote:
> Some other projects are currently looking at publishing docker containers
> that people can easily extend. I am totally in favor of this approach.
>
> Docker distribution would open up a lot of cool options for this project.
>
> Which projects are farthest along this road?
>
https://hub.docker.com/r/apache/
I had been thinking more along the lines of publishing a distribution for
each provider, processor, and persister module, each containing a minimal
uber-jar. Going this route would probably warrant a dedicated organization
for streams. OTOH, if we get to the point of having a binary distribution
containing all of the classes in streams-project, that could be published
to a top-level /apache repository and do all of the same work (probably
with a much larger docker image).
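To make the per-module idea a bit more concrete, here is a rough sketch of a helper that prints a minimal Dockerfile for a single module's uber-jar. Nothing here is an existing project convention: the function name, the openjdk:8-jre base image, the /opt/streams and /data paths, and the example jar name are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: print a minimal Dockerfile for one module's uber-jar.
# The base image, the /opt/streams and /data paths, and the jar name passed
# in are assumptions for illustration, not project conventions.
module_dockerfile() {
  jar_name="$1"   # file name of the uber-jar built for a single module
  cat <<EOF
FROM openjdk:8-jre
COPY ${jar_name} /opt/streams/${jar_name}
WORKDIR /data
ENTRYPOINT ["java", "-cp", "/opt/streams/${jar_name}"]
EOF
}

# Example: hypothetical jar name for the twitter provider module.
module_dockerfile "streams-provider-twitter-uber.jar"
```

With an entrypoint like this, the provider class and its arguments (for example TwitterTimelineProvider with application.conf and statuses.txt, as in the sbt session quoted further down) would be appended to the docker run command, with the working directory mounted at /data. A single image for all of streams-project could use the same shape with a larger jar.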
>
> I think even publishing this as a Docker file example on the website would
> be a good start.
>
> These PRs use a maven docker plugin during verify phase.
> https://github.com/apache/incubator-streams-examples/pull/14
> https://github.com/apache/incubator-streams/pull/288
>
> The same plugin can build, tag, and deploy images with goals docker:build
> and docker:push.
>
Per policy, the only thing that should make it to repositories like Docker
hub and Maven Central should be released convenience binaries.
I think the next step is to figure out what would need to happen to build,
certify, and publish a convenience binary and docker image for (initially)
just one individual provider module in an upcoming release. The dependency
tree for a single provider will be more tractable than for the whole
project, and there's a clear user benefit: a greatly simplified project
tutorial.
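One piece of that could be a guard in whatever script drives the publish step, so that only released versions ever reach Docker Hub. The sketch below is an assumption about how such a guard might look, not something that exists in the build today: the function names and the module directory argument are hypothetical, and it only prints the Maven command (using the docker:build and docker:push goals mentioned earlier in the thread) rather than running it.

```shell
#!/bin/sh
# Sketch only: treat any empty or -SNAPSHOT version as "not a release".
is_release_version() {
  case "$1" in
    ""|*-SNAPSHOT) return 1 ;;
    *) return 0 ;;
  esac
}

# Print the Maven command that would build and push the image for one
# module; it is never executed here. The module directory is a
# hypothetical path supplied by the caller.
plan_publish() {
  module_dir="$1"
  version="$2"
  if is_release_version "$version"; then
    echo "mvn -f ${module_dir}/pom.xml docker:build docker:push"
  else
    echo "refusing to publish ${version}: not a released version" >&2
    return 1
  fi
}
```

Called with a SNAPSHOT version such as 0.4-incubating-SNAPSHOT, plan_publish prints a refusal to stderr and returns non-zero, which keeps the policy check in one place while the certification steps are worked out.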
>
> Once these merge I'll take another pass through the examples documentation
> and for each describe a few alternative processes (STREAMS-428)
>
> 1) Build from source, run stream from *nix shell with dist uber-jar.
> 2) Run stream with sbt interactive shell using artifacts from maven central
> 3) Run stream with docker using artifacts from docker hub
>
> On October 10, 2016 at 8:09:45 AM, Matt Franklin (m.ben.franklin@gmail.com)
> wrote:
>
> On Thu, Oct 6, 2016 at 2:56 PM sblackmon wrote:
>
> > TL;DR I've found a way to dramatically reduce barriers to using streams
> > as a beginner.
> >
> > Using the streams 0.3 release, it's quite a headache for a novice to use
> > streams. We have a tutorial on the website, but it's quite a journey. You
> > have to check out all three repos and install them each in order before
> > you get a jar file you could use to get data, then you can run a few
> > pre-canned streams, and those are intermediate, not beginner, level.
> >
> > In an ideal world, anyone would be able to yum or apt-get (or docker
> > pull) individual providers or processors and run them on their own
> > without building from source or composing them into multi-step streams.
> >
> > We'd have to increase our build and compliance complexity significantly
> > to publish official binaries. So what can we do to drop the learning
> > curve precipitously without doing that?
> >
>
> Some other projects are currently looking at publishing docker containers
> that people can easily extend. I am totally in favor of this approach.
>
> >
> > Providers are really simple to run. The hard part is getting all of the
> > right classes and configuration properties into a JVM. Inspired by how
> > zeppelin's %dep interpreter reduces the friction in composing and running
> > a scala notebook, I wanted to find a way to get the same ability from a
> > linux shell.
> >
> > The commands below go from just a java installation to flat files of
> > twitter data in just a few minutes.
> >
> > I think until we have binary distributions, this is how our tutorials
> > should tell the world to get started with streams.
> >
> > Thoughts?
> >
>
> I think even publishing this as a Docker file example on the website would
> be a good start.
>
> >
> > -----
> >
> > # install sbtx
> >
> > curl -s https://raw.githubusercontent.com/paulp/sbt-extras/master/sbt > /usr/bin/sbtx && chmod 0755 /usr/bin/sbtx
> >
> > # create a workspace
> >
> > mkdir twitter-test; cd twitter-test;
> >
> > # supply a config file with credentials
> >
> > cat > application.conf << EOF
> > twitter {
> >   oauth {
> >     consumerKey = ""
> >     consumerSecret = ""
> >     accessToken = ""
> >     accessTokenSecret = ""
> >   }
> >   retrySleepMs = 5000
> >   retryMax = 250
> >   info = [
> >     18055613
> >   ]
> > }
> > EOF
> >
> > sbtx -210 -sbt-create
> >
> > set resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
> >
> > set libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"
> >
> > set fork := true
> >
> > run-main org.apache.streams.twitter.provider.TwitterUserInformationProvider application.conf users.txt
> >
> > run-main org.apache.streams.twitter.provider.TwitterTimelineProvider application.conf statuses.txt
> >
> > set javaOptions += "-Dtwitter.endpoint=friends"
> >
> > run-main org.apache.streams.twitter.provider.TwitterFollowingProvider application.conf friends.txt
> >
> > set javaOptions += "-Dtwitter.endpoint=followers"
> >
> > exit
> >
> > ls -l
> >
> > Steves-MacBook-Pro-3:twitter sblackmon$ ls -l
> > -rw-r--r--@ 1 sblackmon staff     356 Oct 6 11:54 application.conf
> > -rw-r--r--  1 sblackmon staff  293780 Oct 6 13:42 followers.txt
> > -rw-r--r--  1 sblackmon staff    6260 Oct 6 13:43 friends.txt
> > drwxr-xr-x  3 sblackmon staff     102 Oct 6 10:17 project
> > -rw-r--r--  1 sblackmon staff 3339460 Oct 6 13:43 statuses.txt
> > drwxr-xr-x  6 sblackmon staff     204 Oct 6 10:19 target
> > -rw-r--r--  1 sblackmon staff    3321 Oct 6 13:43 users.txt
> >
--5804f005_2d05add2_d81--