Date: Fri, 14 Oct 2016 12:01:06 -0500
From: sblackmon
To: Trevor Grant
Cc: dev@streams.incubator.apache.org
Reply-To: dev@streams.incubator.apache.org
Mailing-List: contact dev-help@streams.incubator.apache.org; run by ezmlm
Subject: Re: Ease-of-use : minimizing TTHW (time-to-hello-world)
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="58010f52_41e7d84b_d81"

--58010f52_41e7d84b_d81
Content-Type: text/plain; charset="utf-8"

Trevor,

Awesome, thanks for giving it a shot.

With some recent changes we’re quite close to making data collection with
streams providers turnkey for new users.

I ran the following through my deployment of Zeppelin - it should work for
you too. Please confirm :)

Cheers,
Steve

-----

%dep
z.reset()
z.addRepo("apache-snapshots").url("https://repository.apache.org/content/repositories/snapshots").snapshot()
z.load("org.apache.streams:streams-provider-twitter:0.4-incubating-SNAPSHOT")

// Run the rest in a separate Scala paragraph, after the %dep paragraph.
import com.typesafe.config._
import org.apache.streams.config._
import org.apache.streams.core._
import java.util.Iterator
import org.apache.streams.twitter.pojo._
import org.apache.streams.twitter.provider._

// Fill in your own Twitter OAuth credentials before running.
val hocon = s"""
    twitter {
      oauth {
        consumerKey = ""
        consumerSecret = ""
        accessToken = ""
        accessTokenSecret = ""
      }
      retrySleepMs = 5000
      retryMax = 250
      info = [
        18055613
      ]
    }
"""
val typesafe = ConfigFactory.parseString(hocon)
val config = new ComponentConfigurator(classOf[TwitterUserInformationConfiguration]).detectConfiguration(typesafe, "twitter")
val provider = new TwitterTimelineProvider(config)

provider.prepare(null)
provider.startStream()
// Wait until the provider has finished collecting.
while (provider.isRunning()) Thread.sleep(1000)

val resultSet = provider.readCurrent()
resultSet.size()

// Print each collected document.
val iterator = resultSet.iterator()
while (iterator.hasNext()) {
    val datum = iterator.next()
    println(datum.getDocument)
}

On October 14, 2016 at 8:33:55 AM, Trevor Grant (trevor.d.grant@gmail.com) wrote:

I agree a minimal TTHW would be good, especially for a user who is trying to
create a hello world.

I am a big fan of Apache Zeppelin notebooks for this sort of thing: easy to
host and include Markdown.

If I could get some community assistance getting myself started, I'd be
happy to write it up.

I need to know:

Minimum dependencies-
From the little work I have done so far I know this can be a murky
subject as we migrate versions. I'd prefer to do the minimal example in
whatever version can be run based on artifacts sitting in maven now. Happy
to update when a new version is pushed.

Scala-
Zeppelin is for all intents and purposes like running in the Spark/Flink
shell. I'll need some help getting things going in this sort of env.

If someone reading this is like "oh that's easy, here's your dependencies,
and then run this code", that would be very helpful, I can get to writing
right away. Otherwise I can hack it out, but again will need some support.

tg

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things." -Virgil*

On Tue, Oct 11, 2016 at 11:00 AM, Matt Franklin wrote:

> On Mon, Oct 10, 2016 at 11:30 AM sblackmon wrote:
>
> > Some other projects are currently looking at publishing docker containers
> > that people can easily extend. I am totally in favor of this approach.
> >
> > Docker distribution would open up a lot of cool options for this project.
> >
> > Which projects are farthest along this road?
> >
>
> https://hub.docker.com/r/apache/
>
> >
> > I think even publishing this as a Docker file example on the website would
> > be a good start.
> >
> > These PRs use a maven docker plugin during verify phase.
> > https://github.com/apache/incubator-streams-examples/pull/14
> > https://github.com/apache/incubator-streams/pull/288
> >
> > The same plugin can build, tag, and deploy images with goals docker:build
> > and docker:push.
> >
>
> Per policy, the only thing that should make it to repositories like Docker
> hub and Maven Central should be released convenience binaries.
>
> >
> > Once these merge I’ll take another pass through the examples documentation
> > and for each describe a few alternative processes (STREAMS-428)
> >
> > 1) Build from source, run stream from *nix shell with dist uber-jar.
> > 2) Run stream with sbt interactive shell using artifacts from maven central
> > 3) Run stream with docker using artifacts from docker hub
> >
> > On October 10, 2016 at 8:09:45 AM, Matt Franklin (m.ben.franklin@gmail.com)
> > wrote:
> >
> > On Thu, Oct 6, 2016 at 2:56 PM sblackmon wrote:
> >
> > > TL;DR I’ve found a way to dramatically reduce barriers to using streams
> > > as a beginner.
> > >
> > > Using the streams 0.3 release, it’s quite a headache for a novice to use
> > > streams. We have a tutorial on the website, but it’s quite a journey. You
> > > have to check out all three repos and install them each in order before
> > > you get a jar file you could use to get data, then you can run a few
> > > pre-canned streams, and those are intermediate not beginner level.
> > >
> > > In an ideal world, anyone would be able to yum or apt-get (or docker
> > > pull) individual providers or processors and run them on their own
> > > without building from source or composing them into multi-step streams.
> > >
> > > We'd have to increase our build and compliance complexity significantly
> > > to publish official binaries. So what can we do to drop the learning
> > > curve precipitously without doing that?
> > >
> >
> > Some other projects are currently looking at publishing docker containers
> > that people can easily extend. I am totally in favor of this approach.
> >
> > >
> > > Providers are really simple to run. The hard part is getting all of the
> > > right classes and configuration properties into a JVM. Inspired by how
> > > zeppelin’s %dep interpreter reduces the friction in composing and
> > > running a scala notebook, I wanted to find a way to get the same ability
> > > from a linux shell.
> > >
> > > The commands below go from just a java installation to flat files of
> > > twitter data in just a few minutes.
> > >
> > > I think until we have binary distributions, this is how our tutorials
> > > should tell the world to get started with streams.
> > >
> > > Thoughts?
> > >
> >
> > I think even publishing this as a Docker file example on the website would
> > be a good start.
> >
> > >
> > > -----
> > >
> > > # install sbtx
> > >
> > > curl -s https://raw.githubusercontent.com/paulp/sbt-extras/master/sbt > /usr/bin/sbtx && chmod 0755 /usr/bin/sbtx
> > >
> > > # create a workspace
> > >
> > > mkdir twitter-test; cd twitter-test;
> > >
> > > # supply a config file with credentials
> > >
> > > cat > application.conf << EOF
> > > twitter {
> > >   oauth {
> > >     consumerKey = ""
> > >     consumerSecret = ""
> > >     accessToken = ""
> > >     accessTokenSecret = ""
> > >   }
> > >   retrySleepMs = 5000
> > >   retryMax = 250
> > >   info = [
> > >     18055613
> > >   ]
> > > }
> > > EOF
> > >
> > > sbtx -210 -sbt-create
> > >
> > > set resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
> > >
> > > set libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"
> > >
> > > set fork := true
> > >
> > > run-main org.apache.streams.twitter.provider.TwitterUserInformationProvider application.conf users.txt
> > >
> > > run-main org.apache.streams.twitter.provider.TwitterTimelineProvider application.conf statuses.txt
> > >
> > > set javaOptions += "-Dtwitter.endpoint=friends"
> > >
> > > run-main org.apache.streams.twitter.provider.TwitterFollowingProvider application.conf friends.txt
> > >
> > > set javaOptions += "-Dtwitter.endpoint=followers"
> > >
> > > exit
> > >
> > > ls -l
> > >
> > > Steves-MacBook-Pro-3:twitter sblackmon$ ls -l
> > > -rw-r--r--@ 1 sblackmon staff     356 Oct 6 11:54 application.conf
> > > -rw-r--r--  1 sblackmon staff  293780 Oct 6 13:42 followers.txt
> > > -rw-r--r--  1 sblackmon staff    6260 Oct 6 13:43 friends.txt
> > > drwxr-xr-x  3 sblackmon staff     102 Oct 6 10:17 project
> > > -rw-r--r--  1 sblackmon staff 3339460 Oct 6 13:43 statuses.txt
> > > drwxr-xr-x  6 sblackmon staff     204 Oct 6 10:19 target
> > > -rw-r--r--  1 sblackmon staff    3321 Oct 6 13:43 users.txt
> > >
> >
>

--58010f52_41e7d84b_d81--
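
One way to avoid pasting secrets into the `cat > application.conf` step quoted in the thread is to generate the file from environment variables. The following is only a sketch under stated assumptions: the function name `write_twitter_conf` and the variable names `TWITTER_CONSUMER_KEY`, `TWITTER_CONSUMER_SECRET`, `TWITTER_ACCESS_TOKEN` and `TWITTER_ACCESS_TOKEN_SECRET` are hypothetical and not part of streams. Only the config keys and the example account ID come from the quoted message.

```shell
#!/bin/sh
# Sketch: write application.conf from environment variables.
# The function name and the TWITTER_* variable names are hypothetical;
# the config keys mirror the application.conf quoted in the thread.
write_twitter_conf() {
  conf_path="${1:-application.conf}"
  user_id="${2:-18055613}"   # example account ID used in the thread

  # Refuse to write a config with blank credentials.
  for name in TWITTER_CONSUMER_KEY TWITTER_CONSUMER_SECRET \
              TWITTER_ACCESS_TOKEN TWITTER_ACCESS_TOKEN_SECRET; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done

  # Values are inserted verbatim, so they must not contain double quotes
  # or backslashes.
  cat > "$conf_path" <<EOF
twitter {
  oauth {
    consumerKey = "${TWITTER_CONSUMER_KEY}"
    consumerSecret = "${TWITTER_CONSUMER_SECRET}"
    accessToken = "${TWITTER_ACCESS_TOKEN}"
    accessTokenSecret = "${TWITTER_ACCESS_TOKEN_SECRET}"
  }
  retrySleepMs = 5000
  retryMax = 250
  info = [
    ${user_id}
  ]
}
EOF
}
```

With the four variables exported, `write_twitter_conf application.conf` would produce a file of the same shape as the one in the thread, ready for the `run-main` commands. The function returns non-zero without touching the file if any credential is missing.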