streams-dev mailing list archives

From Matt Franklin <m.ben.frank...@gmail.com>
Subject Re: Ease-of-use : minimizing TTHW (time-to-hello-world)
Date Mon, 10 Oct 2016 13:09:21 GMT
On Thu, Oct 6, 2016 at 2:56 PM sblackmon <sblackmon@apache.org> wrote:

>
>
> TL;DR I’ve found a way to dramatically reduce barriers to using streams as
> a beginner.
>
>
>
> Using the streams 0.3 release, it’s quite a headache for a novice to use
> streams. We have a tutorial on the website, but it’s quite a journey. You
> have to check out all three repos and install them each in order before you
> get a jar file you could use to get data, then you can run a few pre-canned
> streams, and those are intermediate not beginner level.
>
>
>
> In an ideal world, anyone would be able to yum or apt-get (or docker pull)
> individual providers or processors and run them on their own without
> building from source or composing them into multi-step streams.
>
>
>
> We'd have to increase our build and compliance complexity significantly to
> publish official binaries. So what can we do to drop the learning curve
> precipitously without doing that?
>

Some other projects are currently looking at publishing Docker images
that people can easily extend. I am totally in favor of this approach.


>
>
>
> Providers are really simple to run. The hard part is getting all of the
> right classes and configuration properties into a JVM. Inspired by how
> zeppelin’s %dep interpreter reduces the friction in composing and running a
> scala notebook, I wanted to find a way to get the same ability from a linux
> shell.
>
>
>
> The commands below go from just a java installation to flat files of
> twitter data in just a few minutes.
>
>
>
> I think until we have binary distributions, this is how our tutorials
> should tell the world to get started with streams.
>
>
>
> Thoughts?
>

I think even publishing this as a Dockerfile example on the website would
be a good start.


>
>
>
> -----
>
> # install sbtx
>
> curl -s https://raw.githubusercontent.com/paulp/sbt-extras/master/sbt > /usr/bin/sbtx && chmod 0755 /usr/bin/sbtx
>
> # create a workspace
>
> mkdir twitter-test; cd twitter-test;
>
> # supply a config file with credentials
>
> cat > application.conf << EOF
> twitter {
>   oauth {
>     consumerKey = ""
>     consumerSecret = ""
>     accessToken = ""
>     accessTokenSecret = ""
>   }
>   retrySleepMs = 5000
>   retryMax = 250
>   info = [
>     18055613
>   ]
> }
> EOF
>
> # start sbt (Scala 2.10) in a directory that has no sbt project yet
>
> sbtx -210 -sbt-create
>
> # the remaining commands are typed at the sbt prompt
>
> set resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
>
> set libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"
>
> set fork := true
>
> run-main org.apache.streams.twitter.provider.TwitterUserInformationProvider application.conf users.txt
>
> run-main org.apache.streams.twitter.provider.TwitterTimelineProvider application.conf statuses.txt
>
> set javaOptions += "-Dtwitter.endpoint=friends"
>
> run-main org.apache.streams.twitter.provider.TwitterFollowingProvider application.conf friends.txt
>
>
>
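> set javaOptions += "-Dtwitter.endpoint=followers"
>
> run-main org.apache.streams.twitter.provider.TwitterFollowingProvider application.conf followers.txt
>
> exit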
>
>
>
> ls -l
>
>
>
> Steves-MacBook-Pro-3:twitter sblackmon$ ls -l
> -rw-r--r--@ 1 sblackmon staff 356 Oct 6 11:54 application.conf
> -rw-r--r-- 1 sblackmon staff 293780 Oct 6 13:42 followers.txt
> -rw-r--r-- 1 sblackmon staff 6260 Oct 6 13:43 friends.txt
> drwxr-xr-x 3 sblackmon staff 102 Oct 6 10:17 project
> -rw-r--r-- 1 sblackmon staff 3339460 Oct 6 13:43 statuses.txt
> drwxr-xr-x 6 sblackmon staff 204 Oct 6 10:19 target
> -rw-r--r-- 1 sblackmon staff 3321 Oct 6 13:43 users.txt
>
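
As a possible first step toward a Dockerfile example, here is a minimal,
untested sketch of a script that prepares the workspace from the commands
quoted above without launching sbt. The WORKDIR and TWITTER_* environment
variable names and the streams.sbt-commands file name are placeholders
invented for illustration; none of them come from the Streams codebase.

```shell
#!/bin/sh
# Sketch only: prepares the workspace from the quoted session; it does not start sbt.
# WORKDIR, the TWITTER_* variables, and streams.sbt-commands are hypothetical names.
set -e

WORKDIR="${WORKDIR:-twitter-test}"
mkdir -p "$WORKDIR"

# Write the credentials file; a value stays "" when its variable is unset.
cat > "$WORKDIR/application.conf" <<EOF
twitter {
  oauth {
    consumerKey = "${TWITTER_CONSUMER_KEY:-}"
    consumerSecret = "${TWITTER_CONSUMER_SECRET:-}"
    accessToken = "${TWITTER_ACCESS_TOKEN:-}"
    accessTokenSecret = "${TWITTER_ACCESS_TOKEN_SECRET:-}"
  }
  retrySleepMs = 5000
  retryMax = 250
  info = [
    18055613
  ]
}
EOF

# Record the sbt commands from the quoted session, one per line, so they can be
# pasted at the sbt prompt. The quoted 'EOF' keeps the shell from expanding anything.
cat > "$WORKDIR/streams.sbt-commands" <<'EOF'
set resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
set libraryDependencies += "org.apache.streams" % "streams-provider-twitter" % "0.4-incubating-SNAPSHOT"
set fork := true
run-main org.apache.streams.twitter.provider.TwitterUserInformationProvider application.conf users.txt
run-main org.apache.streams.twitter.provider.TwitterTimelineProvider application.conf statuses.txt
EOF

echo "Workspace ready in $WORKDIR; next: cd $WORKDIR && sbtx -210 -sbt-create"
```

Keeping the credentials in environment variables would let the same script
run unchanged inside a container, with the keys supplied at run time rather
than baked into an image.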
