streams-dev mailing list archives

From Chris Geer <ch...@cxtsoftware.com>
Subject Re: [DISCUSS] Continuing the Momentum
Date Fri, 18 Apr 2014 21:04:37 GMT
Steve, while I agree with what you are saying, I still caution you to limit
the scope of the streams project. There is a big difference between
creating a tool and creating a solution. Streams has the potential to be a
solution for ingesting, aggregating and analyzing activity data (not
limited to activitystrea.ms data). If you make it all about the platform, all
you have is a tool other developers can use to build solutions. I think
there is value in having Streams be a solution (or at least partial
solution). If streams could collect activity data (whatever format), store
it, aggregate it and provide analytics on that data as a package I think
you've won. There could be another activity in the future to pull out some
of the infrastructure code and make another project that was a generic
processing platform that streams happened to use.

One other note is I think if storm is a requirement you are going to limit
your customer base as well.

I will say though that I'm just providing my opinion. I'm not a committer
or PMC member so I don't even really have a vote but as an outside
observer, and someone who's seen projects succeed/fail, these are a few
thoughts.

Chris

On Thu, Apr 17, 2014 at 8:27 PM, Steve Blackmon <steve@blackmon.org> wrote:

> Chris, I think you are right that the group should focus our efforts,
> and that online activities (broadly defined) are the sweet spot.  I
> just wouldn't want to give potential users or contributors the idea
> that Streams is just for ActivityStreams - which I at least associate
> with small data sets.  At least they look small viewed through Jira,
> Jive, and similar tools.  Streams is also a big data processing engine
> which can take advantage of the best features of storm or yarn while
> significantly reducing the learning curve and code complexity of those
> frameworks.
>
> So long as the website makes it clear that activity data is a concept
> and Streams can work regardless of how the data and metadata are
> shaped, I'm cool with "Real-time Processing for Activity Data Streams"
> as a tagline.
>
> Steve
>
> On Thu, Apr 17, 2014 at 8:04 PM, Chris Geer <chris@cxtsoftware.com> wrote:
> > On Thu, Apr 17, 2014 at 9:32 AM, Steve Blackmon <steve@blackmon.org> wrote:
> >
> >> >> Target audience is our potential users.  Technical in nature, but it still
> >> >> needs to be succinct.
> >> >>
> >> >
> >> > Ok, with that said, I think the tag-line should be more feature focused
> >> > because that can hook both the tech guys and business guys.
> >>
> >> Agreed
> >>
> >> > We also need to be careful just using the term "streams" because really
> >> > this isn't a generic stream processor (aka storm), our focus is on Activity
> >> > Streams. Maybe activity streams is a bad descriptor as well and Activity Data
> >> > might be better. "Real-time Processing for Activity Data Streams"???
> >> >
> >>
> >> The engine actually doesn't care whether documents being processed are
> >> activity-related or not:
> >> any JVM object that Jackson can serialize and deserialize works just
> >> fine as a datum.
> >>
> >> I think we can acknowledge that the community has a bias toward
> >> ActivityStreams, but we shouldn't
> >> downplay the flexibility Streams provides.  Focusing only on activity
> >> data in project messaging
> >> undercuts the fact that Streams is a powerful, flexible ESB/ETL
> >> replacement.
> >>
> >
> > My 2-cents for what it's worth. If we don't focus on a niche this won't
> > take off. ESB/ETL systems are a dime-a-dozen and to be really good in that
> > space is a big endeavor. I'm not saying this system couldn't fill some of
> > those needs but I think it's a bad idea to be that broad.
> >
> >>
> >> >>
> >> >>
> >> >> >
> >> >> > >
> >> >> > > ?
> >> >> > >
> >> >> > > On Thu, Apr 17, 2014 at 8:26 AM, Matt Franklin <m.ben.franklin@gmail.com> wrote:
> >> >> > > > On Mon, Apr 14, 2014 at 5:22 PM, Renato Marroquín Mogrovejo <renatoj.marroquin@gmail.com> wrote:
> >> >> > > >
> >> >> > > >> Hi devs,
> >> >> > > >>
> >> >> > > >> Yeah the title was indeed compelling. You got me on that one lol
> >> >> > > >> I think that you guys are right saying that for attracting new people maybe
> >> >> > > >> we should try making the project's goal something more applicable in real
> >> >> > > >> life than just being "a Lightweight server for ActivityStreams".
> >> >> > > >> I liked the simple explanation I heard, maybe it was the pisco but please
> >> >> > > >> correct me if I am wrong, "it's an abstraction layer for stream processing
> >> >> > > >> engines". IMHO we have two things defined:
> >> >> > > >>
> >> >> > > >> MISSION:
> >> >> > > >> 1)  A flexible data processing framework that can run in multiple different
> >> >> > > >> runtimes.  The goal being to abstract platform complexity and allow for
> >> >> > > >> business logic reuse across real-time, enterprise, web and stand-alone
> >> >> > > >> executions.
> >> >> > > >>
> >> >> > > >> This is what needs to be done.
> >> >> > > >
> >> >> > > >
> >> >> > > >> VISION:
> >> >> > > >> 2)  As a proving ground for the adoption of data format standards,
> >> >> > > >> specifically ActivityStreams to start.  The community would work to drive
> >> >> > > >> the adoption and evolution of such standards through real-world experience.
> >> >> > > >>
> >> >> > > >> This is where we would like to get at some time. But also to get more
> >> >> > > >> community engaged, things have to be simple. That is a big issue we still have
> >> >> > > >> over in Gora, and we are trying to solve it through talks, better
> >> >> > > >> tutorials, integration with other projects, and so forth.
> >> >> > > >> Just my 2cents guys.
> >> >> > > >>
> >> >> > > >
> >> >> > > > So what is the tag line that sums up both the mission and the vision?
> >> >> > > >
> >> >> > > >
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> Renato M.
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> 2014-04-14 16:31 GMT+02:00 Matt Franklin <m.ben.franklin@gmail.com>:
> >> >> > > >>
> >> >> > > >> > On Fri, Apr 11, 2014 at 5:01 PM, Steve Blackmon <sblackmon@apache.org> wrote:
> >> >> > > >> >
> >> >> > > >> > > On Thu, Apr 10, 2014 at 4:11 PM, Matt Franklin <m.ben.franklin@gmail.com> wrote:
> >> >> > > >> > > > tl;dr version:
> >> >> > > >> > > >
> >> >> > > >> > > > We need to discuss things on the list more and work to define streams,
> >> >> > > >> > > > update our public presence to support this definition and encourage
> >> >> > > >> > > > additional engagement.
> >> >> > > >> > > >
> >> >> > > >> > > +1, +1, +1
> >> >> > > >> > >
> >> >> > > >> > > > Long version:
> >> >> > > >> > > >
> >> >> > > >> > > > For those of you unaware, Steve Blackmon gave a nice talk on the work he
> >> >> > > >> > > > has been committing to Streams at ApacheCon.  As part of that talk and
> >> >> > > >> > > > follow on discussions, it became clear that we as a community need to do
> >> >> > > >> > > > some serious work to define ourselves, what we are building and why it is
> >> >> > > >> > > > valuable to the industry.
> >> >> > > >> > > >
> >> >> > > >> > > If anyone who missed the presentation wants to see it, I'm happy to
> >> >> > > >> > > host a google hangout to run through it.
> >> >> > > >> > >
> >> >> > > >> >
> >> >> > > >> > Can you post it, or a link to it, on the website too?
> >> >> > > >> >
> >> >> > > >> >
> >> >> > > >> > >
> >> >> > > >> > > > Our website says we are a Lightweight server for ActivityStreams.  While
> >> >> > > >> > > > this is true to some degree, I think recent contributions should refine
> >> >> > > >> > > > this.  The new code is really about supporting flexible processing,
> >> >> > > >> > > > persistence and retrieval of data in multiple runtimes using strongly
> >> >> > > >> > > > typed, normalized data formats like ActivityStreams.  Personally, I think
> >> >> > > >> > > > this slightly new direction is extremely compelling, and the reaction to
> >> >> > > >> > > > Steve's talk seems to support that.  The question remains how does the
> >> >> > > >> > > > community as a whole see the project?  What value is everyone wanting to
> >> >> > > >> > > > get out of this effort?
> >> >> > > >> > > >
> >> >> > > >> > > The session tag-line which attracted ~20 attendees was 'Simplifying
> >> >> > > >> > > Real-Time data integration with Apache Streams.' From talking to
> >> >> > > >> > > coders and data scientists I always hear frustration with how much
> >> >> > > >> > > time they spend writing code and workflow to move bytes around and
> >> >> > > >> > > keep track of their data assets. I'd wager any survey of prominent
> >> >> > > >> > > open-source libraries and popular commercial APIs would have to
> >> >> > > >> > > conclude that schema and interface standards are completely absent
> >> >> > > >> > > or sparsely adopted at many layers.
> >> >> > > >> > >
> >> >> > > >> > > Standards in hardware, operating systems, networks, and relational
> >> >> > > >> > > databases brought about flourishing ecosystems. I believe standards in
> >> >> > > >> > > data interchange such as ActivityStreams can do the same for the
> >> >> > > >> > > social web, but not everyone will embrace standards for the sake of
> >> >> > > >> > > standards. If we can offer integration points to the data sources and
> >> >> > > >> > > repositories businesses want to work with, and demonstrate that
> >> >> > > >> > > Streams can handle 'fire-hose' scale data volumes with arbitrarily
> >> >> > > >> > > many intermediate hand-offs and processing steps on messages in
> >> >> > > >> > > flight, I think we will see adoption from enterprises looking to
> >> >> > > >> > > replace ESB-type systems that can't keep up with the volume of data
> >> >> > > >> > > generated (both inside and outside their networks) that they want to
> >> >> > > >> > > track.  Streams is pretty decent at ETL as well - a function that is
> >> >> > > >> > > never going away, even as the underlying tools best suited to
> >> >> > > >> > > performing it at scale constantly change.
> >> >> > > >> > >
> >> >> > > >> > > This future-state I'm attempting to describe will be a better one for
> >> >> > > >> > > researchers, hobbyists, entrepreneurs, and consumers of web products
> >> >> > > >> > > and services.  Configuration-driven, runtime-platform agnostic,
> >> >> > > >> > > software for real-time data exchange:  where community-driven
> >> >> > > >> > > standards such as Activity Streams can codify and evolve
> >> >> > > >> > > best-practices via running code.  That is a vision that I think will
> >> >> > > >> > > help us generate significant traction going forward.
> >> >> > > >> > >
> >> >> > > >> >
> >> >> > > >> > Just to make sure I am understanding you correctly, you are proposing we
> >> >> > > >> > update the mission of the project to the following:
> >> >> > > >> >
> >> >> > > >> > 1)  A flexible data processing framework that can run in multiple different
> >> >> > > >> > runtimes.  The goal being to abstract platform complexity and allow for
> >> >> > > >> > business logic reuse across real-time, enterprise, web and stand-alone
> >> >> > > >> > executions.
> >> >> > > >> > 2)  As a proving ground for the adoption of data format standards,
> >> >> > > >> > specifically ActivityStreams to start.  The community would work to drive
> >> >> > > >> > the adoption and evolution of such standards through real-world experience.
> >> >> > > >> >
> >> >> > > >> > This sounds great, though it is slightly different than the initially
> >> >> > > >> > proposed functionality.  Personally, I have no objection to that, as what
> >> >> > > >> > you describe encompasses the original goals and expands on them; but, it
> >> >> > > >> > would be good for the rest of the community to weigh in.
> >> >> > > >> >
> >> >> > > >> >
> >> >> > > >> > >
> >> >> > > >> > > > The fact that there are not clear answers (and corresponding documented
> >> >> > > >> > > > statements on the website) to these questions already means we are not
> >> >> > > >> > > > doing a great job of following the Apache Way.  The Apache Way is about the
> >> >> > > >> > > > community and meritocratic, community-based decision making.  The ASF
> >> >> > > >> > > > defines it as follows:
> >> >> > > >> > > >
> >> >> > > >> > > > While there is not an official list, these six principles have been cited
> >> >> > > >> > > > as the core beliefs of philosophy behind the foundation, which is normally
> >> >> > > >> > > > referred to as "The Apache Way":
> >> >> > > >> > > >
> >> >> > > >> > > > collaborative software development
> >> >> > > >> > > >
> >> >> > > >> > > > commercial-friendly standard license
> >> >> > > >> > > >
> >> >> > > >> > > > consistently high quality software
> >> >> > > >> > > >
> >> >> > > >> > > > respectful, honest, technical-based interaction
> >> >> > > >> > > >
> >> >> > > >> > > > faithful implementation of standards
> >> >> > > >> > > >
> >> >> > > >> > > > security as a mandatory feature
> >> >> > > >> > > >
> >> >> > > >> > > > All of the ASF projects share these principles.
> >> >> > > >> > > >
> >> >> > > >> > > > Let's make sure we propose changes to the list, create tickets that support
> >> >> > > >> > > > wider efforts and leverage principles like lazy consensus to keep moving
> >> >> > > >> > > > forward in a way that supports the community.
> >> >> > > >> > > +1, +1, +1
> >> >> > > >> > >
> >> >> > > >> > >
> >> >> > > >> > >
> >> >> > > >> > >
> >> >> > > >> > > --
> >> >> > > >> > > Steve Blackmon
> >> >> > > >> > > sblackmon@apache.org
> >> >> > > >> > >
> >> >> > > >> >
> >> >> > > >>
> >> >> > >
> >> >> >
> >> >>
> >>
>
