streams-dev mailing list archives

From Danny Sullivan <dsulliv...@hotmail.com>
Subject RE: [DISCUSS] Continuing the Momentum
Date Fri, 18 Apr 2014 21:11:24 GMT
"If streams could collect activity data (whatever format), store
it, aggregate it and provide analytics on that data as a package I think
you've won."


+1

I think Chris and I had the same idea at the same time

> Date: Fri, 18 Apr 2014 14:04:37 -0700
> Subject: Re: [DISCUSS] Continuing the Momentum
> From: chris@cxtsoftware.com
> To: dev@streams.incubator.apache.org
> 
> Steve, while I agree with what you are saying, I still caution you to limit
> the scope of the streams project. There is a big difference between
> creating a tool and creating a solution. Streams has the potential to be a
> solution for ingesting, aggregating and analyzing activity data (not
> limited to activitystre.ms data). If you make it all about the platform all
> you have is a tool other developers can use to build solutions. I think
> there is value in having Streams be a solution (or at least partial
> solution). If streams could collect activity data (whatever format), store
> it, aggregate it and provide analytics on that data as a package I think
> you've won. There could be another activity in the future to pull out some
> of the infrastructure code and make another project that was a generic
> processing platform that streams happened to use.
> 
> One other note is I think if storm is a requirement you are going to limit
> your customer base as well.
> 
> I will say though that I'm just providing my opinion. I'm not a committer
> or PMC member so I don't even really have a vote but as an outside
> observer, and someone who's seen projects succeed/fail, these are a few
> thoughts.
> 
> Chris
> 
> On Thu, Apr 17, 2014 at 8:27 PM, Steve Blackmon <steve@blackmon.org> wrote:
> 
> > Chris, I think you are right that the group should focus our efforts,
> > and that online activities (broadly defined) are the sweet spot.  I
> > just wouldn't want to give potential users or contributors the idea
> > that Streams is just for ActivityStreams - which I at least associate
> > with small data sets.  At least they look small viewed through Jira,
> > Jive, and similar tools.  Streams is also a big data processing engine
> > which can take advantage of the best features of storm or yarn while
> > significantly reducing the learning curve and code complexity of those
> > frameworks.
> 
> 
> > So long as the website makes it clear that activity data is a concept
> > and Streams can work regardless of how the data and metadata are
> > shaped, I'm cool with "Real-time Processing for Activity Data Streams"
> > as a tagline.
> >
> > Steve
> >
> > On Thu, Apr 17, 2014 at 8:04 PM, Chris Geer <chris@cxtsoftware.com> wrote:
> > > On Thu, Apr 17, 2014 at 9:32 AM, Steve Blackmon <steve@blackmon.org> wrote:
> > >
> > >> >> Target audience is our potential users.  Technical in nature, but it
> > >> >> still needs to be succinct.
> > >> >>
> > >> >
> > >> > Ok, with that said, I think the tag-line should be more feature focused
> > >> > because that can hook both the tech guys and business guys.
> > >>
> > >> Agreed
> > >>
> > >> > We also need to be careful just using the term "streams" because
> > >> > really this isn't a generic stream processor (aka storm), our focus
> > >> > is on Activity Streams. Maybe activity streams is a bad descriptor
> > >> > as well and Activity Data might be better. "Real-time Processing
> > >> > for Activity Data Streams"???
> > >> >
> > >>
> > >> The engine actually doesn't care whether documents being processed are
> > >> activity-related or not:
> > >> any JVM object that jackson can serialize and deserialize works just
> > >> fine as datums.
> > >>
> > >> I think we can acknowledge that the community has a bias toward
> > >> ActivityStreams, but we shouldn't
> > >> downplay the flexibility Streams provides.  Focusing only on activity
> > >> data in project messaging
> > >> undercuts the fact that Streams is a powerful, flexible ESB/ETL
> > >> replacement.
> > >>
> > >
> > > My 2-cents for what it's worth. If we don't focus on a niche this won't
> > > > take off. ESB/ETL systems are a dime-a-dozen and to be really good in
> > > > that space is a big endeavor. I'm not saying this system couldn't fill
> > > > some of those needs but I think it's a bad idea to be that broad.
> > >
> > >>
> > >> >>
> > >> >>
> > >> >> >
> > >> >> > >
> > >> >> > > ?
> > >> >> > >
> > >> >> > > On Thu, Apr 17, 2014 at 8:26 AM, Matt Franklin <m.ben.franklin@gmail.com> wrote:
> > >> >> > > > On Mon, Apr 14, 2014 at 5:22 PM, Renato Marroquín Mogrovejo <renatoj.marroquin@gmail.com> wrote:
> > >> >> > > >
> > >> >> > > >> Hi devs,
> > >> >> > > >>
> > >> >> > > >> Yeah the title was indeed compelling. You got me on that one lol
> > >> >> > > >> I think that you guys are right saying that for attracting new
> > >> >> > > >> people maybe we should try making the project's goal something
> > >> >> > > >> more applicable in real life than just being "a Lightweight
> > >> >> > > >> server for ActivityStreams".
> > >> >> > > >> I liked the simple explanation I heard, maybe it was the pisco
> > >> >> > > >> but please correct me if I am wrong, "it's an abstraction layer
> > >> >> > > >> for stream processing engines". IMHO we have two things defined:
> > >> >> > > >>
> > >> >> > > >> MISSION:
> > >> >> > > >> 1)  A flexible data processing framework that can run in multiple
> > >> >> > > >> different runtimes.  The goal being to abstract platform
> > >> >> > > >> complexity and allow for business logic reuse across real-time,
> > >> >> > > >> enterprise, web and stand-alone executions.
> > >> >> > > >>
> > >> >> > > >> This is what needs to be done.
> > >> >> > > >
> > >> >> > > >
> > >> >> > > >> VISION:
> > >> >> > > >> 2)  As a proving ground for the adoption of data format
> > >> >> > > >> standards, specifically ActivityStreams to start.  The community
> > >> >> > > >> would work to drive the adoption and evolution of such standards
> > >> >> > > >> through real-world experience.
> > >> >> > > >>
> > >> >> > > >> This is where we would like to get at some time. But also to get
> > >> >> > > >> more community engaged, things have to be simple. That is a big
> > >> >> > > >> issue we still have over in Gora, and we are trying to solve it
> > >> >> > > >> through talks, better tutorials, integration with other projects,
> > >> >> > > >> and so forth.
> > >> >> > > >> Just my 2cents guys.
> > >> >> > > >>
> > >> >> > > >
> > >> >> > > > So what is the tag line that sums up both the mission and the vision?
> > >> >> > > >
> > >> >> > > >
> > >> >> > > >>
> > >> >> > > >>
> > >> >> > > >> Renato M.
> > >> >> > > >>
> > >> >> > > >>
> > >> >> > > >> 2014-04-14 16:31 GMT+02:00 Matt Franklin <m.ben.franklin@gmail.com>:
> > >> >> > > >>
> > >> >> > > >> > On Fri, Apr 11, 2014 at 5:01 PM, Steve Blackmon <sblackmon@apache.org> wrote:
> > >> >> > > >> >
> > >> >> > > >> > > On Thu, Apr 10, 2014 at 4:11 PM, Matt Franklin <m.ben.franklin@gmail.com> wrote:
> > >> >> > > >> > > > tl;dr version:
> > >> >> > > >> > > >
> > >> >> > > >> > > > We need to discuss things on the list more and work to define
> > >> >> > > >> > > > streams, update our public presence to support this definition
> > >> >> > > >> > > > and encourage additional engagement.
> > >> >> > > >> > > >
> > >> >> > > >> > > +1, +1, +1
> > >> >> > > >> > >
> > >> >> > > >> > > > Long version:
> > >> >> > > >> > > >
> > >> >> > > >> > > > For those of you unaware, Steve Blackmon gave a nice talk on the
> > >> >> > > >> > > > work he has been committing to Streams at ApacheCon.  As part of
> > >> >> > > >> > > > that talk and follow on discussions, it became clear that we as a
> > >> >> > > >> > > > community need to do some serious work to define ourselves, what
> > >> >> > > >> > > > we are building and why it is valuable to the industry.
> > >> >> > > >> > > >
> > >> >> > > >> > > If anyone who missed the presentation wants to see it, I'm happy
> > >> >> > > >> > > to host a google hangout to run through it.
> > >> >> > > >> > >
> > >> >> > > >> >
> > >> >> > > >> > Can you post it, or a link to it, on the website too?
> > >> >> > > >> >
> > >> >> > > >> >
> > >> >> > > >> > >
> > >> >> > > >> > > > Our website says we are a Lightweight server for ActivityStreams.  While
> > >> >> > > >> > > > this is true to some degree, I think recent contributions should refine
> > >> >> > > >> > > > this.  The new code is really about supporting flexible processing,
> > >> >> > > >> > > > persistence and retrieval of data in multiple runtimes using strongly
> > >> >> > > >> > > > typed, normalized data formats like ActivityStreams.  Personally, I think
> > >> >> > > >> > > > this slightly new direction is extremely compelling, and the reaction to
> > >> >> > > >> > > > Steve's talk seems to support that.  The question remains how does the
> > >> >> > > >> > > > community as a whole see the project?  What value is everyone wanting to
> > >> >> > > >> > > > get out of this effort?
> > >> >> > > >> > > >
> > >> >> > > >> > > The session tag-line which attracted ~20 attendees was 'Simplifying
> > >> >> > > >> > > Real-Time data integration with Apache Streams.' From talking to
> > >> >> > > >> > > coders and data scientists I always hear frustration with how much
> > >> >> > > >> > > time they spend writing code and workflow to move bytes around and
> > >> >> > > >> > > keep track of their data assets. I'd wager any survey of prominent
> > >> >> > > >> > > open-source libraries and popular commercial APIs would have to
> > >> >> > > >> > > conclude that schema and interface standards are completely absent
> > >> >> > > >> > > or sparsely adopted at many layers.
> > >> >> > > >> > >
> > >> >> > > >> > > Standards in hardware, operating systems, networks, and relational
> > >> >> > > >> > > databases brought about flourishing ecosystems. I believe standards in
> > >> >> > > >> > > data interchange such as ActivityStreams can do the same for the
> > >> >> > > >> > > social web, but not everyone will embrace standards for the sake of
> > >> >> > > >> > > standards. If we can offer integration points to the data sources and
> > >> >> > > >> > > repositories businesses want to work with, and demonstrate that
> > >> >> > > >> > > Streams can handle 'fire-hose' scale data volumes with arbitrarily
> > >> >> > > >> > > many intermediate hand-offs and processing steps on messages in
> > >> >> > > >> > > flight, I think we will see adoption from enterprises looking to
> > >> >> > > >> > > replace ESB-type systems that can't keep up with the volume of data
> > >> >> > > >> > > generated (both inside and outside their networks) that they want to
> > >> >> > > >> > > track.  Streams is pretty decent at ETL as well - a function that is
> > >> >> > > >> > > never going away, even as the underlying tools best suited to
> > >> >> > > >> > > performing it at scale constantly change.
> > >> >> > > >> > >
> > >> >> > > >> > > This future-state I'm attempting to describe will be a better one for
> > >> >> > > >> > > researchers, hobbyists, entrepreneurs, and consumers of web products
> > >> >> > > >> > > and services.  Configuration-driven, runtime-platform agnostic,
> > >> >> > > >> > > software for real-time data exchange: where community-driven
> > >> >> > > >> > > standards such as Activity Streams can codify and evolve
> > >> >> > > >> > > best-practices via running code.  That is a vision that I think will
> > >> >> > > >> > > help us generate significant traction going forward.
> > >> >> > > >> > >
> > >> >> > > >> >
> > >> >> > > >> > Just to make sure I am understanding you correctly, you are proposing we
> > >> >> > > >> > update the mission of the project to the following:
> > >> >> > > >> >
> > >> >> > > >> > 1)  A flexible data processing framework that can run in multiple
> > >> >> > > >> > different runtimes.  The goal being to abstract platform complexity and
> > >> >> > > >> > allow for business logic reuse across real-time, enterprise, web and
> > >> >> > > >> > stand-alone executions.
> > >> >> > > >> > 2)  As a proving ground for the adoption of data format standards,
> > >> >> > > >> > specifically ActivityStreams to start.  The community would work to drive
> > >> >> > > >> > the adoption and evolution of such standards through real-world
> > >> >> > > >> > experience.
> > >> >> > > >> >
> > >> >> > > >> > This sounds great, though it is slightly different than the initially
> > >> >> > > >> > proposed functionality.  Personally, I have no objection to that, as what
> > >> >> > > >> > you describe encompasses the original goals and expands on them; but, it
> > >> >> > > >> > would be good for the rest of the community to weigh in.
> > >> >> > > >> >
> > >> >> > > >> >
> > >> >> > > >> > >
> > >> >> > > >> > > > The fact that there are not clear answers (and corresponding documented
> > >> >> > > >> > > > statements on the website) to these questions already means we are not
> > >> >> > > >> > > > doing a great job of following the Apache Way.  The Apache Way is about
> > >> >> > > >> > > > the community and meritocratic, community-based decision making.  The ASF
> > >> >> > > >> > > > defines it as follows:
> > >> >> > > >> > > >
> > >> >> > > >> > > > While there is not an official list, these six principles have been cited
> > >> >> > > >> > > > as the core beliefs of philosophy behind the foundation, which is
> > >> >> > > >> > > > normally referred to as "The Apache Way":
> > >> >> > > >> > > >
> > >> >> > > >> > > > collaborative software development
> > >> >> > > >> > > >
> > >> >> > > >> > > > commercial-friendly standard license
> > >> >> > > >> > > >
> > >> >> > > >> > > > consistently high quality software
> > >> >> > > >> > > >
> > >> >> > > >> > > > respectful, honest, technical-based interaction
> > >> >> > > >> > > >
> > >> >> > > >> > > > faithful implementation of standards
> > >> >> > > >> > > >
> > >> >> > > >> > > > security as a mandatory feature
> > >> >> > > >> > > >
> > >> >> > > >> > > > All of the ASF projects share these principles.
> > >> >> > > >> > > >
> > >> >> > > >> > > > Let's make sure we propose changes to the list, create tickets that
> > >> >> > > >> > > > support wider efforts and leverage principles like lazy consensus to keep
> > >> >> > > >> > > > moving forward in a way that supports the community.
> > >> >> > > >> > > +1, +1, +1
> > >> >> > > >> > >
> > >> >> > > >> > >
> > >> >> > > >> > >
> > >> >> > > >> > >
> > >> >> > > >> > > --
> > >> >> > > >> > > Steve Blackmon
> > >> >> > > >> > > sblackmon@apache.org
> > >> >> > > >> > >
> > >> >> > > >> >
> > >> >> > > >>
> > >> >> > >
> > >> >> >
> > >> >>
> > >>
> >
 		 	   		  