streams-dev mailing list archives

From Benjamin Young <byo...@bigbluehat.com>
Subject Re: Granary & SocketHub
Date Thu, 20 Oct 2016 18:32:32 GMT
Great points, Steven.


What's always attracted me to Apache Streams is its descriptiveness (via JSON Schema documents)
vs. prescriptiveness. Granary's approach is (currently? ;) ) more prescriptive:

https://github.com/snarfed/granary/blob/master/granary/twitter.py

vs.

https://github.com/apache/streams/tree/STREAMS-26/streams-contrib/streams-provider-twitter

...which is mostly (though not all) a collection of .json and .conf files with a handful of
.java files needed (afaict) for last-mile integration with one's tool.


The future I dream about is one where I can pick my tool for my own idiosyncratic language, operating
system, or license reasons, and every tool still works off shared, descriptive "knowledge" documents.


Otherwise, we're all pulling separately, and end up with snowflake systems to process snowflake
APIs. However, I also know it's unlikely everyone will come "under one roof" to work on things.
My hope, though, is that the output of this group (and Granary and Sockethub and...) will
be re-usable by as wide an audience as possible--hence the value of description over prescription
(at least in my book ;) ).


Granted, if I'm barking up the wrong tree (again), I'm happy to wander off...


Is anything in the above sane? ;)


Cheers!

Benjamin

--

http://bigbluehat.com/

http://linkedin.com/in/benjaminyoung

________________________________
From: sblackmon <sblackmon@apache.org>
Sent: Thursday, October 20, 2016 1:26:38 PM
To: dev@streams.incubator.apache.org
Cc: Matt Franklin; Benjamin Young
Subject: Re: Granary & SocketHub

On October 18, 2016 at 6:09:49 PM, Matt Franklin (m.ben.franklin@gmail.com)
wrote:
On Tue, Oct 18, 2016 at 10:39 AM Benjamin Young <byoung@bigbluehat.com>
wrote:

> (resending from the correct account...likely the other got spammed...)
>
> Granary is a project with similar ideas and intents as Apache Streams
> (which also needs AS2 support ;) ):
> https://github.com/snarfed/granary
>

Ryan from Granary is on the list I think.  Hey Ryan!  Cool stuff, too bad it's python :)

> In fact Apache Streams gets a mention in their "Related Work" section:
> https://github.com/snarfed/granary#related-work
>
> Also mentioned in the Granary related work section is SocketHub:
> https://github.com/sockethub/sockethub
>

Cool stuff, too bad it's LGPL :)

> Its aims are similar, but it's reaching way beyond Web-based social APIs
> and "back" to including things like IRC, Email, etc.


Non-SNS data sources are important for sure. I've posted some work on my personal GitHub using
the streams framework to parse MBOX files (https://github.com/steveblackmon/streams-apache)
and to collect quantified-self data (https://github.com/steveblackmon/humanapi-streams).
IRC is interesting as well.


> What's significant about both these projects (and others they link to) are
> the stories they're telling developers--which we can crib from as we think
> about the Streams "pitch." They also have relatively minimal setup
> docs--which Streams is also heading toward (go Steve!).
>

Agreed this is key


The existence of other open-source projects with similar themes suggests we're onto an important
problem.  We should pay attention to these projects and what is working for them WRT user
growth, community growth, tech media coverage, etc...


>
> Again, my key objective is to understand the Apache Streams vision along
> side projects like these and within the wider space of consolidating social
> data. What market does it serve? Is it "personal" (as these projects seem
> to be)? Or commercial? Or developer-only (library/framework for wiring up
> your own idiosyncratic stuff)?
>

I think the overall objective of streams remains very similar to what it
started as: A way to easily and flexibly ingest multiple different sources
of 'activity' data in a normalized ActivityStreams format. For me
personally, my interest is in ingesting this data at scale and with as
little internally-maintained code as possible.

While most of the development so far has been geared toward enabling back-end / commercial-scale
data collection and management, I think the future should be more about enabling individuals
and businesses to transcend data silos using computing resources and code entirely under their
own control. This might mean supporting regular users with a full-featured SaaS application
in addition to continued work on data interoperability.


>
> Thanks for reading, pondering, and helping me help. :)
>
> Cheers!
> Benjamin
>
> --
> http://bigbluehat.com/
> http://linkedin.com/in/benjaminyoung
>
>
