community-dev mailing list archives

From Rimon Chowdhury <>
Subject Re: Social Media Metrics using Apache stack
Date Tue, 27 Dec 2016 19:51:28 GMT
On Monday, November 21, 2016, sblackmon <> wrote:

> Hello ComDev,
> The Streams podling has been brainstorming ways to increase awareness of
> the project and its capabilities.  We’ve also been working to make it
> easier to get started as a user, without starting the journey by
> downloading the JDK, Maven, and friends.  Using the software to provide benefit
> to the Foundation seems like a good thing to try.
> One use case for Streams is to build personal or organizational datasets
> of social media profiles and content for internal development and analysis,
> using the technologies and tools you and your organization prefer, rather
> than those provided by the upstream system.
> I took the liberty of creating a few Zeppelin notebooks which collect
> Apache project profiles and posts, normalize them to activity streams
> format, and interact with them using Spark DataFrames.
> The notebooks are currently hosted in my zeppelinhub account, which anyone
> with the link below can access.
> bm90ZTovL3N0ZXZlYmxhY2ttb24vYXBhY2hlLXplcHBlbGluLWRhc2hib2Fy
> ZC84YjQ5YmY3MWIxYTU0ZTE2YjlkMDQyMTliMzNlMjQzYS9ub3RlLmpzb24
> bm90ZTovL3N0ZXZlYmxhY2ttb24vYXBhY2hlLXplcHBlbGluLWRhc2hib2Fy
> bm90ZTovL3N0ZXZlYmxhY2ttb24vYXBhY2hlLXplcHBlbGluLWRhc2hib2Fy
> ZC8zZmQ3M2Y1OWEzOGE0YmM2YjFkMGM4MzBkNTczZDU0Mi9ub3RlLmpzb24
> If this group sees potential benefit, I’d be happy to work to set them up
> for use by anyone at Apache in a dedicated Zeppelin deployment and take the
> lead on maintaining them going forward.
> In any case, we’d appreciate any feedback on what would make this
> prototype more valuable.
> Background on Streams:
> Apache Streams (incubating) unifies a diverse world of digital profiles
> and online activities into common formats and vocabularies, and makes these
> datasets accessible across a variety of databases, devices, and platforms
> for streaming, browsing, search, sharing, and analytics use-cases.
> Streams contains libraries and patterns for specifying, publishing, and
> inter-linking schemas, and assists with conversion of activities (posts,
> shares, likes, follows, etc.) and objects (profiles, pages, photos, videos,
> etc.) between the representation, format, and encoding preferred by
> supported data providers (Twitter, Instagram, etc.) and storage services
> (Cassandra, Elasticsearch, HBase, HDFS, Neo4j, etc.).
> In theory, pretty much any JSON or XML API which uses a “look-up by ID and
> type” model can be coerced into collections of activity-streams normalized
> profiles and posts - systems such as GitHub, JIRA, and Meetup could be added to
> the roadmap and have notebooks created once those providers are built.
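
For readers unfamiliar with the normalization step described in the quoted
message, here is a rough sketch of what mapping a provider record onto
Activity Streams 1.0-style fields (actor, verb, object, published, provider)
might look like. It is not the Apache Streams API: the function name
to_activity, the shape of the input record, and the "id:provider:..." identifier
scheme are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def to_activity(record, provider_name):
    """Map a provider-specific post onto Activity Streams 1.0-style fields.

    The input layout (id, created_ts, text, user) is a hypothetical example,
    not the format of any real provider API.
    """
    published = datetime.fromtimestamp(record["created_ts"], tz=timezone.utc)
    post_id = f"id:{provider_name}:post:{record['id']}"  # hypothetical ID scheme
    return {
        "id": post_id,
        "verb": "post",
        "published": published.isoformat(),
        "provider": {
            "id": f"id:providers:{provider_name}",
            "displayName": provider_name,
        },
        "actor": {
            "id": f"id:{provider_name}:{record['user']['id']}",
            "objectType": "person",
            "displayName": record["user"]["name"],
        },
        "object": {
            "id": post_id,
            "objectType": "note",
            "content": record["text"],
        },
    }


# A made-up record standing in for one item returned by a provider API.
raw = {
    "id": 42,
    "created_ts": 0,
    "text": "Hello ComDev",
    "user": {"id": 7, "name": "Apache Streams"},
}
activity = to_activity(raw, "twitter")
```

Once every provider's records share this shape, the same queries can run over
posts from any source, which is the point of normalizing before analysis.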

