incubator-general mailing list archives

From Atri Sharma <atri.j...@gmail.com>
Subject Re: [DISCUSS] S2Graph Incubator Proposal
Date Fri, 13 Nov 2015 04:55:52 GMT
Really happy to see this proposal.

I am glad to help in any way I can.
On 13 Nov 2015 10:25, "Hyunsik Choi" <hyunsik@apache.org> wrote:

> @Sergio,
> I totally agree with you.
>
> @Luke Han,
> I put your name on the initial committer list.
>
> I'll call a vote for incubation within a few days.
>
> Best regards,
> Hyunsik
>
> On Thu, Nov 12, 2015 at 12:45 AM, Luke Han <luke.hq@gmail.com> wrote:
> > Hi Hyunsik,
> >     I'm happy to help; GraphDB is interesting to me for analyzing
> > qualitative data beyond quantitative data :)
> >
> >     Thanks.
> > Luke
> >
> >
> > Best Regards!
> > ---------------------
> >
> > Luke Han
> >
> > On Thu, Nov 12, 2015 at 9:47 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
> >
> >> Hi Luke,
> >>
> >> Thank you for your interest in the S2Graph project. If you don't mind,
> >> we'd like to add you to the initial committer list. I think that your
> >> experience and skills with HBase would be very helpful to the S2Graph
> >> project.
> >>
> >> Best regards,
> >> Hyunsik
> >>
> >> On Mon, Nov 9, 2015 at 6:58 PM, Luke Han <luke.hq@gmail.com> wrote:
> >> > I'm very interested in this project and would love to help, but I'm not
> >> > an IPMC member.
> >> >
> >> > Please let me know if there's anything I can help with.
> >> >
> >> > Thanks.
> >> >
> >> >
> >> > Best Regards!
> >> > ---------------------
> >> >
> >> > Luke Han
> >> >
> >> > On Tue, Nov 10, 2015 at 9:03 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
> >> >
> >> >> Hi Seetharam,
> >> >>
> >> >> Thank you for volunteering! I've added your name to the mentor list.
> >> >>
> >> >> I also updated the initial committer list and affiliations based on a
> >> >> Google search.
> >> >> If I got any affiliations wrong, please let me know.
> >> >>
> >> >> Best regards,
> >> >> Hyunsik
> >> >>
> >> >>
> >> >> On Mon, Nov 9, 2015 at 4:21 PM, Seetharam Venkatesh
> >> >> <venkatesh@innerzeal.com> wrote:
> >> >> > Hi Hyunsik,
> >> >> >
> >> >> > If you are still looking for mentors, let me volunteer as one.
> >> >> >
> >> >> > Thanks!
> >> >> >
> >> >> > On Mon, Nov 9, 2015 at 3:45 PM Hyunsik Choi <hyunsik@apache.org> wrote:
> >> >> >
> >> >> >> Thank you all, guys. I just put your names on the nominated mentor
> >> >> >> list.
> >> >> >>
> >> >> >> @Andrew,
> >> >> >>
> >> >> >> I agree with you. S2Graph already has good relationships with other
> >> >> >> ASF projects, such as HBase and Spark. In addition, the team plans
> >> >> >> to extend that relationship to Apache incubator TinkerPop, which is
> >> >> >> a graph computing framework. I'm looking forward to their
> >> >> >> combination.
> >> >> >>
> >> >> >> @Sergio,
> >> >> >>
> >> >> >> Thank you for attending the talk and joining the S2Graph mentors.
> >> >> >> That was Doyung Yoon, one of the S2Graph creators. He gave a talk at
> >> >> >> the last ApacheCon.
> >> >> >>
> >> >> >> On Mon, Nov 9, 2015 at 11:58 AM, Sergio Fernández <wikier@apache.org> wrote:
> >> >> >> > Hi Hyunsik, I attended your talk at the last ApacheCon, and I think
> >> >> >> > S2 has quite some potential. So if you need a mentor, count me in!
> >> >> >> >
> >> >> >> > On Mon, Nov 9, 2015 at 7:54 PM, Hyunsik Choi <hyunsik@apache.org> wrote:
> >> >> >> >
> >> >> >> >> This project is looking for mentors. Can anyone help? We are also
> >> >> >> >> looking forward to any feedback.
> >> >> >> >>
> >> >> >> >> Also, I attached the proposal here. I forgot it.
> >> >> >> >>
> >> >> >> >> ----------------
> >> >> >> >>
> >> >> >> >> = S2Graph Proposal =
> >> >> >> >>
> >> >> >> >> == Abstract ==
> >> >> >> >> S2Graph is a distributed and scalable OLTP graph database built on
> >> >> >> >> HBase to support fast traversal of extremely large graphs.
> >> >> >> >>
> >> >> >> >> Here are additional materials to introduce S2Graph.
> >> >> >> >>  * HBaseCon 2015 - http://www.slideshare.net/HBaseCon/use-cases-session-5
> >> >> >> >>  * Apache: Big Data 2015 - http://schd.ws/hosted_files/apachebigdata2015/06/s2graph_apache_con.pdf
> >> >> >> >>
> >> >> >> >> == Proposal ==
> >> >> >> >> S2Graph aims to provide a scalable distributed graph database engine
> >> >> >> >> over key/value storage such as HBase. S2Graph provides a fully
> >> >> >> >> asynchronous API to manipulate data as a property graph model and
> >> >> >> >> fast breadth-first search queries on the graph.
> >> >> >> >>
> >> >> >> >> == Background ==
> >> >> >> >> S2Graph initially started as an internal project at Kakao.com to
> >> >> >> >> efficiently store user relations and user activities as one large
> >> >> >> >> graph and provide a unified query to traverse the graph. It was open
> >> >> >> >> sourced on Github about 3 months ago, in June 2015.
> >> >> >> >>
> >> >> >> >> Over time S2Graph, together with HBase as the storage tier, has begun
> >> >> >> >> to be adopted in various applications at Kakao, such as messaging,
> >> >> >> >> social feeds, and realtime recommendations.
> >> >> >> >>
> >> >> >> >> Users can benefit from S2Graph's generalized high-level API instead
> >> >> >> >> of a low-level key/value API for graph abstraction, just as Phoenix
> >> >> >> >> provides a SQL layer over HBase.
> >> >> >> >>
> >> >> >> >> == Rationale ==
> >> >> >> >> Graph data (highly interconnected data) is very abundant and
> >> >> >> >> important these days.
> >> >> >> >> When users have a multitude of relationships, each with complex
> >> >> >> >> properties associated with them, a graph model is more intuitive and
> >> >> >> >> efficient than a tabular format (RDBMS).
> >> >> >> >> There are many ASF projects that provide a SQL layer, but there is
> >> >> >> >> no ASF project that provides a scalable graph layer on the existing
> >> >> >> >> Hadoop ecosystem.
> >> >> >> >> When graph data grows to trillion-edge scale, traversal becomes slow
> >> >> >> >> and costly. However, with the benefit of HBase's scalable
> >> >> >> >> architecture, S2Graph can traverse a large graph efficiently in a
> >> >> >> >> breadth-first search manner.
> >> >> >> >>
> >> >> >> >> S2Graph also interoperates with several existing Apache projects
> >> >> >> >> (HBase, Spark) to provide a way to merge real-time events and
> >> >> >> >> batch-processed data using the property graph data model.
> >> >> >> >>
> >> >> >> >> Many developers run their own domain-specific API servers to serve
> >> >> >> >> their data products, but the graph model is general and the S2Graph
> >> >> >> >> API fully supports graph traversal, so it can be used as a scalable
> >> >> >> >> general-purpose API serving layer for various domains.
> >> >> >> >> As long as the data can be modeled as a graph, users can avoid the
> >> >> >> >> tedious work of developing customized API servers by using S2Graph.
> >> >> >> >>
> >> >> >> >> == Initial Goals ==
> >> >> >> >> The initial goals will be to move the existing codebase to Apache and
> >> >> >> >> integrate with the Apache development process. Once this is
> >> >> >> >> accomplished, we plan for incremental development and releases that
> >> >> >> >> follow the Apache guidelines.
> >> >> >> >>
> >> >> >> >> == Current Status ==
> >> >> >> >>
> >> >> >> >> === Meritocracy ===
> >> >> >> >> S2Graph has operated on meritocratic principles from the get-go.
> >> >> >> >> Currently, all discussions pertaining to S2Graph development are
> >> >> >> >> public on Github. The current incubation proposal includes the major
> >> >> >> >> code contributors to S2Graph. Several additional people have worked
> >> >> >> >> on the S2Graph codebase for industry use cases and would be
> >> >> >> >> interested in becoming committers. We are starting with a small
> >> >> >> >> committer group and we plan to add additional committers following an
> >> >> >> >> open merit-based decision process during the incubation phase.
> >> >> >> >>
> >> >> >> >> === Community ===
> >> >> >> >> We have already begun building a community, but at this time the
> >> >> >> >> community consists only of S2Graph developers – all Kakao employees –
> >> >> >> >> and prospective users.
> >> >> >> >> S2Graph seeks to develop developer and user communities during
> >> >> >> >> incubation.
> >> >> >> >>
> >> >> >> >> === Core Developers ===
> >> >> >> >> S2Graph is currently being designed and developed by 2 engineers from
> >> >> >> >> Kakao: Doyung Yoon and Daewon Jeong.
> >> >> >> >>
> >> >> >> >> === Alignment ===
> >> >> >> >> Our proposed S2Graph effort aligns closely with Apache HBase. The
> >> >> >> >> HBase project perimeter is denoted by simple byte-array based
> >> >> >> >> Create, Read, Update, Delete and Scan APIs, with no current plans to
> >> >> >> >> extend beyond these bounds.
> >> >> >> >>
> >> >> >> >> S2Graph complements this with a higher-level API for the property
> >> >> >> >> graph model.
> >> >> >> >>
> >> >> >> >> S2Graph was designed from the beginning to offer a scalable
> >> >> >> >> distributed graph database skin over HBase in order to provide a
> >> >> >> >> property graph model and breadth-first search, and it continues to
> >> >> >> >> focus on providing the graph model.
> >> >> >> >>
> >> >> >> >> == Known Risks ==
> >> >> >> >> === Orphaned Products ===
> >> >> >> >> The core developers of the S2Graph team plan to work full time on
> >> >> >> >> this project. There is very little risk of S2Graph getting orphaned
> >> >> >> >> since at least one large company (Kakao) is extensively using it in
> >> >> >> >> their production HBase clusters. For example, there are currently
> >> >> >> >> 20+ use cases with more than 1 trillion edges and 140 million
> >> >> >> >> breadth-first search query requests per minute using S2Graph in
> >> >> >> >> production.
> >> >> >> >> We plan to extend and diversify this community further through
> >> >> >> >> Apache.
> >> >> >> >>
> >> >> >> >> === Inexperience with Open Source ===
> >> >> >> >> The core developers are all active users and followers of open
> >> >> >> >> source. They are already committers and contributors to the S2Graph
> >> >> >> >> Github project. All have been involved with source code that has
> >> >> >> >> been released under an open source license. Though the core set of
> >> >> >> >> developers do not have Apache open source experience, there are
> >> >> >> >> plans to onboard individuals with Apache open source experience on
> >> >> >> >> to the project.
> >> >> >> >>
> >> >> >> >> === Homogenous Developers ===
> >> >> >> >> Most committers in this proposal belong to the same institution
> >> >> >> >> (Kakao). The engagement of these committers goes well beyond the
> >> >> >> >> necessary development to support research, and all committers work
> >> >> >> >> on S2Graph full time.
> >> >> >> >> Several people from other institutions are working on and are
> >> >> >> >> familiar with the S2Graph codebase. We will work to attract them as
> >> >> >> >> future committers during the incubation phase, following a
> >> >> >> >> merit-based approach.
> >> >> >> >>
> >> >> >> >> === Reliance on Salaried Developers ===
> >> >> >> >> Kakao invested in S2Graph as its distributed graph database solution
> >> >> >> >> on top of HBase, and some of its key engineers are working full time
> >> >> >> >> on the project.
> >> >> >> >> We look forward to other Apache developers and researchers
> >> >> >> >> contributing to the project.
> >> >> >> >> Also, key to addressing the risk associated with relying on salaried
> >> >> >> >> developers from a single entity is to increase the diversity of the
> >> >> >> >> contributors and actively lobby for domain experts in the graph
> >> >> >> >> database space to contribute. Apache S2Graph intends to do this.
> >> >> >> >>
> >> >> >> >> === Relationships with Other Apache Products ===
> >> >> >> >> S2Graph has a strong relationship with, and dependency on, Apache
> >> >> >> >> Hadoop, HBase, and Spark.
> >> >> >> >> Being part of Apache's Incubation community could help with closer
> >> >> >> >> collaboration among these projects as well as others.
> >> >> >> >>
> >> >> >> >> In terms of graph processing frameworks, S2Graph and Apache Giraph
> >> >> >> >> look similar. However, their goals are clearly different from each
> >> >> >> >> other. Giraph aims at analytical batch processing on immutable graph
> >> >> >> >> data sets. In contrast, S2Graph is designed for OLTP-like workloads
> >> >> >> >> on graph data sets, and S2Graph provides INSERT/UPDATE operations
> >> >> >> >> too.
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> === An Excessive Fascination with the Apache Brand ===
> >> >> >> >> S2Graph is proposing to enter incubation at Apache in order to help
> >> >> >> >> efforts to diversify the committer base, not so much to capitalize on
> >> >> >> >> the Apache brand. The S2Graph project is already in production use
> >> >> >> >> inside Kakao, but is not expected to be a Kakao product for external
> >> >> >> >> customers. As such, the S2Graph project is not seeking to use the
> >> >> >> >> Apache brand as a marketing tool.
> >> >> >> >>
> >> >> >> >> == Documentation ==
> >> >> >> >> Information about S2Graph can be found at
> >> >> >> >> https://github.com/kakao/s2graph. The following links provide more
> >> >> >> >> information about S2Graph in open source:
> >> >> >> >>  * S2Graph web site: https://steamshon.gitbooks.io/s2graph-book/content/
> >> >> >> >>  * Codebase at Github: https://github.com/kakao/s2graph
> >> >> >> >>  * Issue Tracking: https://github.com/kakao/s2graph/issues
> >> >> >> >>  * User community: https://groups.google.com/forum/#!forum/s2graph
> >> >> >> >>
> >> >> >> >> == Initial Source ==
> >> >> >> >>
> >> >> >> >> The S2Graph codebase is currently hosted on Github:
> >> >> >> >> https://github.com/kakao/s2graph
> >> >> >> >>
> >> >> >> >> === Source and Intellectual Property Submission Plan ===
> >> >> >> >>
> >> >> >> >> Currently, the S2Graph codebase is distributed under the Apache 2.0
> >> >> >> >> License.
> >> >> >> >>
> >> >> >> >> == External Dependencies ==
> >> >> >> >>
> >> >> >> >> Beyond relying on Apache HBase, S2Graph has the following external
> >> >> >> >> dependencies:
> >> >> >> >>  * Asynchbase (BSD license: http://www.antlr3.org/license.html)
> >> >> >> >>  * Mysql (BSD license: https://github.com/julianhyde/sqlline/blob/master/LICENSE)
> >> >> >> >>  * Play Framework (Apache 2.0 license: https://github.com/playframework/playframework)
> >> >> >> >>  * Scala (https://github.com/scala/scala)
> >> >> >> >>  * Spark
> >> >> >> >>  * Kafka
> >> >> >> >>
> >> >> >> >> == Required Resources ==
> >> >> >> >>
> >> >> >> >> === Mailing list ===
> >> >> >> >>
> >> >> >> >> We will migrate our mailing lists to the following:
> >> >> >> >>  * users@s2graph.incubator.apache.org
> >> >> >> >>  * dev@s2graph.incubator.apache.org
> >> >> >> >>  * private@s2graph.incubator.apache.org
> >> >> >> >>  * commits@s2graph.incubator.apache.org
> >> >> >> >>
> >> >> >> >> === Source control ===
> >> >> >> >>
> >> >> >> >> The S2Graph team would like to use Git for source control, due to
> >> >> >> >> our current use of Git. We request a writeable Git repo for S2Graph,
> >> >> >> >> and mirroring to be set up to Github through INFRA.
> >> >> >> >>
> >> >> >> >> === Issue Tracking ===
> >> >> >> >>
> >> >> >> >> S2Graph currently uses the Github issue tracking system associated
> >> >> >> >> with its Github repo: https://github.com/kakao/s2graph/issues. We
> >> >> >> >> will migrate to the Apache JIRA:
> >> >> >> >> http://issues.apache.org/jira/browse/S2Graph
> >> >> >> >>
> >> >> >> >> === Other Resources ===
> >> >> >> >>
> >> >> >> >> Jenkins/Hudson for builds and test running.
> >> >> >> >> Wiki for documentation purposes
> >> >> >> >> Blog to improve project dissemination
> >> >> >> >>
> >> >> >> >> == Initial Committers ==
> >> >> >> >>
> >> >> >> >>  * Doyung Yoon <shom83 at gmail.com>
> >> >> >> >>  * Daewon Jeong <blueiur at gmail.com>
> >> >> >> >>  * Jaesang Kim <honeysleep at gmail.com>
> >> >> >> >>  * Hwansung Yu <deejayfwan at gmail.com>
> >> >> >> >>  * Min-Seok Kim <mskim.org at gmail.com>
> >> >> >> >>  * Chul Kang <miralchul at gmail.com>
> >> >> >> >>
> >> >> >> >> == Affiliations ==
> >> >> >> >>
> >> >> >> >> The initial committers are from one organization: Kakao.
> >> >> >> >>  * Doyung Yoon, Kakao
> >> >> >> >>  * Daewon Jeong, Kakao
> >> >> >> >>  * Jaesang Kim, Kakao
> >> >> >> >>  * Hwansung Yu, Kakao
> >> >> >> >>  * Min-Seok Kim, Kakao
> >> >> >> >>  * Chul Kang, Kakao
> >> >> >> >>
> >> >> >> >> == Sponsors ==
> >> >> >> >>
> >> >> >> >> === Champion ===
> >> >> >> >> Hyunsik Choi
> >> >> >> >>
> >> >> >> >> === Nominated Mentors ===
> >> >> >> >>
> >> >> >> >> === Sponsoring Entity ===
> >> >> >> >>
> >> >> >> >>  * The Apache Incubator
> >> >> >> >>
> >> >> >> >> On Fri, Nov 6, 2015 at 4:05 PM, Hyunsik Choi <hyunsik@apache.org> wrote:
> >> >> >> >> > Hi Seetharam,
> >> >> >> >> >
> >> >> >> >> > Thank you for a good question. That seems to be a frequent
> >> >> >> >> > question about this project.
> >> >> >> >> >
> >> >> >> >> > Here is the answer to your question.
> >> >> >> >> >
> >> >> >> >> > https://steamshon.gitbooks.io/s2graph-book/content/what_is_different_to_titan.html
> >> >> >> >> >
> >> >> >> >> > I hope that this link is helpful to your understanding.
> >> >> >> >> >
> >> >> >> >> > Best regards,
> >> >> >> >> > Hyunsik
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > On Fri, Nov 6, 2015 at 3:07 PM, Seetharam Venkatesh
> >> >> >> >> > <venkatesh@innerzeal.com> wrote:
> >> >> >> >> >> Hi Hyunsik,
> >> >> >> >> >>
> >> >> >> >> >> The proposal looks interesting, and I want to know how this is
> >> >> >> >> >> different from existing solutions in the same space, such as
> >> >> >> >> >> Titan, etc.
> >> >> >> >> >>
> >> >> >> >> >> Thanks!
> >> >> >> >> >> Venkatesh
> >> >> >> >> >>
> >> >> >> >> >>
> >> >> >> >> >> On Fri, Nov 6, 2015 at 1:36 PM Hyunsik Choi <hyunsik@apache.org> wrote:
> >> >> >> >> >>
> >> >> >> >> >>> Hi folks,
> >> >> >> >> >>>
> >> >> >> >> >>> We would like to start a discussion on S2Graph as an incubation
> >> >> >> >> >>> project.
> >> >> >> >> >>>
> >> >> >> >> >>> S2Graph is a distributed and scalable OLTP graph database built on
> >> >> >> >> >>> HBase. It provides interactive queries for vertex/edge/sub-graphs
> >> >> >> >> >>> on extremely large graph data sets as well as insertion and update
> >> >> >> >> >>> operations.
> >> >> >> >> >>>
> >> >> >> >> >>> S2Graph was already introduced at Apache BigData and HBaseCon this
> >> >> >> >> >>> year.
> >> >> >> >> >>>
> >> >> >> >> >>> The proposal is available at:
> >> >> >> >> >>> https://wiki.apache.org/incubator/S2GraphProposal
> >> >> >> >> >>>
> >> >> >> >> >>> We are looking forward to any feedback. In addition, we are looking
> >> >> >> >> >>> for volunteers as mentors.
> >> >> >> >> >>>
> >> >> >> >> >>> Best regards,
> >> >> >> >> >>> Hyunsik
> >> >> >> >> >>>
> >> >> >> >> >>>
> >> >> >> >> >>> ---------------------------------------------------------------------
> >> >> >> >> >>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >> >> >> >> >>> For additional commands, e-mail: general-help@incubator.apache.org
> >> >> >> >> >>>
> >> >> >> >> >>>
> >> >> >> >
> >> >> >> >
> >> >> >> > --
> >> >> >> > Sergio Fernández
> >> >> >> > Partner Technology Manager
> >> >> >> > Redlink GmbH
> >> >> >> > m: +43 6602747925
> >> >> >> > e: sergio.fernandez@redlink.co
> >> >> >> > w: http://redlink.co
