incubator-general mailing list archives

From Hyunsik Choi <hyun...@apache.org>
Subject Re: [DISCUSS] S2Graph Incubator Proposal
Date Fri, 13 Nov 2015 04:54:57 GMT
@Sergio,
I totally agree with you.

@Luke Han,
I put your name on the initial committer list.

I'll call a vote for incubation within a few days.

Best regards,
Hyunsik

On Thu, Nov 12, 2015 at 12:45 AM, Luke Han <luke.hq@gmail.com> wrote:
> Hi Hyunsik,
>     I'm happy to help. GraphDB interests me as a way to analyze qualitative
> data beyond quantitative data :)
>
>     Thanks.
> Luke
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Thu, Nov 12, 2015 at 9:47 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
>
>> Hi Luke,
>>
>> Thank you for your interest in the S2Graph project. If you don't mind,
>> we'd like to add you to the initial committer list. I think that your
>> experience and skills with HBase would be very helpful to the S2Graph
>> project.
>>
>> Best regards,
>> Hyunsik
>>
>> On Mon, Nov 9, 2015 at 6:58 PM, Luke Han <luke.hq@gmail.com> wrote:
>> > I'm very interested in this project and would love to help, but I'm not
>> > an IPMC member.
>> >
>> > Please let me know if there's anything I could help with.
>> >
>> > Thanks.
>> >
>> >
>> > Best Regards!
>> > ---------------------
>> >
>> > Luke Han
>> >
>> > On Tue, Nov 10, 2015 at 9:03 AM, Hyunsik Choi <hyunsik@apache.org> wrote:
>> >
>> >> Hi Seetharam,
>> >>
>> >> Thank you for volunteering! I've added your name to the mentor list.
>> >>
>> >> I also updated the initial committer list and affiliations via Google
>> >> search.
>> >> If I got any affiliations wrong, please let me know.
>> >>
>> >> Best regards,
>> >> Hyunsik
>> >>
>> >>
>> >> On Mon, Nov 9, 2015 at 4:21 PM, Seetharam Venkatesh
>> >> <venkatesh@innerzeal.com> wrote:
>> >> > Hi Hyunsik,
>> >> >
>> >> > If you are still looking for mentors, let me volunteer as one.
>> >> >
>> >> > Thanks!
>> >> >
>> >> > On Mon, Nov 9, 2015 at 3:45 PM Hyunsik Choi <hyunsik@apache.org> wrote:
>> >> >
>> >> >> Thank you all, guys. I just put your names on the nominated mentor list.
>> >> >>
>> >> >> @Andrew,
>> >> >>
>> >> >> I agree with you. S2Graph already has good relationships with other
>> >> >> ASF projects, such as HBase and Spark. In addition, they plan to
>> >> >> extend those relationships to the Apache Incubator project TinkerPop,
>> >> >> which is a graph computing framework. I'm looking forward to the
>> >> >> combination.
>> >> >>
>> >> >> @Sergio,
>> >> >>
>> >> >> Thank you for attending the talk and joining the S2Graph mentors.
>> >> >> The speaker was Doyung Yoon, one of the S2Graph creators. He gave a
>> >> >> talk at the last ApacheCon.
>> >> >>
>> >> >> On Mon, Nov 9, 2015 at 11:58 AM, Sergio Fernández <wikier@apache.org> wrote:
>> >> >> > Hi Hyunsik, I attended your talk at the last ApacheCon, and I think S2
>> >> >> > has quite some potential. So if you need a mentor, count me in!
>> >> >> >
>> >> >> > On Mon, Nov 9, 2015 at 7:54 PM, Hyunsik Choi <hyunsik@apache.org> wrote:
>> >> >> >
>> >> >> >> This project is looking for mentors. Can anyone help? We are also
>> >> >> >> looking forward to any feedback.
>> >> >> >>
>> >> >> >> Also, I have attached the proposal here; I forgot it earlier.
>> >> >> >>
>> >> >> >> ----------------
>> >> >> >>
>> >> >> >> = S2Graph Proposal =
>> >> >> >>
>> >> >> >> == Abstract ==
>> >> >> >> S2Graph is a distributed and scalable OLTP graph database built on
>> >> >> >> HBase to support fast traversal of extremely large graphs.
>> >> >> >>
>> >> >> >> Here are additional materials to introduce S2Graph.
>> >> >> >>  * HBaseCon 2015 - http://www.slideshare.net/HBaseCon/use-cases-session-5
>> >> >> >>  * Apache: Big Data 2015 - http://schd.ws/hosted_files/apachebigdata2015/06/s2graph_apache_con.pdf
>> >> >> >>
>> >> >> >> == Proposal ==
>> >> >> >> S2Graph aims to provide a scalable distributed graph database engine
>> >> >> >> over key/value storage such as HBase. S2Graph provides a fully
>> >> >> >> asynchronous API to manipulate data as a property graph model, and
>> >> >> >> fast breadth-first search queries on the graph.
>> >> >> >>
>> >> >> >> == Background ==
>> >> >> >> S2Graph initially started as an internal project at Kakao.com to
>> >> >> >> efficiently store user relations and user activities as one large
>> >> >> >> graph and provide a unified query to traverse the graph. It was open
>> >> >> >> sourced on Github about 3 months ago in June 2015.
>> >> >> >>
>> >> >> >> Over time, S2Graph, together with HBase as its storage tier, has
>> >> >> >> begun to be adopted in various applications at Kakao, such as
>> >> >> >> messaging, social feeds, and realtime recommendations.
>> >> >> >>
>> >> >> >> Users can benefit from S2Graph's generalized high-level API for graph
>> >> >> >> abstraction instead of a low-level key/value API, just as Phoenix
>> >> >> >> provides a SQL layer over HBase.
>> >> >> >>
>> >> >> >> == Rationale ==
>> >> >> >> Graph data (highly interconnected data) is very abundant and
>> >> >> >> important these days.
>> >> >> >> When users have a multitude of relationships, each with complex
>> >> >> >> properties associated with them, a graph model is more intuitive and
>> >> >> >> efficient than a tabular format (RDBMS).
>> >> >> >> There are many ASF projects that provide a SQL layer, but there is no
>> >> >> >> ASF project that provides a scalable graph layer on the existing
>> >> >> >> Hadoop ecosystem.
>> >> >> >> When graph data grows to trillion-edge scale, traversal becomes slow
>> >> >> >> and costly. However, with the benefit of HBase's scalable
>> >> >> >> architecture, S2Graph can traverse large graphs efficiently in a
>> >> >> >> breadth-first search manner.
>> >> >> >>
>> >> >> >> S2Graph also interoperates with several existing Apache projects
>> >> >> >> (HBase, Spark) to provide a way to merge real-time events and
>> >> >> >> batch-processed data using the property graph data model.
>> >> >> >>
>> >> >> >> Many developers are running their own domain-specific API servers to
>> >> >> >> serve their data products, but the graph model is general and the
>> >> >> >> S2Graph API fully supports graph traversal, so it can be used as a
>> >> >> >> scalable general-purpose API serving layer for various domains.
>> >> >> >> As long as data can be modeled as a graph, users can avoid the
>> >> >> >> tedious work of developing customized API servers by using S2Graph.
>> >> >> >>
>> >> >> >> == Initial Goals ==
>> >> >> >> The initial goals will be to move the existing codebase to Apache and
>> >> >> >> integrate with the Apache development process. Once this is
>> >> >> >> accomplished, we plan for incremental development and releases that
>> >> >> >> follow the Apache guidelines.
>> >> >> >>
>> >> >> >> == Current Status ==
>> >> >> >>
>> >> >> >> === Meritocracy ===
>> >> >> >> S2Graph has operated on meritocratic principles from the get-go.
>> >> >> >> Currently, all the discussions pertaining to S2Graph development are
>> >> >> >> public on Github. The current incubation proposal includes the major
>> >> >> >> code contributors to S2Graph. Several additional people have worked on
>> >> >> >> the S2Graph codebase for industry use cases and would be interested in
>> >> >> >> becoming committers. We are starting with a small committer group and
>> >> >> >> we plan to add additional committers following an open merit-based
>> >> >> >> decision process during the incubation phase.
>> >> >> >>
>> >> >> >> === Community ===
>> >> >> >> We have already begun building a community but at this time the
>> >> >> >> community consists only of S2Graph developers – all Kakao employees –
>> >> >> >> and prospective users.
>> >> >> >> S2Graph seeks to develop developer and user communities during
>> >> >> >> incubation.
>> >> >> >>
>> >> >> >> === Core Developers ===
>> >> >> >> S2Graph is currently being designed and developed by 2 engineers from
>> >> >> >> Kakao: Doyung Yoon and Daewon Jeong.
>> >> >> >>
>> >> >> >> === Alignment ===
>> >> >> >> Our proposed S2Graph effort aligns closely with Apache HBase. The
>> >> >> >> HBase project perimeter is denoted by simple byte-array-based
>> >> >> >> Create, Read, Update, Delete, and Scan APIs, with no current plans to
>> >> >> >> extend beyond these bounds.
>> >> >> >>
>> >> >> >> S2Graph complements this with a higher-level API for the property
>> >> >> >> graph model.
>> >> >> >>
>> >> >> >> S2Graph was designed from the beginning to offer a scalable
>> >> >> >> distributed graph database skin over HBase, in order to provide a
>> >> >> >> property graph model and breadth-first search, and it continues to
>> >> >> >> focus on providing the graph model.
>> >> >> >>
>> >> >> >> == Known Risks ==
>> >> >> >> === Orphaned Products ===
>> >> >> >> The core developers of the S2Graph team plan to work full time on this
>> >> >> >> project. There is very little risk of S2Graph getting orphaned since
>> >> >> >> at least one large company (Kakao) is extensively using it in their
>> >> >> >> production HBase clusters. For example, currently there are 20+ use
>> >> >> >> cases with more than 1 trillion edges and 140 million breadth-first
>> >> >> >> search query requests per minute using S2Graph in production.
>> >> >> >> We plan to extend and diversify this community further through Apache.
>> >> >> >>
>> >> >> >> === Inexperience with Open Source ===
>> >> >> >> The core developers are all active users and followers of open source.
>> >> >> >> They are already committers and contributors to the S2Graph Github
>> >> >> >> project. All have been involved with the source code that has been
>> >> >> >> released under an open source license. Though the core set of
>> >> >> >> developers do not have Apache open source experience, there are plans
>> >> >> >> to onboard individuals with Apache open source experience on to the
>> >> >> >> project.
>> >> >> >>
>> >> >> >> === Homogenous Developers ===
>> >> >> >> Most committers in this proposal belong to the same institution
>> >> >> >> (Kakao). The engagement of these committers goes well beyond the
>> >> >> >> necessary development to support research, and all committers work on
>> >> >> >> S2Graph full time.
>> >> >> >> Several people from other institutions are working on and are familiar
>> >> >> >> with the S2Graph codebase. We will work to attract them as future
>> >> >> >> committers during the incubation phase, following a merit-based
>> >> >> >> approach.
>> >> >> >>
>> >> >> >> === Reliance on Salaried Developers ===
>> >> >> >> Kakao invested in S2Graph as the distributed graph database solution
>> >> >> >> on top of HBase and some of its key engineers are working full time on
>> >> >> >> the project.
>> >> >> >> We look forward to other Apache developers and researchers
>> >> >> >> contributing to the project.
>> >> >> >> Also key to addressing the risk associated with relying on salaried
>> >> >> >> developers from a single entity is to increase the diversity of the
>> >> >> >> contributors and actively lobby for domain experts in the graph
>> >> >> >> database space to contribute. Apache S2Graph intends to do this.
>> >> >> >>
>> >> >> >> === Relationships with Other Apache Products ===
>> >> >> >> S2Graph has a strong relationship with, and dependency on, Apache
>> >> >> >> Hadoop HBase and Spark.
>> >> >> >> Being part of Apache’s Incubation community could help with closer
>> >> >> >> collaboration among these projects as well as others.
>> >> >> >>
>> >> >> >> In terms of graph processing frameworks, S2Graph and Apache Giraph
>> >> >> >> look similar. However, their goals are different from each other.
>> >> >> >> Giraph aims at analytical batch processing on immutable graph data
>> >> >> >> sets. In contrast, S2Graph is designed for OLTP-like workloads on
>> >> >> >> graph data sets, and S2Graph provides INSERT/UPDATE operations too.
>> >> >> >>
>> >> >> >>
>> >> >> >> === An Excessive Fascination with the Apache Brand ===
>> >> >> >> S2Graph is proposing to enter incubation at Apache in order to help
>> >> >> >> efforts to diversify the committer base, not so much to capitalize on
>> >> >> >> the Apache brand. The S2Graph project is in production use already
>> >> >> >> inside Kakao, but is not expected to be a Kakao product for external
>> >> >> >> customers. As such, the S2Graph project is not seeking to use the
>> >> >> >> Apache brand as a marketing tool.
>> >> >> >>
>> >> >> >> == Documentation ==
>> >> >> >> Information about S2Graph can be found at
>> >> >> >> https://github.com/kakao/s2graph. The following links provide more
>> >> >> >> information about S2Graph in open source:
>> >> >> >>  * S2Graph web site: https://steamshon.gitbooks.io/s2graph-book/content/
>> >> >> >>  * Codebase at Github: https://github.com/kakao/s2graph
>> >> >> >>  * Issue Tracking: https://github.com/kakao/s2graph/issues
>> >> >> >>  * User community: https://groups.google.com/forum/#!forum/s2graph
>> >> >> >>
>> >> >> >> == Initial Source ==
>> >> >> >>
>> >> >> >> The S2Graph codebase is currently hosted on Github:
>> >> >> >> https://github.com/kakao/s2graph
>> >> >> >>
>> >> >> >> === Source and Intellectual Property Submission Plan ===
>> >> >> >>
>> >> >> >> Currently, the S2Graph codebase is distributed under the Apache 2.0
>> >> >> >> License.
>> >> >> >>
>> >> >> >> == External Dependencies ==
>> >> >> >>
>> >> >> >> Beyond relying on Apache HBase, S2Graph has the following external
>> >> >> >> dependencies:
>> >> >> >>  * Asynchbase (BSD license: http://www.antlr3.org/license.html)
>> >> >> >>  * Mysql (BSD license: https://github.com/julianhyde/sqlline/blob/master/LICENSE)
>> >> >> >>  * Play Framework (Apache 2.0 license: https://github.com/playframework/playframework)
>> >> >> >>  * Scala (https://github.com/scala/scala)
>> >> >> >>  * Spark
>> >> >> >>  * Kafka
>> >> >> >>
>> >> >> >> == Required Resources ==
>> >> >> >>
>> >> >> >> === Mailing list ===
>> >> >> >>
>> >> >> >> We will migrate our mailing lists to the following:
>> >> >> >>  * users@s2graph.incubator.apache.org
>> >> >> >>  * dev@s2graph.incubator.apache.org
>> >> >> >>  * private@s2graph.incubator.apache.org
>> >> >> >>  * commits@s2graph.incubator.apache.org
>> >> >> >>
>> >> >> >> === Source control ===
>> >> >> >>
>> >> >> >> The S2Graph team would like to use Git for source control, due to our
>> >> >> >> current use of Git. We request a writeable Git repo for S2Graph, and
>> >> >> >> mirroring to be set up to Github through INFRA.
>> >> >> >>
>> >> >> >> === Issue Tracking ===
>> >> >> >>
>> >> >> >> S2Graph currently uses the github issue tracking system associated
>> >> >> >> with its github repo: https://github.com/kakao/s2graph/issues. We will
>> >> >> >> migrate to the Apache JIRA:
>> >> >> >> http://issues.apache.org/jira/browse/S2Graph
>> >> >> >>
>> >> >> >> === Other Resources ===
>> >> >> >>
>> >> >> >> Jenkins/Hudson for builds and test running.
>> >> >> >> Wiki for documentation purposes
>> >> >> >> Blog to improve project dissemination
>> >> >> >>
>> >> >> >> == Initial Committers ==
>> >> >> >>
>> >> >> >>  * Doyung Yoon <shom83 at gmail.com>
>> >> >> >>  * Daewon Jeong <blueiur at gmail.com>
>> >> >> >>  * Jaesang Kim <honeysleep at gmail.com>
>> >> >> >>  * Hwansung Yu <deejayfwan at gmail.com>
>> >> >> >>  * Min-Seok Kim <mskim.org at gmail.com>
>> >> >> >>  * Chul Kang <miralchul at gmail.com>
>> >> >> >>
>> >> >> >> == Affiliations ==
>> >> >> >>
>> >> >> >> The initial committers are from one organization: Kakao.
>> >> >> >>  * Doyung Yoon, Kakao
>> >> >> >>  * Daewon Jeong, Kakao
>> >> >> >>  * Jaesang Kim, Kakao
>> >> >> >>  * Hwansung Yu, Kakao
>> >> >> >>  * Min-Seok Kim, Kakao
>> >> >> >>  * Chul Kang, Kakao
>> >> >> >>
>> >> >> >> == Sponsors ==
>> >> >> >>
>> >> >> >> === Champion ===
>> >> >> >> Hyunsik Choi
>> >> >> >>
>> >> >> >> === Nominated Mentors ===
>> >> >> >>
>> >> >> >> === Sponsoring Entity ===
>> >> >> >>
>> >> >> >>  * The Apache Incubator
>> >> >> >>
>> >> >> >> On Fri, Nov 6, 2015 at 4:05 PM, Hyunsik Choi <hyunsik@apache.org> wrote:
>> >> >> >> > Hi Seetharam,
>> >> >> >> >
>> >> >> >> > Thank you for a good question. That seems to be a frequent question
>> >> >> >> > about this project.
>> >> >> >> >
>> >> >> >> > Here is the answer to your question.
>> >> >> >> >
>> >> >> >> > https://steamshon.gitbooks.io/s2graph-book/content/what_is_different_to_titan.html
>> >> >> >> >
>> >> >> >> > I hope that this link is helpful to your understanding.
>> >> >> >> >
>> >> >> >> > Best regards,
>> >> >> >> > Hyunsik
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Fri, Nov 6, 2015 at 3:07 PM, Seetharam Venkatesh
>> >> >> >> > <venkatesh@innerzeal.com> wrote:
>> >> >> >> >> Hi Hyunsik,
>> >> >> >> >>
>> >> >> >> >> The proposal looks interesting. I want to know how this is different
>> >> >> >> >> from existing solutions in the same space, such as Titan.
>> >> >> >> >>
>> >> >> >> >> Thanks!
>> >> >> >> >> Venkatesh
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> On Fri, Nov 6, 2015 at 1:36 PM Hyunsik Choi <hyunsik@apache.org> wrote:
>> >> >> >> >>
>> >> >> >> >>> Hi folks,
>> >> >> >> >>>
>> >> >> >> >>> We would like to start a discussion on S2Graph as an incubation
>> >> >> >> >>> project.
>> >> >> >> >>>
>> >> >> >> >>> S2Graph is a distributed and scalable OLTP graph database built on
>> >> >> >> >>> HBase. It provides interactive queries for vertex/edge/sub-graphs on
>> >> >> >> >>> extremely large graph data sets as well as insertion and update
>> >> >> >> >>> operations.
>> >> >> >> >>>
>> >> >> >> >>> S2Graph was already introduced at Apache: Big Data and HBaseCon this
>> >> >> >> >>> year.
>> >> >> >> >>>
>> >> >> >> >>> The proposal is available at :
>> >> >> >> >>> https://wiki.apache.org/incubator/S2GraphProposal
>> >> >> >> >>>
>> >> >> >> >>> We are looking forward to any feedback. In addition, we are looking
>> >> >> >> >>> for volunteers as mentors.
>> >> >> >> >>>
>> >> >> >> >>> Best regards,
>> >> >> >> >>> Hyunsik
>> >> >> >> >>>
>> >> >> >> >>>
>> >> >> >> >>> ---------------------------------------------------------------------
>> >> >> >> >>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> >> >> >> >>> For additional commands, e-mail: general-help@incubator.apache.org
>> >> >> >> >>>
>> >> >> >> >>>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > Sergio Fernández
>> >> >> > Partner Technology Manager
>> >> >> > Redlink GmbH
>> >> >> > m: +43 6602747925
>> >> >> > e: sergio.fernandez@redlink.co
>> >> >> > w: http://redlink.co
>> >> >>
>> >> >>
>> >> >>
>> >>
>> >>
>> >>
>>
>>
>>


