incubator-general mailing list archives

From Henry Saputra <hsapu...@apache.org>
Subject Re: [DISCUSS] DistributedLog Incubation Proposal
Date Sun, 12 Jun 2016 03:26:13 GMT
Sravya,

Thank you for the interest and willingness to help.
We are definitely looking for more contributors as the project comes to the ASF.

Being a mentor is not a requirement for helping a podling, as you probably
already know.

Looking forward to seeing you in the community :)

- Henry

On Saturday, June 11, 2016, Sravya Tirukkovalur <sravya@apache.org> wrote:

> @Sijie: As I am not an IPMC member, I am not eligible to be a mentor. I am
> figuring out if I can still contribute in some way informally. Will keep
> you posted. So no, I do not think you should add me to the proposal.
>
> Thanks for your interest though!
>
> On Sat, Jun 11, 2016 at 12:38 PM, Sijie Guo <sijie@apache.org> wrote:
>
>>
>>
>> Thanks Eitan for adding me.
>>
>> Sravya, cool! I am glad that you are interested in mentoring this
>> project. Shall I add you to the proposal?
>>
>> Sijie
>>
>>
>> On Saturday, June 11, 2016, Eitan Adler <lists@eitanadler.com> wrote:
>>
>>> + some people explicitly
>>>
>>> On 10 June 2016 at 12:42, Sravya Tirukkovalur <sravya@apache.org> wrote:
>>> > Excited to see DistributedLog come to ASF!
>>> >
>>> > I see that you already have a good list of nominated mentors. As a
>>> > member of a recently graduated project, I can offer (informal)
>>> > mentorship as well if needed. I am not an IPMC member, so I guess I
>>> > cannot be a formal mentor.
>>> >
>>> > Regards,
>>> >
>>> > On Wed, Jun 8, 2016 at 9:34 PM, Sijie Guo <sijie@apache.org> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I would like to propose DistributedLog to be an Apache Incubator
>>> >> project.
>>> >>
>>> >> DistributedLog is a high-performance replicated log service.
>>> >> It offers durability, replication and strong consistency, which
>>> >> provides a fundamental building block for building reliable
>>> >> distributed systems, e.g. replicated-state-machines, general pub/sub
>>> >> systems, distributed databases and distributed queues.
>>> >>
>>> >> Here's a link to the proposal in the Incubator wiki
>>> >>
>>> >> https://wiki.apache.org/incubator/DistributedLogProposal
>>> >>
>>> >> I've also pasted the initial contents below.
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Sijie
>>> >>
>>> >> = Abstract =
>>> >> DistributedLog is a high-performance replicated log service. It offers
>>> >> durability, replication and strong consistency, which provides a
>>> >> fundamental building block for building reliable distributed systems,
>>> >> e.g. replicated-state-machines, general pub/sub systems, distributed
>>> >> databases and distributed queues.
>>> >>
>>> >> See “Building Distributedlog - Twitter’s high performance replicated
>>> >> log service” for details:
>>> >>
>>> >>
>>> >> https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
>>> >>
>>> >> = Proposal =
>>> >> We propose to contribute the DistributedLog codebase and associated
>>> >> artifacts (e.g. documentation, web-site content etc.) to the Apache
>>> >> Software Foundation with the intent of forming a productive,
>>> >> meritocratic and open community around DistributedLog’s continued
>>> >> development, according to the ‘Apache Way’.
>>> >>
>>> >> = Background =
>>> >> Engineers at Twitter began developing DistributedLog in early 2013.
>>> >> DistributedLog was described in a Twitter engineering blog post and
>>> >> presented at the Messaging Meetup in Sep 2015. It was released as an
>>> >> Apache-licensed open-source project on GitHub in May 2016.
>>> >>
>>> >> DistributedLog is a high-performance replicated log service, which
>>> >> provides simple stream-oriented abstractions over log-segments and
>>> >> offers durability, replication and strong consistency for building
>>> >> reliable distributed systems. The features offered by DistributedLog
>>> >> include:
>>> >>  * Simple high-level, stream oriented interface
>>> >>  * Naming and metadata scheme for managing streams and other entities
>>> >>  * Log data management policies, including data segmentation and data
>>> >> retention
>>> >>  * Fast write pipeline leveraging batching and compression
>>> >>  * Fast read mechanism leveraging long-poll and read-ahead caching
>>> >>  * Service tiers supporting writer fan-in and reader fan-out
>>> >>  * Geo-replicated logs
>>> >>
>>> >> DistributedLog’s most important benefit is high performance with a
>>> >> strong durability guarantee, making it well suited to running
>>> >> workloads ranging from distributed database journaling to
>>> >> real-time stream computing. Its modern, layered architecture makes it
>>> >> easy to run the service tiers in multi-tenant datacenter environments
>>> >> such as Apache Mesos or cloud environments such as EC2.
>>> >>
>>> >> = Rationale =
>>> >> DistributedLog is designed to provide fundamental features like
>>> >> high performance, durability and strong consistency to anyone who is
>>> >> building reliable distributed systems, in a simple and efficient way.
>>> >>
>>> >> We believe that the ASF is the right venue to foster an open-source
>>> >> community around DistributedLog’s development. We expect that
>>> >> DistributedLog will benefit from collaboration with related Apache
>>> >> projects, and under the auspices of the ASF will attract talented
>>> >> contributors who will push DistributedLog’s development forward at a
>>> >> faster pace.
>>> >>
>>> >> We believe that the timing is right for DistributedLog’s development
>>> >> to move to the ASF: DistributedLog has already run in production at
>>> >> Twitter for 3 years and served various workloads including a
>>> >> distributed database journal, reliable cross datacenter replication,
>>> >> search ingestion, and general pub/sub messaging. The project is stable.
>>> >> We are excited to see where an ASF-based community can take
>>> >> DistributedLog.
>>> >>
>>> >> = Current Status =
>>> >> DistributedLog is a stable project that has been used in production at
>>> >> Twitter for 3 years. The source code is public at github.com/twitter,
>>> >> which will seed the Apache git repository.
>>> >>
>>> >> = Meritocracy =
>>> >> We understand the central importance of meritocracy to the Apache Way.
>>> >> We will work to establish a welcoming, fair and meritocratic
>>> >> community. Several companies have already expressed interest in this
>>> >> project, and we intend to invite additional developers to participate.
>>> >> We look forward to growing a rich user and developer community.
>>> >>
>>> >> = Community =
>>> >> There is a large need for a performant replicated log service for
>>> >> applications such as distributed databases, distributed transactional
>>> >> systems, replicated-state-machines and pub/sub messaging/queuing. We
>>> >> want to attract more developers to the project, and we believe that
>>> >> the ASF’s open and meritocratic philosophy will help us with this. We
>>> >> note the success of other similar projects already part of the ASF,
>>> >> like Kafka.
>>> >>
>>> >> = Core Developers =
>>> >> DistributedLog is actively developed within Twitter. Most of the
>>> >> developers are from Twitter. Many of them are committers or PMC
>>> >> members of Apache BookKeeper. Others aren’t currently affiliated with
>>> >> the ASF, so they will require new ICLAs.
>>> >>
>>> >> = Alignment =
>>> >> DistributedLog is related to several other Apache projects:
>>> >>  * DistributedLog stores log segments as Ledgers in Apache BookKeeper.
>>> >>  * DistributedLog uses Apache ZooKeeper for naming and metadata
>>> >> management and tracking the ownership of logs.
>>> >>  * DistributedLog uses Apache Thrift as its RPC and serialization
>>> >> framework.
>>> >>  * In the long term, DistributedLog’s data will be stored in Apache
>>> >> Hadoop clusters powered by the HDFS filesystem for archives and backup.
>>> >>
>>> >> = Known Risks =
>>> >>
>>> >> == Orphaned Products ==
>>> >> DistributedLog is used as the fundamental messaging infrastructure at
>>> >> Twitter. It has been serving production traffic for online database
>>> >> systems, search ingestion and a general pub/sub system. Twitter
>>> >> remains committed to developing and supporting the project. Twitter
>>> >> has a strong track record of standing behind projects that were
>>> >> contributed to the ASF by its employees, including Apache Mesos,
>>> >> Apache Aurora, Apache BookKeeper and Apache Hadoop. Many companies
>>> >> are interested in using it in production.
>>> >>
>>> >> == Inexperience with Open Source ==
>>> >> The core developers of DistributedLog are committers of Apache
>>> >> BookKeeper. Although other committers on the initial list are not
>>> >> committers or have less experience with the ASF, they are already
>>> >> active in the Apache BookKeeper community. We are confident that the
>>> >> project can be run in accordance with Apache principles on an ongoing
>>> >> basis.
>>> >>
>>> >> == Homogeneous Developers ==
>>> >> The initial committers are from Twitter. We hope to encourage
>>> >> contributions from other developers and grow them into committers
>>> >> after they have had time to continue their contributions.
>>> >>
>>> >> == Reliance on Salaried Developers ==
>>> >> Many of DistributedLog’s initial set of committers work full-time on
>>> >> DistributedLog, and are paid to do so. However, as mentioned
>>> >> elsewhere, we anticipate growth in the developer community which we
>>> >> hope will include people from industry, hobbyists, and academics who
>>> >> have an interest in distributed messaging systems.
>>> >>
>>> >> == Relationships with Other Apache Products ==
>>> >> DistributedLog uses Apache BookKeeper to store log segments and Apache
>>> >> ZooKeeper to store log metadata and manage log namespaces. It provides
>>> >> an end-to-end solution for replicated logs, to make building reliable
>>> >> distributed systems much easier. Unlike Kafka or ActiveMQ,
>>> >> DistributedLog is not a full-fledged pub/sub, queuing or messaging
>>> >> system. Instead, it aims to provide a fundamental building
>>> >> block for other distributed systems, offering durability, replication
>>> >> and consistency. It could therefore be used by other distributed
>>> >> systems, for example as a transaction log for replicated state
>>> >> machines (e.g., HDFS NameNode), a WAL for distributed databases (e.g.,
>>> >> HBase), a journal for in-memory services (e.g., Kestrel), or even a
>>> >> storage backend for a full-fledged messaging system.
>>> >>
>>> >> == An Excessive Fascination with the Apache Brand ==
>>> >> DistributedLog builds on two existing top-level projects, Apache
>>> >> BookKeeper and Apache ZooKeeper. Some of the core developers actively
>>> >> participate in both projects and understand well the implications of
>>> >> being hosted by Apache. We would like this project to build on the
>>> >> same core values of the ASF and to grow a community based on
>>> >> meritocracy. Also, several other projects already hosted by the ASF
>>> >> in this space of reliable messaging overlap with DistributedLog in
>>> >> interests and scope. Consequently, the combination of all these
>>> >> observations makes us believe that DistributedLog should be hosted by
>>> >> the ASF.
>>> >>
>>> >> = Documentation =
>>> >> Building DistributedLog: Twitter’s high performance replicated log
>>> >> service (
>>> >>
>>> >> https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
>>> >> )
>>> >>
>>> >> Documentation is located at http://distributedlog.io.
>>> >>
>>> >> = Initial Source =
>>> >> DistributedLog’s initial source contribution will come from
>>> >> http://github.com/twitter/distributedlog/.
>>> >>
>>> >> = External Dependencies =
>>> >> DistributedLog depends upon a number of third-party libraries, which
>>> >> we list below.
>>> >>  * Apache BookKeeper (Apache Software License v2.0)
>>> >>  * Apache Commons (Apache Software License v2.0)
>>> >>  * Apache Maven (Apache Software License v2.0)
>>> >>  * Apache Thrift (Apache Software License v2.0)
>>> >>  * Apache ZooKeeper (Apache Software License v2.0)
>>> >>  * Google Guava (Apache Software License v2.0)
>>> >>  * Mockito (MIT License)
>>> >>  * Junit (Eclipse Public License 1.0)
>>> >>  * LZ4-java (Apache Software License v2.0)
>>> >>  * SLF4J (MIT License)
>>> >>  * Twitter Finagle (Apache Software License v2.0)
>>> >>  * Twitter Scrooge (Apache Software License v2.0)
>>> >>  * Twitter Util (Apache Software License v2.0)
>>> >>
>>> >> = Required Resources =
>>> >> We request that the following resources be created for the project to use:
>>> >>
>>> >> == Mailing lists ==
>>> >>  * private@distributedlog.incubator.apache.org (moderated subscriptions)
>>> >>  * commits@distributedlog.incubator.apache.org
>>> >>  * dev@distributedlog.incubator.apache.org
>>> >>  * user@distributedlog.incubator.apache.org
>>> >>
>>> >> == Git repository ==
>>> >> https://git.apache.org/distributedlog.git
>>> >>
>>> >> == JIRA instance ==
>>> >> JIRA project DLOG (DLOG or DL)
>>> >>
>>> >> = Initial Committers =
>>> >>  * Sijie Guo (Apache BookKeeper Committer, Twitter)
>>> >>  * Robin Dhamankar (Apache BookKeeper Committer)
>>> >>  * Leigh Stewart (Twitter)
>>> >>  * Dave Rusek (Twitter)
>>> >>  * Honggang Zhang (Twitter)
>>> >>  * Jordan Bull (Twitter)
>>> >>  * Satish Kotha (Twitter)
>>> >>  * Aniruddha Laud
>>> >>  * Franck Cuny (Twitter)
>>> >>  * Eitan Adler (Twitter)
>>> >>
>>> >> == Affiliations ==
>>> >>
>>> >> Most of the initial committers are employees of Twitter, except Robin
>>> >> Dhamankar and Aniruddha Laud.
>>> >>
>>> >> = Sponsors =
>>> >>
>>> >> == Champion ==
>>> >>
>>> >> Flavio Junqueira
>>> >>
>>> >> == Nominated Mentors ==
>>> >>
>>> >>  * Flavio Junqueira
>>> >>  * Chris Nauroth
>>> >>  * Henry Saputra
>>> >>
>>> >> = Sponsoring Entity =
>>> >>
>>> >> We ask the Apache Incubator PMC to sponsor this proposal.
>>> >>
>>>
>>>
>>>
>>> --
>>> Eitan Adler
>>>
>>
>
