incubator-general mailing list archives

From Eitan Adler <li...@eitanadler.com>
Subject Re: [DISCUSS] DistributedLog Incubation Proposal
Date Sat, 11 Jun 2016 16:20:18 GMT
+ some people explicitly

On 10 June 2016 at 12:42, Sravya Tirukkovalur <sravya@apache.org> wrote:
> Excited to see DistributedLog come to ASF!
>
> I see that you already have a good list of nominated mentors. As a member of
> a recently graduated project, I can offer mentorship (informal) as well if
> needed. I am not an IPMC member, so I guess I cannot be a formal mentor.
>
> Regards,
>
> On Wed, Jun 8, 2016 at 9:34 PM, Sijie Guo <sijie@apache.org> wrote:
>
>> Hi,
>>
>> I would like to propose DistributedLog to be an Apache Incubator project.
>>
>> DistributedLog is a high performance replicated log service.
>> It offers durability, replication and strong consistency, which provides
>> a fundamental building block for building reliable distributed systems,
>> e.g. replicated-state-machines, general pub/sub systems, distributed
>> databases, and distributed queues.
>>
>> Here's a link to the proposal in the Incubator wiki
>>
>> https://wiki.apache.org/incubator/DistributedLogProposal
>>
>> I've also pasted the initial contents below.
>>
>> Thanks,
>>
>> Sijie
>>
>> = Abstract =
>> DistributedLog is a high-performance replicated log service. It offers
>> durability, replication and strong consistency, which provides a
>> fundamental building block for building reliable distributed systems,
>> e.g. replicated-state-machines, general pub/sub systems, distributed
>> databases, and distributed queues.
>>
>> See “Building DistributedLog - Twitter’s high performance replicated
>> log service” for details:
>>
>> https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
>>
>> = Proposal =
>> We propose to contribute the DistributedLog codebase and associated
>> artifacts (e.g. documentation, web-site content etc.) to the Apache
>> Software Foundation with the intent of forming a productive,
>> meritocratic and open community around DistributedLog’s continued
>> development, according to the ‘Apache Way’.
>>
>> = Background =
>> Engineers at Twitter began developing DistributedLog in early 2013.
>> DistributedLog was described in a Twitter engineering blog post and
>> presented at the Messaging Meetup in Sep 2015. It was released as
>> an Apache-licensed open-source project on GitHub in May 2016.
>>
>> DistributedLog is a high-performance replicated log service, which
>> provides simple stream-oriented abstractions over log-segments and
>> offers durability, replication and strong consistency for building
>> reliable distributed systems. The features offered by DistributedLog
>> include:
>>  * Simple high-level, stream oriented interface
>>  * Naming and metadata scheme for managing streams and other entities
>>  * Log data management policies, including data segmentation and data
>> retention
>>  * Fast write pipeline leveraging batching and compression
>>  * Fast read mechanism leveraging long-poll and read-ahead caching
>>  * Service tiers supporting writer fan-in and reader fan-out
>>  * Geo-replicated logs
>>
>> DistributedLog’s most important benefit is high performance with a
>> strong durability guarantee, making it well suited to workloads
>> ranging from distributed database journaling to
>> real-time stream computing. Its modern, layered architecture makes it
>> easy to run the service tiers in multi-tenant datacenter environments
>> such as Apache Mesos or cloud environments such as EC2.
>>
>> = Rationale =
>> DistributedLog is designed to provide core features such as
>> high performance, durability and strong consistency to anyone who is
>> building reliable distributed systems, in a simple and efficient way.
>>
>> We believe that the ASF is the right venue to foster an open-source
>> community around DistributedLog’s development. We expect that
>> DistributedLog will benefit from collaboration with related Apache
>> projects, and under the auspices of the ASF will attract talented
>> contributors who will push DistributedLog’s development forward at a
>> faster pace.
>>
>> We believe that the timing is right for DistributedLog’s development
>> to move to the ASF: DistributedLog has already run in production at
>> Twitter for 3 years and served various workloads including a
>> distributed database journal, reliable cross datacenter replication,
>> search ingestion, and general pub/sub messaging. The project is stable.
>> We are excited to see where an ASF-based community can take
>> DistributedLog.
>>
>> = Current Status =
>> DistributedLog is a stable project that has been used in production at
>> Twitter for 3 years. The source code is public at github.com/twitter,
>> which will seed the Apache git repository.
>>
>> = Meritocracy =
>> We understand the central importance of meritocracy to the Apache Way.
>> We will work to establish a welcoming, fair and meritocratic
>> community. Several companies have already expressed interest in this
>> project, and we intend to invite additional developers to participate.
>> We look forward to growing a rich user and developer community.
>>
>> = Community =
>> There is a large need for a performant replicated log service for
>> applications such as distributed databases, distributed transactional
>> systems, replicated-state-machines and pub/sub messaging/queuing. We
>> want to attract more developers to the project, and we believe that
>> the ASF’s open and meritocratic philosophy will help us with this. We
>> note the success of other similar projects already part of the ASF,
>> like Kafka.
>>
>> = Core Developers =
>> DistributedLog is actively developed within Twitter. Most of the
>> developers are from Twitter. Many of them are committers or PMC
>> members of Apache BookKeeper. Others aren’t currently affiliated with
>> the ASF so they will require new ICLAs.
>>
>> = Alignment =
>> DistributedLog is related to several other Apache projects:
>>  * DistributedLog stores log segments as Ledgers in Apache BookKeeper.
>>  * DistributedLog uses Apache ZooKeeper for naming and metadata
>> management and tracking the ownership of logs.
>>  * DistributedLog uses Apache Thrift as its RPC and serialization
>> framework.
>>  * In the long term, DistributedLog’s data will be stored in Apache
>> Hadoop clusters powered by the HDFS filesystem for archives and backup.
>>
>> = Known Risks =
>>
>> == Orphaned Products ==
>> DistributedLog is used as the fundamental messaging infrastructure at
>> Twitter. It has been serving production traffic for online database
>> systems, search ingestion and a general pub/sub system. Twitter
>> remains committed to developing and supporting the project. Twitter
>> has a strong track record in standing behind projects that were
>> contributed to the ASF by its employees, including Apache Mesos,
>> Apache Aurora, Apache BookKeeper and Apache Hadoop. Many companies
>> are interested in using DistributedLog in production.
>>
>> == Inexperience with Open Source ==
>> The core developers of DistributedLog are committers of Apache
>> BookKeeper. Although others on the initial list are not ASF
>> committers or have less experience with the ASF, they already are
>> active in the Apache BookKeeper community. We are confident that the
>> project can be run in accordance with Apache principles on an ongoing
>> basis.
>>
>> == Homogeneous Developers ==
>> The initial committers are from Twitter. We hope to encourage
>> contributions from other developers and grow them into committers
>> after they have had time to continue their contributions.
>>
>> == Reliance on Salaried Developers ==
>> Many of DistributedLog’s initial set of committers work full-time on
>> DistributedLog, and are paid to do so. However, as mentioned
>> elsewhere, we anticipate growth in the developer community which we
>> hope will include people from industry, hobbyists, and academics who
>> have an interest in distributed messaging systems.
>>
>> == Relationships with Other Apache Products ==
>> DistributedLog uses Apache BookKeeper to store log segments and Apache
>> ZooKeeper to store log metadata and manage log namespaces. It provides
>> an end-to-end solution for replicated logs, to make building reliable
>> distributed systems much easier. Unlike Kafka or ActiveMQ,
>> DistributedLog is not a full-fledged pub/sub, queuing or messaging
>> system. Instead, it aims to provide a fundamental building
>> block for other distributed systems, offering durability, replication
>> and consistency. It could therefore be used by other distributed
>> systems, for example as a transaction log for replicated state
>> machines (e.g., HDFS NameNode), a WAL for distributed databases
>> (e.g., HBase), a journal for in-memory services (e.g., Kestrel), or
>> even a storage backend for a full-fledged messaging system.
>>
>> == An Excessive Fascination with the Apache Brand ==
>> DistributedLog builds on two existing top-level projects, Apache
>> BookKeeper and Apache ZooKeeper. Some of the core developers actively
>> participate in both projects and understand well the implications of
>> being hosted by Apache. We would like this project to build on the
>> same core values of the ASF and to grow a community based on meritocracy.
>> Also, there are several other projects already hosted by the ASF in the
>> space of reliable messaging that overlap with DistributedLog in
>> interests and scope. Consequently, the combination of all these
>> observations makes us believe that DistributedLog should be hosted by
>> the ASF.
>>
>> = Documentation =
>> Building DistributedLog: Twitter’s high performance replicated log
>> service (
>> https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
>> )
>>
>> Documentation is located at http://distributedlog.io.
>>
>> = Initial Source =
>> DistributedLog’s initial source contribution will come from
>> http://github.com/twitter/distributedlog/.
>>
>> = External Dependencies =
>> DistributedLog depends upon a number of third-party libraries, which
>> we list below.
>>  * Apache BookKeeper (Apache Software License v2.0)
>>  * Apache Commons (Apache Software License v2.0)
>>  * Apache Maven (Apache Software License v2.0)
>>  * Apache Thrift (Apache Software License v2.0)
>>  * Apache ZooKeeper (Apache Software License v2.0)
>>  * Google Guava (Apache Software License v2.0)
>>  * Mockito (MIT License)
>>  * JUnit (Eclipse Public License 1.0)
>>  * LZ4-java (Apache Software License v2.0)
>>  * SLF4J (MIT License)
>>  * Twitter Finagle (Apache Software License v2.0)
>>  * Twitter Scrooge (Apache Software License v2.0)
>>  * Twitter Util (Apache Software License v2.0)
>>
>> = Required Resources =
>> We request that the following resources be created for the project to use:
>>
>> == Mailing lists ==
>>  * private@distributedlog.incubator.apache.org (moderated subscriptions)
>>  * commits@distributedlog.incubator.apache.org
>>  * dev@distributedlog.incubator.apache.org
>>  * user@distributedlog.incubator.apache.org
>>
>> == Git repository ==
>> https://git.apache.org/distributedlog.git
>>
>> == JIRA instance ==
>> JIRA project DLOG (DLOG or DL)
>>
>> = Initial Committers =
>>  * Sijie Guo (Apache BookKeeper Committer, Twitter)
>>  * Robin Dhamankar (Apache BookKeeper Committer)
>>  * Leigh Stewart (Twitter)
>>  * Dave Rusek (Twitter)
>>  * Honggang Zhang (Twitter)
>>  * Jordan Bull (Twitter)
>>  * Satish Kotha (Twitter)
>>  * Aniruddha Laud
>>  * Franck Cuny (Twitter)
>>  * Eitan Adler (Twitter)
>>
>> == Affiliations ==
>>
>> Most of the initial committers are employees of Twitter, except Robin
>> Dhamankar and Aniruddha Laud.
>>
>> = Sponsors =
>>
>> == Champion ==
>>
>> Flavio Junqueira
>>
>> == Nominated Mentors ==
>>
>>  * Flavio Junqueira
>>  * Chris Nauroth
>>  * Henry Saputra
>>
>> = Sponsoring Entity =
>>
>> We ask the Apache Incubator PMC to sponsor this proposal.
>>



-- 
Eitan Adler

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org

