From general-return-64964-archive-asf-public=cust-asf.ponee.io@incubator.apache.org Fri Jul 6 03:40:19 2018
Mailing-List: contact general-help@incubator.apache.org; run by ezmlm
Reply-To: general@incubator.apache.org
MIME-Version: 1.0
From: Xin Wang
Date: Fri, 6 Jul 2018 09:40:04 +0800
Subject: Re: [VOTE] Accept Doris into the Apache Incubator
To: general@incubator.apache.org
Content-Type: multipart/alternative; boundary="000000000000e6c20205704abea6"

--000000000000e6c20205704abea6
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

+1

On Fri, Jul 6, 2018 at 9:37 AM, Charith Elvitigala wrote:

> +1
>
> On Fri, 6 Jul 2018 at 06:21, Tan,Zhongyi wrote:
>
> > +1 (non-binding)
> >
> > From: Dave Fisher
> > Reply-To:
> > Date: Friday, July 6, 2018, 3:22 AM
> > To:
> > Subject: [VOTE] Accept Doris into the Apache Incubator
> >
> > Hi All,
> >
> > I would like to start a VOTE to bring the Doris project in as an Apache
> > Incubator podling.
> >
> > The ASF voting rules are described at:
> >
> > https://www.apache.org/foundation/voting.html
> >
> > A vote for accepting a new Apache Incubator podling is a majority vote
> > for which only Incubator PMC member votes are binding.
> >
> > This vote will run for at least 72 hours. Please VOTE as follows:
> > [] +1 Accept Doris into the Apache Incubator
> > [] +0 Abstain.
> > [] -1 Do not accept Doris into the Apache Incubator because ...
> >
> > The proposal is listed below, but you can also access it on the wiki:
> >
> > https://wiki.apache.org/incubator/DorisProposal
> >
> > Best regards,
> > Dave
> >
> > = Apache Doris =
> >
> > == Abstract ==
> >
> > Doris is an MPP-based interactive SQL data warehouse for reporting and
> > analysis.
> >
> > == Proposal ==
> >
> > We propose to contribute the Doris codebase and associated artifacts
> > (e.g. documentation, web-site content etc.) to the Apache Software
> > Foundation, and aim to build an open community around Doris’s continued
> > development in the ‘Apache Way’.
> >
> > === Overview of Doris ===
> >
> > Doris’s implementation consists of two daemons: Frontend (FE) and
> > Backend (BE).
> >
> > **Frontend daemon** consists of query coordinator and catalog manager.
> > Query coordinator is responsible for receiving users’ SQL queries,
> > compiling queries and managing query execution. Catalog manager is
> > responsible for managing metadata such as databases, tables, partitions,
> > and replicas. Several frontend daemons can be deployed to guarantee
> > fault tolerance and load balancing.
> >
> > **Backend daemon** stores the data and executes the query fragments.
> > Many backend daemons can also be deployed to provide scalability and
> > fault tolerance.
> >
> > A typical Doris cluster generally consists of several frontend daemons
> > and dozens to hundreds of backend daemons.
> >
> > Users can use MySQL client tools to connect to any frontend daemon and
> > submit SQL queries. Frontend receives the query and compiles it into
> > query plans executable by the Backend. Then Frontend sends the query plan
> > fragments to Backend. Backend will build a query execution DAG. Data is
> > fetched and pipelined into the DAG. The final result response is sent to
> > the client via Frontend. The distribution of query fragment execution
> > takes minimizing data movement and maximizing scan locality as the main
> > goal.
> >
> > == Background ==
> >
> > At Baidu, prior to Doris, different tools were deployed to solve diverse
> > requirements in many ways. When a use case required the simultaneous
> > availability of capabilities that cannot all be provided by a single
> > tool, users were forced to build hybrid architectures that stitch
> > multiple tools together, but we believe that they shouldn’t need to
> > accept such inherent complexity. A storage system built to provide great
> > performance across a broad range of workloads provides a more elegant
> > solution to the problems that hybrid architectures aim to solve. Doris is
> > the solution.
> >
> > Doris is designed to be a simple and single tightly coupled system, not
> > depending on other systems.
> > Doris provides high-concurrency, low-latency point query performance,
> > but also provides high-throughput queries for ad-hoc analysis. Doris
> > provides bulk-batch data loading, but also provides near real-time
> > mini-batch data loading. Doris also provides high availability,
> > reliability, fault tolerance, and scalability.
> >
> > == Rationale ==
> >
> > Doris mainly integrates the technology of Google Mesa and Apache Impala.
> >
> > Mesa is a highly scalable analytic data storage system that stores
> > critical measurement data related to Google's Internet advertising
> > business. Mesa is designed to satisfy a complex and challenging set of
> > users’ and systems’ requirements, including near real-time data ingestion
> > and query ability, as well as high availability, reliability, fault
> > tolerance, and scalability for large data and query volumes.
> >
> > Impala is a modern, open-source MPP SQL engine architected from the
> > ground up for the Hadoop data processing environment. At present, by
> > virtue of its superior performance and rich functionality, Impala is
> > comparable to many commercial MPP database query engines. Mesa can
> > satisfy many of our storage requirements; however, Mesa itself does not
> > provide a SQL query engine. Impala is a very good MPP SQL query engine,
> > but it lacks a perfect distributed storage engine. So in the end we chose
> > the combination of these two technologies.
> >
> > Learning from Mesa’s data model, we developed a distributed storage
> > engine. Unlike Mesa, this storage engine does not rely on any distributed
> > file system. Then we deeply integrated this storage engine with the
> > Impala query engine. Query compiling, query execution coordination and
> > catalog management of the storage engine are integrated into the frontend
> > daemon; query execution and data storage are integrated into the backend
> > daemon.
> > With this integration, we implemented a single, full-featured,
> > high-performance, state-of-the-art MPP database while maintaining
> > simplicity.
> >
> > == Current Status ==
> >
> > Doris has been an open source project on GitHub
> > (https://github.com/baidu/palo).
> >
> > === Meritocracy ===
> >
> > Doris has been deployed in production at Baidu and is used by more than
> > 200 lines of business. It has demonstrated great performance benefits and
> > has proved to be a better way to do reporting and analysis on big data.
> > Still, we look forward to growing a rich user and developer community.
> >
> > === Community ===
> >
> > Doris seeks to develop developer and user communities during incubation.
> >
> > Doris makes use of Apache Impala. It was identified during early review
> > of the proposal that the Doris community will need to work with Impala to
> > define a suitable API.
> >
> > === Core Developers ===
> >
> > * Ruyue Ma (https://github.com/maruyue, maruyue@baidu dot com)
> > * Chun Zhao (https://github.com/imay, buaa.zhaoc@gmail dot com)
> > * Mingyu Chen (https://github.com/morningman, chenmingyu@baidu dot com)
> > * De Li (https://github.com/lide-reed, mailtolide@sina dot com)
> > * Hao Chen (https://github.com/chenhao7253886, chenhao16@baidu dot com)
> > * Chaoyong Li (https://github.com/cyongli, lichaoyong@baidu dot com)
> > * Bin Lin (https://github.com/lingbin, lingbinlb@gmail dot com)
> >
> > === Alignment ===
> >
> > Doris is related to several other Apache projects:
> >
> > * Doris can also read data stored in Apache Hadoop clusters powered by
> >   the HDFS filesystem.
> > * Doris is closely integrated with Impala, which has graduated from
> >   Apache Incubator.
> > * Doris uses Apache Thrift as its RPC and serialization framework of
> >   choice.
> >
> > == Known Risks ==
> >
> > === Orphaned Products ===
> >
> > The core developers of the Doris team plan to work full time on this
> > project. There is very little risk of Doris getting orphaned since at
> > least one large company (Baidu) is extensively using it in production.
> > For example, currently there are more than 200 use cases using Doris in
> > production. Furthermore, since Doris was open sourced at the beginning of
> > October 2017, it has received more than 660 stars and been forked nearly
> > 170 times. We plan to extend and diversify this community further through
> > Apache.
> >
> > === Inexperience with Open Source ===
> >
> > The core developers are all active users and followers of open source.
> > They are already committers and contributors to the Doris GitHub project.
> > All have been involved with the source code that has been released under
> > an open source license, and several of them also have experience
> > developing code in an open source environment. Though the core set of
> > developers does not have Apache open source experience, there are plans
> > to onboard individuals with Apache open source experience on to the
> > project.
> >
> > === Homogenous Developers ===
> >
> > Most of the core developers are from Baidu, but after Doris was open
> > sourced, Doris received a lot of bug fixes and enhancements from other
> > developers not working at Baidu.
> >
> > === Reliance on Salaried Developers ===
> >
> > Baidu invested in Doris as the OLAP solution and some of its key
> > engineers are working full time on the project. In addition, since there
> > is a growing Big Data need for scalable OLAP solutions, we look forward
> > to other Apache developers and researchers contributing to the project.
> > Also key to addressing the risk associated with relying on salaried
> > developers from a single entity is to increase the diversity of the
> > contributors and actively lobby for domain experts in the BI space to
> > contribute. Apache Doris intends to do this.
> >
> > === An Excessive Fascination with the Apache Brand ===
> >
> > Doris is proposing to enter incubation at Apache in order to help efforts
> > to diversify the committer base, not so much to capitalize on the Apache
> > brand. The Doris project is in production use already inside Baidu, but
> > is not expected to be a Baidu product for external customers. As such,
> > the Doris project is not seeking to use the Apache brand as a marketing
> > tool.
> >
> > == Documentation ==
> >
> > Information about Doris can be found at https://github.com/baidu/palo.
> > The following links provide more information about Doris in open source:
> >
> > * Doris wiki site: https://github.com/baidu/palo/wiki
> > * Codebase at GitHub: https://github.com/baidu/palo
> > * Issue Tracking: https://github.com/baidu/palo/issues
> > * Overview: https://github.com/baidu/Doris/wiki/palo-Overview
> > * FAQ: https://github.com/baidu/palo/wiki/palo-FAQ
> >
> > == Initial Source ==
> >
> > Doris has been under development since 2017 by a team of engineers at
> > Baidu Inc. It is currently hosted on GitHub.com under an Apache license
> > at https://github.com/baidu/palo.
> >
> > == External Dependencies ==
> >
> > Doris has the following external dependencies.
> >
> > * Google gflags (BSD)
> > * Google glog (BSD)
> > * Apache Thrift (Apache Software License v2.0)
> > * Apache Commons (Apache Software License v2.0)
> > * Boost (Boost Software License)
> > * rapidjson (Tencent)
> > * Google RE2 (BSD-style)
> > * lz4 (BSD)
> > * snappy (BSD)
> > * Twitter Bootstrap (Apache Software License v2.0)
> > * d3 (BSD)
> > * LLVM (BSD-like)
> >
> > Build and test dependencies:
> >
> > * Apache Ant (Apache Software License v2.0)
> > * Apache Maven (Apache Software License v2.0)
> > * cmake (BSD)
> > * clang (BSD)
> > * Google gtest (Apache Software License v2.0)
> >
> > == Required Resources ==
> >
> > === Mailing List ===
> >
> > There are currently no mailing lists. The usual mailing lists are
> > expected to be set up when entering incubation:
> >
> > * private@doris.incubator.apache.org
> > * dev@doris.incubator.apache.org
> > * commits@doris.incubator.apache.org
> >
> > === Subversion Directory ===
> >
> > Upon entering incubation, we want to move (or copy) the existing repo
> > from https://github.com/baidu/palo to Apache infrastructure at
> > https://github.com/apache/incubator-doris.
> >
> > === Issue Tracking ===
> >
> > Doris currently uses GitHub to track issues. We would like to continue to
> > do so while we discuss migration possibilities with the ASF Infra
> > committee.
> >
> > === Other Resources ===
> >
> > The existing code already has unit tests, so we will make use of existing
> > Apache continuous testing infrastructure. The resulting load should not
> > be very large.
> >
> > == Initial Committers ==
> >
> > * Ruyue Ma (https://github.com/maruyue, maruyue@baidu dot com)
> > * Chun Zhao (https://github.com/imay, buaa.zhaoc@gmail dot com)
> > * Mingyu Chen (https://github.com/morningman, chenmingyu@baidu dot com)
> > * De Li (https://github.com/lide-reed, mailtolide@sina dot com)
> > * Hao Chen (https://github.com/chenhao7253886, chenhao16@baidu dot com)
> > * Chaoyong Li (https://github.com/cyongli, lichaoyong@baidu dot com)
> > * Bin Lin (https://github.com/lingbin, lingbinlb@gmail dot com)
> > * Sijie Guo (guosijie@gmail dot com)
> > * Zheng Shao (zshao@apache.org)
> >
> > == Affiliations ==
> >
> > The initial committers are employees of Baidu Inc.
> >
> > == Sponsors ==
> >
> > === Champion ===
> >
> > * Dave Fisher, wave@apache.org
> >
> > === Nominated Mentors ===
> >
> > * Luke Han, lukehan@apache.org
> > * Dave Fisher, wave@apache.org
> > * Willem Jiang, ningjiang@apache.org
> >
> > === Sponsoring Entity ===
> >
> > We are requesting the Incubator to sponsor this project.
> >
> > --
> > Charitha Elvitigala
> >
>

--000000000000e6c20205704abea6--