From: Julian Hyde
To: general@incubator.apache.org
Reply-To: general@incubator.apache.org
Subject: Re: [VOTE] Superset Proposal for Apache Incubator
Date: Tue, 25 Apr 2017 13:27:01 -0700

+1 binding

> On Apr 25, 2017, at 12:48 PM, moon soo Lee wrote:
>
> +1 (non-binding)
>
> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan wrote:
>
>> +1 (binding)
>>
>> Thanks,
>> Ashutosh
>>
>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han wrote:
>>
>>> +1 binding
>>>
>>> Love to see Superset become a new incubator project.
>>>
>>>
>>> Best Regards!
>>> ---------------------
>>>
>>> Luke Han
>>>
>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng wrote:
>>>
>>>> Dear Apache Incubator Community,
>>>>
>>>> We have updated the Superset proposal (copied below) for Apache
>>>> Incubation with an additional mentor (Luke Han - luke.han@apache.org),
>>>> and would like to start a vote thread for acceptance into the
>>>> incubator.
>>>>
>>>> Our team is excited to share Superset with the Apache community and we
>>>> hope for your continued support!
>>>>
>>>> Cheers,
>>>> Jeff & the Superset Team
>>>>
>>>>
>>>> = Superset =
>>>>
>>>> == Abstract ==
>>>> Superset is an enterprise-ready web application for data exploration,
>>>> data visualization and dashboarding.
>>>>
>>>> == Proposal ==
>>>> Superset is business intelligence (BI) software that helps modern
>>>> organizations visualize and interact with their data.
>>>> Superset enables users to explore data from a variety of databases,
>>>> assemble beautiful dashboards and share their findings. Superset works
>>>> neatly with all modern SQL-speaking databases, and integrates with
>>>> Druid.io to provide real-time, interactive, blazing fast data access to
>>>> large datasets.
>>>>
>>>> == Background ==
>>>> Data is mission critical. To succeed in this era, organizations need to
>>>> provide low-friction, intuitive and interactive access to data. It is
>>>> paramount for knowledge workers to be capable of answering their own
>>>> questions by querying, exploring and visualizing data.
>>>>
>>>> The entire business intelligence industry has pivoted from a model of
>>>> centralized top-down platforms driven by IT organizations to
>>>> self-service analytics and agile workflows by any user. This shift
>>>> unblocks centralized service bottlenecks for creating data
>>>> visualizations while also creating an environment that is iterative and
>>>> fast-moving. This means that business intelligence software must also
>>>> be easy and delightful to use. Self-service analytics doesn’t mean that
>>>> admin and governance features are not needed. Modern BI tools provide
>>>> fine-grained access controls and auditing capabilities to understand
>>>> how data is being used. Superset is a solution that delivers on all of
>>>> these vectors.
>>>>
>>>> The technology stack is also constantly morphing - vendors are
>>>> struggling to provide cheap, quick and easy solutions to access data.
>>>> Business intelligence users are finding existing solutions lacking as
>>>> these software products either disregard or react slowly to recent
>>>> game-changing technologies like Druid.io, PrestoDB, Apache Drill,
>>>> Apache Kylin, d3.js, React.js and IPython’s Jupyter.
>>>>
>>>> == Rationale ==
>>>> Business intelligence is more relevant today than at any other point in
>>>> history. Organizations are currently very limited in options for open
>>>> source data visualization solutions, especially solutions that are both
>>>> self-service and enterprise-ready. Every company informing their
>>>> decisions with data needs a BI tool.
>>>>
>>>> We believe that Superset will be a strong complement to existing Apache
>>>> Software Foundation technologies by offering scalable user interactions
>>>> to distributed storage and computation solutions. Users will often find
>>>> that Superset can act as a catalyst for tooling that can visualize the
>>>> byproduct of data and computation infrastructure.
>>>>
>>>> Superset has many key design elements that help fill a gap in current
>>>> solutions for organizations:
>>>> * Easy, low friction access to data through a simple, web-based data
>>>> exploration interface. Composing charts and dashboards is intuitive.
>>>> Eliminating the need to write code or SQL empowers anyone to use it.
>>>> * Access to a wide array of rich, interactive data visualization types.
>>>> * Enterprise-ready: Integration with different authentication mechanisms
>>>> and granular permissions centered around actions and data access.
>>>> * Realtime & fast: Superset provides realtime analytics at the speed of
>>>> thought on very large datasets when integrated with Druid.io.
>>>> * Broad data access: Consume data out of any SQL-speaking relational
>>>> database.
>>>> * Extensible: Can be extended to talk to many NoSQL databases like
>>>> Apache Drill, Elastic Search, and other popular database engines.
>>>> * Fast loading dashboards with configurable web-scale caching.
>>>> * Plug-in framework that enables organizations to build custom
>>>> analytical applications with new UI/UX interfaces.
>>>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
>>>> with more flexibility. SQL Lab integrates with the visualization engine
>>>> seamlessly.
>>>>
>>>> == Initial Goals ==
>>>> The initial goals of the Superset project are several-fold:
>>>> * Move the existing codebase to Apache and integrate with the Apache
>>>> development process.
>>>> * Redesign the user interface and interaction model for creating
>>>> visualizations/dashboards and connecting to data sources.
>>>> * Build robust support for security and governance of the tool,
>>>> including popular authorization modules (such as Apache Ranger and
>>>> Apache Sentry) and a more sophisticated permissions system.
>>>> * Grow the extensibility of the project both in terms of enhanced
>>>> connectivity to NoSQL-based data sources and creating a plug-in
>>>> framework that enables organizations to build custom analytical
>>>> applications which require a new UI/UX.
>>>>
>>>> == Current Status ==
>>>> By many standards, Superset is already a successful open source project.
>>>> As of March 2017, Superset is officially used in production at about a
>>>> dozen companies, and has received contributions from over one hundred
>>>> contributors on Github, along with 1500+ forks and 12k+ stars.
>>>>
>>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
>>>> significant contributions, and expressed their commitment to the
>>>> project. The product is feature complete and has been viable for months.
>>>> It already serves as the main interface for consuming data at many
>>>> companies of different sizes.
>>>>
>>>> While the product is usable, there’s room for improvement across the
>>>> board, starting with providing a smoother user experience around content
>>>> creation, making sure all features work out-of-the-box on more platforms
>>>> and databases, providing better user training guides and videos, having
>>>> a predictable release process, and increasing the overall quality of the
>>>> Superset releases.
>>>>
>>>> === Meritocracy ===
>>>> We plan to invest in supporting a meritocracy. We will discuss the
>>>> requirements in an open forum. Several companies have expressed interest
>>>> in this project, and we intend to invite additional developers to
>>>> participate. We will encourage and monitor community participation so
>>>> that privileges can be extended to those that contribute.
>>>>
>>>> === Community ===
>>>> The need for an enterprise-ready data visualization and exploration
>>>> platform in the open source community is tremendous. While Superset is
>>>> fairly well known, recognized and used within the Druid.io community,
>>>> adoption is currently limited outside of that niche. There is a huge
>>>> opportunity to grow the community to hundreds if not thousands of
>>>> organizations, and we are hoping that embracing “the Apache way” will
>>>> accelerate the growth of our community.
>>>>
>>>> We have already been active at seeking and inviting contributions, and
>>>> are planning to scale the project by investing time and growing the
>>>> support structure to grow the community.
>>>>
>>>> === Core Developers ===
>>>> The initial committers for Superset include experienced full stack,
>>>> front-end and data engineers:
>>>> * Maxime Beauchemin (Airbnb)
>>>> * Alanna Scott (Airbnb)
>>>> * Bogdan Kyryliuk (Airbnb)
>>>> * Vera Liu (Airbnb)
>>>> * Jeff Feng (Airbnb)
>>>> * Ashutosh Chauhan (Hortonworks)
>>>> * Nishant Bangarwa (Hortonworks)
>>>> * Slim Bouguerra (Hortonworks)
>>>> * Priyank Shah (Hortonworks)
>>>> * Sriharsha Chintalapani (Hortonworks)
>>>> * Daniel Dai (Hortonworks)
>>>>
>>>> We realize that additional employer diversity is needed, and we will
>>>> work aggressively to recruit developers from additional companies.
>>>>
>>>> === Alignment ===
>>>> The initial committers strongly believe that a system for interactive
>>>> visualization of data will gain broader adoption as an open source,
>>>> community driven project, where the community can contribute not only to
>>>> the core components, but also to a growing collection of connectors and
>>>> visualizations, and to improving integration with all potential data
>>>> sources. Superset already integrates closely with Apache Hive, the Hive
>>>> metastore, as well as most SQL-speaking databases found in modern data
>>>> ecosystems.
>>>>
>>>> == Known Risks ==
>>>>
>>>> === Orphaned Products ===
>>>> Superset is a vital component for visualizing, accessing and
>>>> democratizing data at Airbnb. Also at Hortonworks, Superset is a core
>>>> component of the DataFlow product offering. Thus, the risk of the
>>>> project being orphaned is relatively low. The project could be at risk
>>>> if Airbnb changes their approach for democratizing data or if
>>>> Hortonworks changes their strategy in the market. In such an event, the
>>>> committers plan to continue working on the project on their own time,
>>>> though the progress will likely be slower.
>>>> We plan to mitigate this risk by recruiting additional committers.
>>>>
>>>> === Inexperience with Open Source ===
>>>> The initial committers include veteran Apache members (committers and
>>>> PPMC members) and other developers who have varying degrees of
>>>> experience with open source projects. All have been involved with source
>>>> code that has been released under an open source license, and several
>>>> also have experience developing code with an open source development
>>>> process.
>>>>
>>>> === Homogenous Developers ===
>>>> The initial committers are employed by Airbnb Inc. and Hortonworks. We
>>>> are committed to recruiting additional committers from other companies.
>>>>
>>>> === Reliance on Salaried Developers ===
>>>> It is expected that Superset development will occur on both salaried
>>>> time and on volunteer time, after hours. The majority of initial
>>>> committers are paid by their employer to contribute to this project.
>>>> However, they are all passionate about the project, and we are confident
>>>> that the project will continue even if no salaried developers contribute
>>>> to the project. We are committed to recruiting additional committers
>>>> including non-salaried developers.
>>>>
>>>> === Relationships with Other Apache Products ===
>>>> To the knowledge of the Initial Committers, there are no direct
>>>> competitors to Superset within the Apache Software Foundation. That
>>>> said, Apache Zeppelin is an indirect competitor, but it solves a
>>>> different use case.
>>>>
>>>> Apache Zeppelin is a web-based notebook that enables interactive data
>>>> analytics. It enables the creation of beautiful data-driven, interactive
>>>> and collaborative documents with SQL, Scala and more.
>>>> Although a user can create data visualizations using this project, it
>>>> leverages a notebook-style user interface and it is geared towards the
>>>> Spark community where Scala and SQL co-exist.
>>>>
>>>> We look forward to collaborating with those communities, as well as
>>>> other Apache communities.
>>>>
>>>> === An Excessive Fascination with the Apache Brand ===
>>>> Superset is solving two huge challenges:
>>>> * The challenge of enabling every knowledge worker to make data informed
>>>> decisions, particularly those who are not deeply skilled at writing SQL.
>>>> * The challenge of visualizing huge amounts of data interactively and in
>>>> real-time.
>>>>
>>>> Superset was first developed as a data visualization solution for
>>>> Druid.io as a way to visualize billions of rows of data. Since then,
>>>> usage of Superset has expanded to address data visualization use cases
>>>> across SQL speaking data sources as well.
>>>>
>>>> Our rationale for developing Superset as an Apache project is detailed
>>>> in the Rationale Section. We believe that the Apache brand and community
>>>> process will help us attract more contributors to this project, and help
>>>> grow the footprint of the project through usage at other organizations
>>>> and within other applications. Establishing consensus among users and
>>>> developers will result in a more valuable tool for everyone.
>>>>
>>>> == Documentation ==
>>>> References to further reading material:
>>>> * [[http://airbnb.io/superset/|Superset Documentation]]
>>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>>>>
>>>> == Initial Source ==
>>>> The origin of the proposed code base can be found at
>>>> https://github.com/airbnb/superset. The code base is primarily in
>>>> Python.
>>>>
>>>> == Source and Intellectual Property Submission Plan ==
>>>> We do not expect any complications for the submission of the Superset
>>>> code base. Our code is already in Github and there is only a single code
>>>> base.
>>>>
>>>> == External Dependencies ==
>>>> List of Python packages, from the Python Package Index (PyPI):
>>>>
>>>> * boto3
>>>> * celery
>>>> * cryptography
>>>> * flask-appbuilder
>>>> * flask-cache
>>>> * flask-migrate
>>>> * flask-script
>>>> * flask-sqlalchemy
>>>> * flask-testing
>>>> * humanize
>>>> * gunicorn
>>>> * markdown
>>>> * pandas
>>>> * parsedatetime
>>>> * pydruid
>>>> * PyHive
>>>> * python-dateutil
>>>> * requests
>>>> * simplejson
>>>> * six
>>>> * sqlalchemy
>>>> * sqlalchemy-utils
>>>> * sqlparse
>>>> * thrift
>>>> * thrift-sasl
>>>> * werkzeug
>>>>
>>>> List of JavaScript packages, from npm:
>>>> * autobind-decorator
>>>> * bootstrap
>>>> * bootstrap-datepicker
>>>> * brace
>>>> * brfs
>>>> * cal-heatmap
>>>> * classnames
>>>> * d3
>>>> * d3-cloud
>>>> * d3-sankey
>>>> * d3-scale
>>>> * d3-tip
>>>> * datamaps
>>>> * datatables-bootstrap3-plugin
>>>> * datatables.net-bs
>>>> * font-awesome
>>>> * gridster
>>>> * immutability-helper
>>>> * immutable
>>>> * jquery
>>>> * lodash.throttle
>>>> * mapbox-gl
>>>> * moment
>>>> * moments
>>>> * mustache
>>>> * nvd3
>>>> * react
>>>> * react-ace
>>>> * react-bootstrap
>>>> * react-bootstrap-table
>>>> * react-dom
>>>> * react-draggable
>>>> * react-gravatar
>>>> * react-grid-layout
>>>> * react-map-gl
>>>> * react-redux
>>>> * react-resizable
>>>> * react-select
>>>> * react-syntax-highlighter
>>>> * reactable
>>>> * redux
>>>> * redux-localstorage
>>>> * redux-thunk
>>>> * shortid
>>>> * style-loader
>>>> * supercluster
>>>> * topojson
>>>> * victory
>>>> * viewport-mercator-project
>>>>
>>>> == Cryptography ==
>>>> The proposal does not include cryptographic code.
>>>>
>>>> == Required Resources ==
>>>>
>>>> === Mailing List ===
>>>> There is a current mailing list as a Google Group “airbnb_superset” that
>>>> we are planning on deprecating once the Apache.org lists are ready to
>>>> serve our community.
>>>>
>>>> * superset-private
>>>> * superset-dev
>>>> * superset-user
>>>>
>>>> === Subversion Directory ===
>>>> Git is the preferred source control system.
>>>> http://svn.apache.org/repos/asf/incubator/superset
>>>>
>>>> == Git Repository ==
>>>> Git is the preferred source control system; we’re assuming
>>>> https://github.com/apache/incubator-superset based on the naming scheme.
>>>>
>>>> == Issue Tracking ==
>>>> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to manage
>>>> our project as much as possible. It’s been said that there are ways to
>>>> keep Github’s issues in sync with Jira, allowing us to get the best of
>>>> both worlds. If that is not possible, we will comply and use Jira.
>>>>
>>>> == Other Resources ==
>>>> We currently use a set of Github integrated services that are free to
>>>> the open source community, like Travis-ci, Code Climate, Coveralls,
>>>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep
>>>> using these services as they allow us to scale contributions and
>>>> optimize our development flows. These services require some elevated
>>>> rights on the Github repository in order to set up or tune and we would
>>>> like for the committers to have the required rights.
>>>>
>>>>
>>>> == Initial Committers ==
>>>>
>>>> * Maxime Beauchemin - PPMC & Committer
>>>> * Alanna Scott - PPMC & Committer
>>>> * Bogdan Kyryliuk - PPMC & Committer
>>>> * Vera Liu - Committer
>>>> * Jeff Feng - PPMC & Committer
>>>> * Ashutosh Chauhan - Mentor & Committer
>>>> * Nishant Bangarwa - PPMC & Committer
>>>> * Slim Bouguerra - Committer
>>>> * Priyank Shah - Committer
>>>> * Harsha Chintalapani - Committer
>>>> * Daniel Dai - Champion & Committer
>>>> * Luke Han - Mentor
>>>>
>>>> == Affiliations ==
>>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>>>>
>>>> == Sponsors ==
>>>>
>>>> === Champion ===
>>>> Daniel Dai
>>>>
>>>> === Nominated Mentors ===
>>>> * Ashutosh Chauhan
>>>> * Luke Han
>>>>
>>>> === Sponsoring Entity ===
>>>> Incubator PMC
>>>>
>>>
>>>
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org