incubator-general mailing list archives

From Felix Cheung <felixche...@apache.org>
Subject Re: [VOTE] Superset Proposal for Apache Incubator
Date Thu, 27 Apr 2017 17:02:22 GMT
+1 (non-binding)

On Wed, Apr 26, 2017 at 11:13 PM Jeff Feng <jeff.feng@gmail.com> wrote:

> Hello everyone,
>
> Thank you for checking out our proposal on Superset and for your
> consideration for the Apache Incubator.  So far, I believe we have 8
> binding votes and 2 non-binding votes.
>
> As Taylor mentioned earlier, we made a minor update to the wording in the
> "Source and Intellectual Property Submission Plan" section based on a
> suggestion by John Ament.  The update was to help confirm the previously
> unstated assumption that we will submit an SGA.  I have copied the updated
> proposal from the wiki into the email below and highlighted the new
> sentence in yellow.
>
> Folks on the cc line who have already voted, please let us know if the
> change impacts your vote.
>
> Thank you all,
> Jeff
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users to explore data from a variety of databases, assemble beautiful
> dashboards and share their findings.  Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing-fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user.  This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving.  This means that business
> intelligence software must also be easy and delightful to use.
> Self-service analytics doesn’t mean that admin and governance features are
> not needed.  Modern BI tools provide fine-grained access controls and
> auditing capabilities to understand how data is being used.  Superset is a
> solution that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data.  Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies such as Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and IPython’s Jupyter.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history.  Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready.  Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong complement to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions.  Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
>  * Easy, low-friction access to data through a simple, web-based data
> exploration interface.  Composing charts and dashboards is intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
>  * Access to a wide array of rich, interactive data visualization types.
>  * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
>  * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
>  * Broad data access: Consume data out of any SQL-speaking relational
> database.
>  * Extensible: Can be extended to talk to many NoSQL databases like Apache
> Drill, Elasticsearch, and other popular database engines.
>  * Fast loading dashboards with configurable web-scale caching.
>  * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
>  * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility.  SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
>  * Move the existing codebase to Apache and integrate with the Apache
> development process.
>  * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources
>  * Build robust support for security and governance of the tool including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system
>  * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX
>
> == Current Status ==
> By many standards, Superset is already a successful open source project. As
> of March 2017, Superset is officially used in production at about a dozen
> companies, has received contributions from over one hundred contributors on
> GitHub, and has 1500+ forks and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> significant contributions, and expressed their commitment to the project.
> The product is feature complete and has been viable for months. It already
> serves as the main interface for consuming data at many companies of
> different sizes.
>
> While the product is usable, there’s room for improvement across the board,
> starting with providing a smoother user experience around content creation,
> making sure all features work out-of-the-box on more platforms and
> databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those that contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous.  While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active in seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full stack,
> front-end and data engineers:
>  * Maxime Beauchemin (Airbnb)
>  * Alanna Scott (Airbnb)
>  * Bogdan Kyryliuk (Airbnb)
>  * Vera Liu (Airbnb)
>  * Jeff Feng (Airbnb)
>  * Ashutosh Chauhan (Hortonworks)
>  * Nishant Bangarwa (Hortonworks)
>  * Slim Bouguerra (Hortonworks)
>  * Priyank Shah (Hortonworks)
>  * Sriharsha Chintalapani (Hortonworks)
>  * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors,
> visualizations and improved integration with all potential data sources.
> Superset already integrates closely with Apache Hive, the Hive metastore,
> as well as most SQL-speaking databases found in modern data ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for visualizing, accessing and
> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
> component of the DataFlow product offering.  Thus, the risk of the project
> being orphaned is relatively low.  The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market.  In such an event, the committers plan to
> continue working on the project on their own time, though the progress
> will likely be slower.  We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct competitors
> to Superset within the Apache Software Foundation.  That said, Apache
> Zeppelin is an indirect competitor, but it solves a different use case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more.  Although a user can
> create data visualizations using this project, it leverages a
> notebook-style user interface and is geared towards the Spark community,
> where Scala and SQL co-exist.
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
>  * The challenge of enabling every knowledge worker to make data-informed
> decisions, particularly those who are not deeply skilled at writing SQL.
>  * The challenge of visualizing huge amounts of data interactively and in
> real time.
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data.  Since then, usage of
> Superset has expanded to address data visualization use cases across
> SQL-speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale Section.  We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications.  Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
>  * [[http://airbnb.io/superset/|Superset Documentation]]
>  * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>  * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset.  The code base is primarily in Python.
>
> == Source and Intellectual Property Submission Plan ==
> Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
> incubator. We do not expect any complications for the submission of the
> Superset code base.  Our code is already on GitHub and there is only a
> single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (PyPI):
>
>  * boto3
>  * celery
>  * cryptography
>  * flask-appbuilder
>  * flask-cache
>  * flask-migrate
>  * flask-script
>  * flask-sqlalchemy
>  * flask-testing
>  * humanize
>  * gunicorn
>  * markdown
>  * pandas
>  * parsedatetime
>  * pydruid
>  * PyHive
>  * python-dateutil
>  * requests
>  * simplejson
>  * six
>  * sqlalchemy
>  * sqlalchemy-utils
>  * sqlparse
>  * thrift
>  * thrift-sasl
>  * werkzeug
>
> List of JavaScript packages, from npm:
>  * autobind-decorator
>  * bootstrap
>  * bootstrap-datepicker
>  * brace
>  * brfs
>  * cal-heatmap
>  * classnames
>  * d3
>  * d3-cloud
>  * d3-sankey
>  * d3-scale
>  * d3-tip
>  * datamaps
>  * datatables-bootstrap3-plugin
>  * datatables.net-bs
>  * font-awesome
>  * gridster
>  * immutability-helper
>  * immutable
>  * jquery
>  * lodash.throttle
>  * mapbox-gl
>  * moment
>  * moments
>  * mustache
>  * nvd3
>  * react
>  * react-ace
>  * react-bootstrap
>  * react-bootstrap-table
>  * react-dom
>  * react-draggable
>  * react-gravatar
>  * react-grid-layout
>  * react-map-gl
>  * react-redux
>  * react-resizable
>  * react-select
>  * react-syntax-highlighter
>  * reactable
>  * redux
>  * redux-localstorage
>  * redux-thunk
>  * shortid
>  * style-loader
>  * supercluster
>  * topojson
>  * victory
>  * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is a current mailing list as a Google Group “airbnb_superset” that we
> plan to deprecate once the Apache.org lists are ready to serve our
> community.
>
>  * superset-private
>  * superset-dev
>  * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system; we’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme.
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). We’d like to use GitHub issues & PRs to manage
> our project as much as possible. It’s been said that there are ways to
> keep GitHub’s issues in sync with JIRA, allowing us to get the best of
> both worlds. If that is not possible, we will use JIRA.
>
> == Other Resources ==
> We currently use a set of GitHub-integrated services that are free to the
> open source community, like Travis CI, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> GitHub repository in order to set up or tune, and we would like for the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
>  * Maxime Beauchemin <maxime.beauchemin@airbnb.com> - PPMC & Committer
>  * Alanna Scott <alanna.scott@airbnb.com> - PPMC & Committer
>  * Bogdan Kyryliuk <b.kyryliuk@gmail.com> - PPMC & Committer
>  * Vera Liu <vera.liu@airbnb.com> - Committer
>  * Jeff Feng <jeff.feng@airbnb.com> - PPMC & Committer
>  * Ashutosh Chauhan <hashutosh@apache.org> - Mentor & Committer
>  * Nishant Bangarwa <nbangarwa@hortonworks.com> - PPMC & Committer
>  * Slim Bouguerra <sbouguerra@hortonworks.com> - Committer
>  * Priyank Shah <pshah@hortonworks.com> - Committer
>  * Harsha Chintalapani <schintalapani@hortonworks.com> - Committer
>  * Daniel Dai <daijy@apache.org> - Champion & Committer
>  * Luke Han <luke.han@apache.org> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <daijy@apache.org>
>
> === Nominated Mentors ===
>  * Ashutosh Chauhan <hashutosh@apache.org>
>  * Luke Han <luke.han@apache.org>
>
> === Sponsoring Entity ===
> Incubator PMC
>
>
>
>
>
> On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <edwardyoon@apache.org>
> wrote:
>
> > +1 binding
> >
> > On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
> > <naresh.agarwal@gmail.com> wrote:
> > > +1 (non-binding).
> > >
> > > Thanks
> > > Naresh Agarwal
> > >
> > > On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <ted.dunning@gmail.com>
> > wrote:
> > >
> > >> +1 (binding)
> > >>
> > >>
> > >>
> > >> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <joe.witt@gmail.com> wrote:
> > >>
> > >> > +1 (binding)
> > >> >
> > >> > On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> > >> > <jitendra@hortonworks.com> wrote:
> > >> > > +1 (binding)
> > >> > >
> > >> > > On 4/25/17, 1:27 PM, "Julian Hyde" <jhyde@apache.org> wrote:
> > >> > >
> > >> > >     +1 binding
> > >> > >
> > >> > >     > On Apr 25, 2017, at 12:48 PM, moon soo Lee <moon@apache.org> wrote:
> > >> > >     >
> > >> > >     > +1 (non-binding)
> > >> > >     >
> > >> > >     > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <hashutosh@apache.org>
> > >> > >     > wrote:
> > >> > >     >
> > >> > >     >> +1 (binding)
> > >> > >     >>
> > >> > >     >> Thanks,
> > >> > >     >> Ashutosh
> > >> > >     >>
> > >> > >     >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <luke.hq@gmail.com> wrote:
> > >> > >     >>
> > >> > >     >>> +1 binding
> > >> > >     >>>
> > >> > >     >>> Love to see Superset become a new incubator project.
> > >> > >     >>>
> > >> > >     >>>
> > >> > >     >>> Best Regards!
> > >> > >     >>> ---------------------
> > >> > >     >>>
> > >> > >     >>> Luke Han
> > >> > >     >>>
> > >> > >     >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <jeff.feng@gmail.com> wrote:
> > >> > >     >>>
> > >> > >     >>>> Dear Apache Incubator Community,
> > >> > >     >>>>
> > >> > >     >>>> We have updated the Superset proposal
> > >> > >     >>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
> > >> > >     >>>> Apache Incubation with an additional mentor (Luke Han -
> > >> > >     >>>> luke.han@apache.org), and would like to start a vote thread for
> > >> > >     >>>> acceptance into the incubator.
> > >> > >     >>>>
> > >> > >     >>>> Our team is excited to share Superset with the Apache community and we
> > >> > >     >>>> hope for your continued support!
> > >> > >     >>>>
> > >> > >     >>>> Cheers,
> > >> > >     >>>> Jeff & the Superset Team
> > >> > >     >>>>
> > >> > >     >>>>
> > >> > >     >>>>
> > >> > >     >>>>
> > >> > >     >>>> = Superset =
> > >> > >     >>>>
> > >> > >     >>>> == Abstract ==
> > >> > >     >>>> Superset is an enterprise-ready web application for data
> > >> > exploration,
> > >> > >     >> data
> > >> > >     >>>> visualization and dashboarding.
> > >> > >     >>>>
> > >> > >     >>>> == Proposal ==
> > >> > >     >>>> Superset is business intelligence (BI) software that
> helps
> > >> > modern
> > >> > >     >>>> organizations visualize and interact with their data.
> > Superset
> > >> > enables
> > >> > >     >>>> users explore data from a variety of databases, assemble
> > >> > beautiful
> > >> > >     >>>> dashboards and share their findings.  Superset works
> neatly
> > >> > with all
> > >> > >     >>>> modern
> > >> > >     >>>> SQL-speaking databases, and integrates with Druid.io to
> > >> provide
> > >> > >     >> real-time,
> > >> > >     >>>> interactive, blazing fast data access to large datasets.
> > >> > >     >>>>
> > >> > >     >>>> == Background ==
> > >> > >     >>>> Data is mission critical. To succeed in this era,
> > >> organizations
> > >> > need to
> > >> > >     >>>> provide low-friction, intuitive and interactive access to
> > >> data.
> > >> > It is
> > >> > >     >>>> paramount for knowledge workers to be capable of
> answering
> > >> > their own
> > >> > >     >>>> questions by querying, exploring and visualizing data.
> > >> > >     >>>>
> > >> > >     >>>> The entire business intelligence industry has pivoted
> from
> > a
> > >> > model of
> > >> > >     >>>> centralized top-down platforms driven by IT organizations
> > to
> > >> > >     >> self-service
> > >> > >     >>>> analytics and agile workflows by any user.  This shift
> > >> unblocks
> > >> > >     >>>> centralized
> > >> > >     >>>> service bottlenecks for creating data visualizations
> while
> > >> also
> > >> > creating
> > >> > >     >>>> an
> > >> > >     >>>> environment that is iterative and fast-moving.  This
> means
> > >> that
> > >> > business
> > >> > >     >>>> intelligence software must also be easy and delightful to
> > use.
> > >> > >     >>>> Self-service analytics doesn’t mean that admin and
> > governance
> > >> > features
> > >> > >     >> are
> > >> > >     >>>> not needed.
> > >> > >     >>>> Modern BI tools provide fine-grain access controls and
> > >> auditing
> > >> > >     >>>> capabilities to understand how data is being used.
> > Superset
> > >> is
> > >> > a
> > >> > >     >> solution
> > >> > >     >>>> that delivers on all of these vectors.
> > >> > >     >>>>
> > >> > >     >>>> The technology stack is also constantly morphing -
> vendors
> > are
> > >> > >     >> struggling
> > >> > >     >>>> to provide cheap, quick and easy solutions to access
> data.
> > >> > Business
> > >> > >     >>>> intelligence users are finding existing solutions lacking
> > as
> > >> > these
> > >> > >     >>>> software
> > >> > >     >>>> products either disregard or react slowly to recent
> > >> > game-changing
> > >> > >     >>>> technologies like Druid.io, PrestoDB, Apache Drill,
> Apache
> > >> > Kylin, d3.js,
> > >> > >     >>>> React.js and iPython’s Jupyter for instance.
> > >> > >     >>>>
> > >> > >     >>>> == Rationale ==
> > >> > >     >>>> Business intelligence is more relevant today than at any
> > other
> > >> > point in
> > >> > >     >>>> history.  Organizations are currently very limited in
> > options
> > >> > for open
> > >> > >     >>>> source data visualization solutions, especially solutions
> > that
> > >> > are both
> > >> > >     >>>> self-service and enterprise-ready.  Every company
> informing
> > >> > their
> > >> > >     >>>> decisions
> > >> > >     >>>> with data needs a BI tool.
> > >> > >     >>>>
> > >> > >     >>>> We believe that Superset will be a strong compliment to
> > >> > existing Apache
> > >> > >     >>>> Software Foundation technologies by offering scalable
> user
> > >> > interactions
> > >> > >     >> to
> > >> > >     >>>> distributed storage and computation solutions.  Users
> will
> > >> > often find
> > >> > >     >> that
> > >> > >     >>>> Superset can act as a catalyst for tooling that can
> > visualize
> > >> > the
> > >> > >     >>>> byproduct
> > >> > >     >>>> of data and computation infrastructure.
> > >> > >     >>>>
> > >> > >     >>>> Superset has many key design elements that help fill a
> gap
> > in
> > >> > current
> > >> > >     >>>> solutions for organizations:
> > >> > >     >>>> * Easy, low friction access to data through a simple,
> > >> web-based
> > >> > data
> > >> > >     >>>> exploration interface.  Composing charts and dashboards
> are
> > >> > intuitive.
> > >> > >     >>>> Eliminating the need to write code or SQL empowers anyone
> > to
> > >> > use it.
> > >> > >     >>>> * Access to a wide array of rich, interactive data
> > >> > visualization types.
> > >> > >     >>>> * Enterprise-ready: Integration with different
> > authentication
> > >> > >     >> mechanisms
> > >> > >     >>>> and granular permissions centered around actions and data
> > >> > access.
> > >> > >     >>>> * Realtime & fast: Superset provides realtime analytics
> at
> > the
> > >> > speed of
> > >> > >     >>>> thought on very large datasets when integrated with
> > Druid.io.
> > >> > >     >>>> * Broad data access: Consume data out of any SQL-speaking
> > >> > relational
> > >> > >     >>>> database.
> > >> > >     >>>> * Extensible: Can be extended to talk to many noSQL
> > databases
> > >> > like
> > >> > >     >> Apache
> > >> > >     >>>> Drill, Elastic Search, and other popular database
> engines.
> > >> > >     >>>> * Fast loading dashboards with configurable web-scale
> > caching.
> > >> > >     >>>> * Plug-in framework that enables organizations to build
> > custom
> > >> > >     >> analytical
> > >> > >     >>>> applications with new UI/UX interfaces.
> > >> > >     >>>> * SQL Lab, a state-of-the-art SQL IDE that empowers
> > >> > SQL-speaking users
> > >> > >     >>>> with more flexibility.  SQL Lab integrates with the
> > >> > visualization engine
> > >> > >     >>>> seamlessly.
> > >> > >     >>>>
> > >> > >     >>>> == Initial Goals ==
> > >> > >     >>>> The initial goals of the Superset project are
> several-fold:
> > >> > >     >>>> * Move the existing codebase to Apache and integrate with
> > the
> > >> > Apache
> > >> > >     >>>> development process.
> > >> > >     >>>> * Redesign the user interface and interaction model for
> > >> creating
> > >> > >     >>>> visualizations/dashboards and connecting to data sources
> > >> > >     >>>> * Build robust support for security and governance of the
> > tool
> > >> > >     >> including
> > >> > >     >>>> popular authorization modules (including Apache Ranger
> and
> > >> > Apache
> > >> > >     >> Sentry)
> > >> > >     >>>> and a more sophisticated permissions system
> > >> > >     >>>> * Grow the extensibility of the project both in terms of
> > >> > enhanced
> > >> > >     >>>> connectivity to NoSQL-based data sources and creating a
> > >> plug-in
> > >> > >     >> framework
> > >> > >     >>>> that enables organizations to build custom analytical
> > >> > applications which
> > >> > >     >>>> require a new UI/UX
> > >> > >     >>>>
> > >> > >     >>>> == Current Status ==
> > >> > >     >>>> By many standards, Superset is already a successful open
> > >> source
> > >> > project.
> > >> > >     >>>> As
> > >> > >     >>>> of March 2017, Superset is officially used in production
> at
> > >> > about a
> > >> > >     >> dozen
> > >> > >     >>>> companies, has received contributions from over one
> hundred
> > >> > contributors
> > >> > >     >>>> on
> > >> > >     >>>> Github, 1500+ forks, and 12k+ stars.
> > >> > >     >>>>
> > >> > >     >>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks
> have
> > >> made
> > >> > >     >>>> significant contributions, and expressed their commitment
> > to
> > >> the
> > >> > >     >> project.
> > >> > >     >>>> The product is feature complete and has been viable for
> > >> months.
> > >> > It
> > >> > >     >> already
> > >> > >     >>>> serves as the main interface for consuming data at many
> > >> > companies of
> > >> > >     >>>> different sizes.
> > >> > >     >>>>
> > >> > >     >>>> While the product is usable, there’s room for improvement
> > >> > across the
> > >> > >     >>>> board,
> > >> > >     >>>> starting with providing a smoother user experience around
> > >> > content
> > >> > >     >>>> creation,
> > >> > >     >>>> making sure all features work out-of-the-box on more
> > platforms
> > >> > and
> > >> > >     >>>> databases, providing better user training guides and
> > videos,
> > >> > having a
> > >> > >     >>>> predictable release process, and increasing the overall
> > >> quality
> > >> > of the
> > >> > >     >>>> Superset releases.
> > >> > >     >>>>
> > >> > >     >>>> === Meritocracy ===
> > >> > >     >>>> We plan to invest in supporting a meritocracy. We will
> > discuss
> > >> > the
> > >> > >     >>>> requirements in an open forum. Several companies have
> > >> expressed
> > >> > interest
> > >> > >     >>>> in
> > >> > >     >>>> this project, and we intend to invite additional
> > developers to
> > >> > >     >>>> participate.
> > >> > >     >>>> We will encourage and monitor community participation so
> > that
> > >> > privileges
> > >> > >     >>>> can be extended to those that contribute.
> > >> > >     >>>>
> > >> > >     >>>> === Community ===
> > >> > >     >>>> The need for an enterprise-ready data visualization and
> > >> > exploration
> > >> > >     >>>> platform in the open source community is tremendous.
> While
> > >> > Superset is
> > >> > >     >>>> fairly well known, recognized and used within the
> Druid.io
> > >> > community,
> > >> > >     >>>> adoption is currently limited outside of that niche.
> There
> > is
> > >> a
> > >> > huge
> > >> > >     >>>> opportunity to grow the community to hundreds if not
> > thousands
> > >> > of
> > >> > >     >>>> organizations, and we are hoping that embracing “the
> Apache
> > >> > way” will
> > >> > >     >>>> accelerate the growth of our community.
> > >> > >     >>>>
> > >> > >     >>>> We have already been active at seeking and inviting
> > >> > contributions, and
> > >> > >     >> are
> > >> > >     >>>> planning to scale the project by investing time and
> growing
> > >> the
> > >> > support
> > >> > >     >>>> structure to grow the community.
> > >> > >     >>>>
> > >> > >     >>>> === Core Developers ===
> > >> > >     >>>> The initial committers for Superset include experienced
> > full
> > >> > stack,
> > >> > >     >>>> front-end and data engineers:
> > >> > >     >>>> * Maxime Beauchemin (Airbnb)
> > >> > >     >>>> * Alanna Scott (Airbnb)
> > >> > >     >>>> * Bogdan Kyryliuk (Airbnb)
> > >> > >     >>>> * Vera Liu  (Airbnb)
> > >> > >     >>>> * Jeff Feng (Airbnb)
> > >> > >     >>>> * Ashutosh Chauhan (Hortonworks)
> > >> > >     >>>> * Nishant Bangarwa (Hortonworks)
> > >> > >     >>>> * Slim Bouguerra (Hortonworks)
> > >> > >     >>>> * Priyank Shah (Hortonworks)
> > >> > >     >>>> * Sriharsha Chintalapani (Hortonworks)
> > >> > >     >>>> * Daniel Dai (Hortonworks)
> > >> > >     >>>>
> > >> > >     >>>> We realize that additional employer diversity is needed,
> > and
> > >> we
> > >> > will
> > >> > >     >> work
> > >> > >     >>>> aggressively to recruit developers from additional
> > companies.
> > >> > >     >>>>
> > >> > >     >>>> === Alignment ===
> > >> > >     >>>> The initial committers strongly believe that a system for
> > >> > interactive
> > >> > >     >>>> visualization of data will gain broader adoption as an
> open
> > >> > source,
> > >> > >     >>>> community driven project, where the community can
> > contribute
> > >> > not only to
> > >> > >     >>>> the core components, but also to a growing collection of
> > >> > connectors,
> > >> > >     >>>> visualizations and improving integration with all potential
> > data
> > >> > sources.
> > >> > >     >>>> Superset already integrates closely with Apache Hive, the
> > Hive
> > >> > >     >> metastore,
> > >> > >     >>>> as well as most SQL-speaking databases found in modern
> data
> > >> > ecosystems.
> > >> > >     >>>>
> > >> > >     >>>> == Known Risks ==
> > >> > >     >>>>
> > >> > >     >>>> === Orphaned Products ===
> > >> > >     >>>> Superset is a vital component for visualizing,
> > accessing
> > >> > and
> > >> > >     >>>> democratizing data at Airbnb.  Also at Hortonworks,
> > Superset
> > >> is
> > >> > a core
> > >> > >     >>>> component of the DataFlow product offering.  Thus, the
> > risk of
> > >> > the
> > >> > >     >> project
> > >> > >     >>>> being orphaned is relatively low.  The project could be
> at
> > >> risk
> > >> > if
> > >> > >     >> Airbnb
> > >> > >     >>>> changes their approach for democratizing data or if
> > >> Hortonworks
> > >> > changes
> > >> > >     >>>> their strategy in the market.  In such an event, the
> > >> committers
> > >> > plan to
> > >> > >     >>>> continue working on the project on their own time,
> though
> > the
> > >> > progress
> > >> > >     >>>> will likely be slower.  We plan to mitigate this risk by
> > >> > recruiting
> > >> > >     >>>> additional committers.
> > >> > >     >>>>
> > >> > >     >>>> === Inexperience with Open Source ===
> > >> > >     >>>> The initial committers include veteran Apache members
> > >> > (committers and
> > >> > >     >> PPMC
> > >> > >     >>>> members) and other developers who have varying degrees of
> > >> > experience
> > >> > >     >> with
> > >> > >     >>>> open source projects. All have been involved with source
> > code
> > >> > that has
> > >> > >     >>>> been
> > >> > >     >>>> released under an open source license, and several also
> > have
> > >> > experience
> > >> > >     >>>> developing code with an open source development process.
> > >> > >     >>>>
> > >> > >     >>>> === Homogenous Developers ===
> > >> > >     >>>> The initial committers are employed by Airbnb Inc. and
> > >> > Hortonworks. We
> > >> > >     >> are
> > >> > >     >>>> committed to recruiting additional committers from other
> > >> > companies.
> > >> > >     >>>>
> > >> > >     >>>> === Reliance on Salaried Developers ===
> > >> > >     >>>> It is expected that Superset development will occur on
> both
> > >> > salaried
> > >> > >     >> time
> > >> > >     >>>> and on volunteer time, after hours. The majority of
> initial
> > >> > committers
> > >> > >     >> are
> > >> > >     >>>> paid by their employer to contribute to this project.
> > However,
> > >> > they are
> > >> > >     >>>> all
> > >> > >     >>>> passionate about the project, and we are confident that
> the
> > >> > project will
> > >> > >     >>>> continue even if no salaried developers contribute to the
> > >> > project. We
> > >> > >     >> are
> > >> > >     >>>> committed to recruiting additional committers including
> > >> > non-salaried
> > >> > >     >>>> developers.
> > >> > >     >>>>
> > >> > >     >>>> === Relationships with Other Apache Products ===
> > >> > >     >>>> To the knowledge of the Initial Committers, there are no
> > >> direct
> > >> > >     >>>> competitors
> > >> > >     >>>> to Superset within the Apache Software Foundation.  That
> > said,
> > >> > Apache
> > >> > >     >>>> Zeppelin is an indirect competitor, but it solves a
> > different
> > >> > use case.
> > >> > >     >>>>
> > >> > >     >>>> Apache Zeppelin is a web-based notebook that enables
> > >> > interactive data
> > >> > >     >>>> analytics. It enables the creation of beautiful
> > data-driven,
> > >> > interactive
> > >> > >     >>>> and collaborative documents with SQL, Scala and more.
> > >> Although
> > >> > a user
> > >> > >     >> can
> > >> > >     >>>> create data visualizations using this project, it
> > leverages a
> > >> > notebook
> > >> > >     >>>> style user interface and it is geared towards the Spark
> > >> > community where
> > >> > >     >>>> Scala and SQL co-exist.
> > >> > >     >>>>
> > >> > >     >>>> We look forward to collaborating with those communities,
> as
> > >> > well as
> > >> > >     >> other
> > >> > >     >>>> Apache communities.
> > >> > >     >>>>
> > >> > >     >>>> === An Excessive Fascination with the Apache Brand ===
> > >> > >     >>>> Superset is solving two huge challenges:
> > >> > >     >>>> * The challenge of enabling every knowledge worker to make
> > data
> > >> > informed
> > >> > >     >>>> decisions, particularly those who are not deeply skilled
> at
> > >> > writing SQL.
> > >> > >     >>>> * The challenge of visualizing huge amounts of data
> > >> interactively
> > >> > and in
> > >> > >     >>>> real-time.
> > >> > >     >>>>
> > >> > >     >>>> Superset was first developed as a data visualization
> > solution
> > >> > for
> > >> > >     >> Druid.io
> > >> > >     >>>> as a way to visualize billions of rows of data.  Since
> > then,
> > >> > usage of
> > >> > >     >>>> Superset has expanded to address data visualization use
> > cases
> > >> > across SQL
> > >> > >     >>>> speaking data sources as well.
> > >> > >     >>>>
> > >> > >     >>>> Our rationale for developing Superset as an Apache
> project
> > is
> > >> > detailed
> > >> > >     >> in
> > >> > >     >>>> the Rationale Section.  We believe that the Apache brand
> > and
> > >> > community
> > >> > >     >>>> process will help us attract more contributors to this
> > >> project,
> > >> > and help
> > >> > >     >>>> grow the footprint of the project through usage at other
> > >> > organizations
> > >> > >     >> and
> > >> > >     >>>> within other applications.  Establishing consensus among
> > users
> > >> > and
> > >> > >     >>>> developers will result in a more valuable tool for
> > everyone.
> > >> > >     >>>>
> > >> > >     >>>> == Documentation ==
> > >> > >     >>>> References to further reading material:
> > >> > >     >>>> * [[http://airbnb.io/superset/|Superset Documentation]]
> > >> > >     >>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
> > >> > >     >>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
> > >> > >     >>>>
> > >> > >     >>>> == Initial Source ==
> > >> > >     >>>> The origin of the proposed code base can be found at
> > >> > >     >>>> https://github.com/airbnb/superset.  The code base is
> > >> > primarily in
> > >> > >     >>>> Python.
> > >> > >     >>>>
> > >> > >     >>>> == Source and Intellectual Property Submission Plan ==
> > >> > >     >>>> We do not expect any complications for the submission of
> > the
> > >> > Superset
> > >> > >     >> code
> > >> > >     >>>> base.  Our code is already in Github and there is only a
> > >> single
> > >> > code
> > >> > >     >> base.
> > >> > >     >>>>
> > >> > >     >>>> == External Dependencies ==
> > >> > >     >>>> List of Python packages, from the Python Package Index
> > (Pypi):
> > >> > >     >>>>
> > >> > >     >>>> * boto3
> > >> > >     >>>> * celery
> > >> > >     >>>> * cryptography
> > >> > >     >>>> * flask-appbuilder
> > >> > >     >>>> * flask-cache
> > >> > >     >>>> * flask-migrate
> > >> > >     >>>> * flask-script
> > >> > >     >>>> * flask-sqlalchemy
> > >> > >     >>>> * flask-testing
> > >> > >     >>>> * humanize
> > >> > >     >>>> * gunicorn
> > >> > >     >>>> * markdown
> > >> > >     >>>> * pandas
> > >> > >     >>>> * parsedatetime
> > >> > >     >>>> * pydruid
> > >> > >     >>>> * PyHive
> > >> > >     >>>> * python-dateutil
> > >> > >     >>>> * requests
> > >> > >     >>>> * simplejson
> > >> > >     >>>> * six
> > >> > >     >>>> * sqlalchemy
> > >> > >     >>>> * sqlalchemy-utils
> > >> > >     >>>> * sqlparse
> > >> > >     >>>> * thrift
> > >> > >     >>>> * thrift-sasl
> > >> > >     >>>> * werkzeug
> > >> > >     >>>>
> > >> > >     >>>> List of Javascript packages, from NPM:
> > >> > >     >>>> * autobind-decorator
> > >> > >     >>>> * bootstrap
> > >> > >     >>>> * bootstrap-datepicker
> > >> > >     >>>> * brace
> > >> > >     >>>> * brfs
> > >> > >     >>>> * cal-heatmap
> > >> > >     >>>> * classnames
> > >> > >     >>>> * d3
> > >> > >     >>>> * d3-cloud
> > >> > >     >>>> * d3-sankey
> > >> > >     >>>> * d3-scale
> > >> > >     >>>> * d3-tip
> > >> > >     >>>> * datamaps
> > >> > >     >>>> * datatables-bootstrap3-plugin
> > >> > >     >>>> * datatables.net-bs
> > >> > >     >>>> * font-awesome
> > >> > >     >>>> * gridster
> > >> > >     >>>> * immutability-helper
> > >> > >     >>>> * immutable
> > >> > >     >>>> * jquery
> > >> > >     >>>> * lodash.throttle
> > >> > >     >>>> * mapbox-gl
> > >> > >     >>>> * moment
> > >> > >     >>>> * moments
> > >> > >     >>>> * mustache
> > >> > >     >>>> * nvd3
> > >> > >     >>>> * react
> > >> > >     >>>> * react-ace
> > >> > >     >>>> * react-bootstrap
> > >> > >     >>>> * react-bootstrap-table
> > >> > >     >>>> * react-dom
> > >> > >     >>>> * react-draggable
> > >> > >     >>>> * react-gravatar
> > >> > >     >>>> * react-grid-layout
> > >> > >     >>>> * react-map-gl
> > >> > >     >>>> * react-redux
> > >> > >     >>>> * react-resizable
> > >> > >     >>>> * react-select
> > >> > >     >>>> * react-syntax-highlighter
> > >> > >     >>>> * reactable
> > >> > >     >>>> * redux
> > >> > >     >>>> * redux-localstorage
> > >> > >     >>>> * redux-thunk
> > >> > >     >>>> * shortid
> > >> > >     >>>> * style-loader
> > >> > >     >>>> * supercluster
> > >> > >     >>>> * topojson
> > >> > >     >>>> * victory
> > >> > >     >>>> * viewport-mercator-project
> > >> > >     >>>>
> > >> > >     >>>> == Cryptography ==
> > >> > >     >>>> The proposal does not include cryptographic code.
> > >> > >     >>>>
> > >> > >     >>>> == Required Resources ==
> > >> > >     >>>>
> > >> > >     >>>> === Mailing List ===
> > >> > >     >>>> There is a current mailing list as a Google Group
> > >> > “airbnb_superset” that
> > >> > >     >>>> we
> > >> > >     >>>> are planning on deprecating as the Apache.org lists become
> ready
> > to
> > >> > serve our
> > >> > >     >>>> community.
> > >> > >     >>>>
> > >> > >     >>>> * superset-private
> > >> > >     >>>> * superset-dev
> > >> > >     >>>> * superset-user
> > >> > >     >>>>
> > >> > >     >>>> === Subversion Directory ===
> > >> > >     >>>> Git is the preferred source control system.
> > >> > >     >>>> http://svn.apache.org/repos/asf/incubator/superset
> > >> > >     >>>>
> > >> > >     >>>> == Git Repository ==
> > >> > >     >>>> Git is the preferred source control system; we’re
> assuming
> > >> > >     >>>> https://github.com/apache/incubator-superset based on
> the
> > >> > naming scheme
> > >> > >     >>>>
> > >> > >     >>>> == Issue Tracking ==
> > >> > >     >>>> JIRA Superset (SUPERSET). If possible, we’d like to use
> > Github
> > >> > issues &
> > >> > >     >>>> PRs
> > >> > >     >>>> to manage our project as much as possible. It’s been said
> > that
> > >> > there are
> > >> > >     >>>> ways to keep Github’s issues in sync with Jira, allowing
> > us to
> > >> > get best
> > >> > >     >> of
> > >> > >     >>>> both worlds. If that is not possible, we will comply by
> > using
> > >> > Jira.
> > >> > >     >>>>
> > >> > >     >>>> == Other Resources ==
> > >> > >     >>>> We currently use a set of Github integrated services that
> > are
> > >> > free to
> > >> > >     >> the
> > >> > >     >>>> open source community, like Travis-ci, Code Climate,
> > >> Coveralls,
> > >> > >     >>>> Landscape.io, Requires.io, david-dm and Gitter. We would
> > like
> > >> > to keep
> > >> > >     >>>> using
> > >> > >     >>>> these services as they allow us to scale contributions
> and
> > >> > optimize our
> > >> > >     >>>> development flows. These services require some elevated
> > rights
> > >> > on the
> > >> > >     >>>> Github repository in order to set up or tune and we would
> > like
> > >> > for the
> > >> > >     >>>> committers to have the required rights.
> > >> > >     >>>>
> > >> > >     >>>>
> > >> > >     >>>> == Initial Committers ==
> > >> > >     >>>>
> > >> > >     >>>> * Maxime Beauchemin <maxime.beauchemin@airbnb.com> -
> PPMC
> > &
> > >> > Committer
> > >> > >     >>>> * Alanna Scott <alanna.scott@airbnb.com> - PPMC &
> > Committer
> > >> > >     >>>> * Bogdan Kyryliuk <b.kyryliuk@gmail.com> - PPMC &
> > Committer
> > >> > >     >>>> * Vera Liu <vera.liu@airbnb.com> - Committer
> > >> > >     >>>> * Jeff Feng <jeff.feng@airbnb.com> - PPMC & Committer
> > >> > >     >>>> * Ashutosh Chauhan <hashutosh@apache.org> - Mentor &
> > >> Committer
> > >> > >     >>>> * Nishant Bangarwa <nbangarwa@hortonworks.com> - PPMC &
> > >> > Committer
> > >> > >     >>>> * Slim Bouguerra <sbouguerra@hortonworks.com> -
> Committer
> > >> > >     >>>> * Priyank Shah <pshah@hortonworks.com> - Committer
> > >> > >     >>>> * Harsha Chintalapani <schintalapani@hortonworks.com> -
> > >> > Committer
> > >> > >     >>>> * Daniel Dai <daijy@apache.org> - Champion & Committer
> > >> > >     >>>> * Luke Han <luke.han@apache.org> - Mentor
> > >> > >     >>>>
> > >> > >     >>>> == Affiliations ==
> > >> > >     >>>> The initial committers are employees of Airbnb Inc. and
> > >> > Hortonworks.
> > >> > >     >>>>
> > >> > >     >>>> == Sponsors ==
> > >> > >     >>>>
> > >> > >     >>>> === Champion ===
> > >> > >     >>>> Daniel Dai <daijy@apache.org>
> > >> > >     >>>>
> > >> > >     >>>> === Nominated Mentors ===
> > >> > >     >>>> * Ashutosh Chauhan <hashutosh@apache.org>
> > >> > >     >>>> * Luke Han <luke.han@apache.org>
> > >> > >     >>>>
> > >> > >     >>>> === Sponsoring Entity ===
> > >> > >     >>>> Incubator PMC
> > >> > >     >>>>
> > >> > >     >>>
> > >> > >     >>>
> > >> > >     >>
> > >> > >
> > >> > >
> > >> > >     ---------------------------------------------------------------------
> > >> > >     To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > >> > >     For additional commands, e-mail: general-help@incubator.apache.org
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> >
> > >>
> >
> >
> >
> > --
> > Best Regards, Edward J. Yoon
> >
> >
> >
>
