incubator-general mailing list archives

From sebb <seb...@gmail.com>
Subject Re: [VOTE] Accept Lens into the Apache Incubator (earlier called Grill)
Date Thu, 16 Oct 2014 20:13:21 GMT
It looks like the version [1] of the Wiki page that was active at the
time of the vote was:

https://wiki.apache.org/incubator/LensProposal?rev=4

[1] https://wiki.apache.org/incubator/LensProposal?action=info

For the record, the current contents of that version are appended below:

-------------- cut here ------------

Lens

Abstract

Lens is a platform that enables multi-dimensional queries in a unified
way over datasets stored in multiple warehouses. Lens integrates
Apache Hive with other data warehouses by tiering them together to
form logical data cubes.

Proposal

Lens provides a unified Cube abstraction for data stored in different
stores. Lens tiers multiple data warehouses for unified representation
and efficient access. It provides a SQL-like Cube query language to
query and describe data sets organized in data cubes. It enables users
to run queries against Facts and Dimensions that can span multiple
physical tables stored in different stores.

The primary use cases that Lens aims to solve:

- Facilitate analytical queries by providing an OLAP-like Cube abstraction
- Data discovery by providing a single metadata layer for data stored in
  different stores
- Unified access to data by integrating Hive with other traditional data
  warehouses

Background

Apache Hive is a data warehouse that facilitates querying and managing
large datasets stored in distributed storage systems like HDFS. It
provides a SQL-like language called HiveQL (also known as HQL). Apache
Hive is widely used in various organizations for ad hoc analytical
queries. In a typical data warehouse scenario, the data is
multi-dimensional and organized into Facts and Dimensions to form Data
Cubes. Lens provides this logical layer to enable querying and managing
data as Cubes. The Lens project is actively being developed at InMobi
to provide a higher level of analytical abstraction for seamlessly
querying data stored in different stores, including Hive and beyond.

Rationale

The Lens project aims to ease analytical querying and break down data
silos by providing a single view of data across multiple data stores.
Conceiving data as a cube with hierarchical dimensions leads to
conceptually straightforward operations to facilitate analysis.
Integrating Apache Hive with other traditional warehouses provides the
opportunity to optimize query execution cost by tiering the data across
multiple warehouses. Lens provides:

- Access to data Cubes via a Cube query language similar to HiveQL.
- A driver-based architecture that allows plugging in systems like Hive
  and other warehouses such as columnar RDBMSs.
- Cost-based engine selection that provides optimal use of resources by
  selecting the best execution engine for a given query.

In a typical Data warehouse, data is organized in Cubes with multiple
dimensions and measures. This facilitates the analysis by conceiving
the data in terms of Facts and Dimensions instead of physical tables.
Lens aims to provide this logical Cube abstraction on Data warehouses
like Hive and other traditional warehouses.

Initial Goals

Donate the Lens source code and documentation to the Apache Software Foundation
Build a user and developer community
Support Hive and other Columnar data warehouses
Support full query life cycle management
Add authentication for querying cubes
Provide detailed query statistics

Long Term Goals

Here are some longer-term capabilities that would be added to Lens:

Add authorization for managing and querying Cubes
Provide REST and CLI for full Admin controls
Capability to schedule queries
Query caching
Integrate with Apache Spark, creating Spark RDDs from Lens queries
Integrate with Apache Optiq

Current Status

The project is actively developed at InMobi. The first version was
deployed at InMobi four months ago. This version allows querying
dimension and fact data stored in Hive over a CLI. The source code and
documentation are hosted on GitHub.

Meritocracy

We intend to build a diverse developer and user community for the
project following the Apache meritocracy model. We want to encourage
contributors from multiple organizations, provide plenty of support to
new developers and welcome them to be committers.

Community

Currently the project is being developed at InMobi. We hope to extend
our contributor and user base significantly in the future and build a
solid open source community around Lens.

Core Developers

Lens is currently being developed by Amareshwari Sriramadasu, Sharad
Agarwal and Jaideep Dhok from InMobi, and Sreekanth Ramakrishnan, who
is currently employed by SoftwareAG. Raghavendra Singh from InMobi has
built the QA automation for Lens.

Alignment

The ASF is a natural home for Lens, as it is for Apache Hadoop, Apache
Hive, Apache Spark and other emerging projects in the Big Data space.
We believe that in any enterprise, multiple data warehouses will
coexist, as not all workloads are cost-effective to run on a single
one. Apache Hive is one of the crucial data warehouses in the Hadoop
ecosystem, along with upcoming projects like Apache Spark. Lens will
benefit from working in close proximity with these projects. The
traditional columnar data warehouses complement Apache Hive, as certain
workloads continue to be cost-effective to run in traditional columnar
data warehouses. Having multiple data warehouses leads to data silos,
which Lens aims to break down within the enterprise by providing
holistic, unified access to data.

Known Risks

Orphaned products & Reliance on Salaried Developers

There is little risk of Lens getting orphaned, as Lens is a key part of
the Data Platform stack at InMobi. The core Lens developers plan to
work on it full-time. We think Lens will bring value in the Big Data
space and we plan to grow the community of users and contributors.

Inexperience with Open Source

All the core developers have long and significant experience in Apache
projects and the Hadoop ecosystem. Amareshwari Sriramadasu has
long-standing contributions to Apache Hadoop MapReduce and Apache Hive;
she is a PMC member of Hadoop and a committer of Hive. Sharad Agarwal
is a PMC member of Hadoop and has contributed to Hadoop YARN and Hadoop
MapReduce. Srikanth Sundarrajan is a PMC member of Apache Falcon.
Sreekanth Ramakrishnan is a committer of Apache Hadoop. Jaideep Dhok
has contributed patches to Apache Hive. Gunther Hagleitner is a PMC
member of Apache Hive. Vikram Dixit is a committer of Apache Hive.

Homogeneous Developers

The initial developers are employed by Hortonworks, InMobi and
SoftwareAG. We are committed to recruiting additional committers from
other companies based on their contribution to the project.

Reliance on Salaried Developers

The majority of initial committers are paid by their employer to
contribute to the project, and a few are contributing in their spare
time. Once the project has built a community, we are committed to
recruiting committers and developers from outside the current core
developers.

Relationships with Other Apache Products

Lens is deeply integrated with other Apache projects. Lens uses and
extends Apache Hive HCatalog to store and manage the Data cubes. It
uses HDFS and Hive session management libraries. Lens has a
driver-based architecture that allows for adding multiple execution
drivers. Apart from integrating Apache Hive, it can be integrated with
Apache Spark over Spark SQL or Shark, Apache Drill, Apache Tajo and
Apache Phoenix. In the future we want to use Apache Optiq in Lens for
query optimization and cost-based driver selection.

An Excessive Fascination with the Apache Brand

The project was conceived from the beginning to be in line with the
Apache philosophy. As the core developers have good experience with
Apache, the source code organization, build, review and commit
processes are highly influenced by Apache. We believe that Apache will
be a solid home for Lens to grow and build the open source community.
We have also described the reasons in the Rationale and Alignment
sections.

Documentation

http://inmobi.github.io/grill/

Initial Source

The source is currently in a GitHub repository at: https://github.com/inmobi/grill

Source and Intellectual Property Submission Plan

The complete Lens code is already under the Apache License, Version 2.0.

External Dependencies

The dependencies all have Apache-compatible licenses. These include
Apache 2.0, BSD, MIT, EPL and CDDL licensed dependencies.

Cryptography

None

Required Resources

Mailing lists

lens-dev AT incubator DOT apache DOT org
lens-commits AT incubator DOT apache DOT org
lens-private AT incubator DOT apache DOT org

Subversion Directory

Git is the preferred source control system: git://git.apache.org/incubator-lens

Issue Tracking

JIRA Lens (LENS)

Initial Committers

Amareshwari Sriramadasu (amareshwari AT apache DOT org)
Gunther Hagleitner (gunther AT apache DOT org)
Jaideep Dhok (jaideep.dhok AT Inmobi DOT com)
Raghavendra Singh (raghavendra.singh AT Inmobi DOT com)
Sharad Agarwal (sharad AT apache DOT org)
Sreekanth Ramakrishnan (sreekanth AT apache DOT org)
Srikanth Sundarrajan (sriksun AT apache DOT org)
Suma Shivaprasad (suma.shivaprasad AT Inmobi DOT com)
Vikram Dixit (vikram AT apache DOT org)

Affiliations

Amareshwari SR (InMobi)
Gunther Hagleitner (Hortonworks)
Jaideep Dhok (InMobi)
Raghavendra Singh (InMobi)
Sharad Agarwal (InMobi)
Sreekanth Ramakrishnan (SoftwareAG)
Srikanth Sundarrajan (InMobi)
Suma Shivaprasad (InMobi)
Vikram Dixit (Hortonworks)

Sponsors

Champion

Vinod K <vinodkv AT apache DOT org> (Apache Member)

Nominated Mentors

Chris Douglas (Microsoft)
Jacob Homan (Microsoft)
Jean Baptiste Onofre (Talend)
Vinod K (Hortonworks)

Sponsoring Entity

Incubator PMC

LensProposal (last edited 2014-09-23 08:22:15 by AmareshwariSriramadasu)


-------------- cut here ------------


On 6 October 2014 12:51, Sharad Agarwal <sharad@apache.org> wrote:
> Following the discussion earlier in the thread
> https://www.mail-archive.com/general@incubator.apache.org/msg45208.html
> I would like to call a Vote for accepting Lens as a new incubator project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/LensProposal
>
> Vote is open till Oct 09, 2014 4 PM PST.
>
>  [ ] +1 accept Lens in the Incubator
>  [ ] +/-0
>  [ ] -1 because...
>
> Only Votes from Incubator PMC members are binding, but all are welcome to
> express their thoughts.
> I am +1 (non-binding).
>
> Thanks
> Sharad


