Date: Tue, 27 Jan 2015 00:42:35 +0000 (UTC)
From: "Jason Altekruse (JIRA)"
To: dev@drill.apache.org
Subject: [jira] [Created] (DRILL-2077) Provide a clear starting point for new developers about what to start reading to learn about Drill

Jason Altekruse created DRILL-2077:
--------------------------------------

    Summary: Provide a clear starting point for new developers about what to start reading to learn about Drill
    Key: DRILL-2077
    URL: https://issues.apache.org/jira/browse/DRILL-2077
    Project: Apache Drill
    Issue Type: Improvement
    Reporter: Jason Altekruse
    Assignee: Jason Altekruse

As part of my package-level javadocs posted in DRILL-1904, I tried to document the root org.apache.drill.exec package. We should have some good information here, as well as in the markdown file in the git repo, about the best place to start reading the code to understand how Drill operates. Here is a description I started.

I think we want to make sure this is informative but concise. I want to get the rest of the package docs in, so I am leaving this here as a TODO. Please feel free to comment, revise, or add to this.

{code}
 * A good place to start learning about Drill is exploring the query plans. A
 * Drill physical plan is defined as a connected graph of operators that read
 * and manipulate data. Operators are configured by implementations of the
 * {@link PhysicalOperator} interface. These query graphs are translated into
 * a graph of physical operators that will actually process data at query
 * execution time. The connections between these nodes are materialized as
 * interfaces through which data is passed between operators. As Drill is
 * distributed, these connections can take the form of an RPC layer between
 * the nodes in a Drill cluster.
 *
 * While physical plans can be written by hand, the primary interface for Drill
 * is SQL. Drill targets compliance with the ANSI SQL 2003 specification.
 * Query parsing and optimization are handled by Calcite, an Apache incubator
 * project that is also used for planning in Apache Hive. Drill defines many
 * planning rules and optimizations that plug into the Calcite planning engine
 * to generate optimal plans for the Drill engine.
 *
 * Unlike most query systems, Drill is designed to query raw files without
 * a predefined catalog of metadata defining the types of data or columns
 * available in the dataset. To maintain performance in a flexible schema
 * environment, Drill uses runtime code generation to compile custom Java
 * code whenever operators receive notice of a schema change.
{code}
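
To make the "connected graph of operators" idea concrete for new readers, here is a minimal sketch of a two-node plan in which a filter pulls batches from a scan. Every name in it (PlanSketch, Operator, Scan, Filter, drain) is hypothetical and chosen for illustration; none of it is Drill's actual API, and the pull-based next() call is an assumption about how such a graph could be wired, not a description of Drill's execution model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative sketch only: these types are hypothetical and are NOT Drill classes.
public class PlanSketch {

    /** A node in the plan graph. Each call yields one batch, or null when exhausted. */
    interface Operator {
        List<Integer> next();
    }

    /** Leaf operator: hands out pre-chunked batches that stand in for file data. */
    static final class Scan implements Operator {
        private final List<List<Integer>> batches;
        private int position = 0;

        Scan(List<List<Integer>> batches) {
            this.batches = batches;
        }

        @Override
        public List<Integer> next() {
            return position < batches.size() ? batches.get(position++) : null;
        }
    }

    /** Interior operator: pulls a batch from its child and keeps matching values. */
    static final class Filter implements Operator {
        private final Operator child;
        private final IntPredicate predicate;

        Filter(Operator child, IntPredicate predicate) {
            this.child = child;
            this.predicate = predicate;
        }

        @Override
        public List<Integer> next() {
            List<Integer> in = child.next();
            if (in == null) {
                return null; // child is exhausted, so this operator is too
            }
            List<Integer> out = new ArrayList<>();
            for (int value : in) {
                if (predicate.test(value)) {
                    out.add(value);
                }
            }
            return out;
        }
    }

    /** Drains the root of the graph, concatenating every batch it produces. */
    static List<Integer> drain(Operator root) {
        List<Integer> result = new ArrayList<>();
        for (List<Integer> batch = root.next(); batch != null; batch = root.next()) {
            result.addAll(batch);
        }
        return result;
    }

    public static void main(String[] args) {
        Operator scan = new Scan(Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5, 6)));
        Operator plan = new Filter(scan, v -> v % 2 == 0);
        System.out.println(drain(plan)); // prints [2, 4, 6]
    }
}
```

In this sketch the edge between Scan and Filter is just an in-process method call. In a distributed setting, the same Operator interface could in principle be backed by a receiver that reads batches over an RPC connection, which is the role the description above assigns to connections between nodes in a cluster.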