From: Jinfeng Ni
To: dev@drill.apache.org
Date: Thu, 23 Apr 2015 15:56:18 -0700
Subject: Re: Should we make dir* columns only exist when requested?

I think the new proposal makes sense. It makes the behavior of SELECT * consistent: it returns only the regular columns in the table, regardless of how the table/file is specified in the query.

On Thu, Apr 23, 2015 at 2:56 PM, Jacques Nadeau wrote:

> I'm specifically arguing that SELECT * doesn't return the columns.
>
> Here is current behavior:
>
> /mytdir/mysdir/myfile.json
> {a:1,b:2,c:3}
> {a:4,b:5,c:6}
>
> select * from `myfile.json`
>
> a, b, c
> 1, 2, 3
> 4, 5, 6
>
> select * from `/mysdir/myfile.json`
>
> dir0 a, b, c
> mysdir, 1, 2, 3
> mysdir, 4, 5, 6
>
> select * from `/mytdir/mysdir/myfile.json`
>
> dir0, dir1 a, b, c
> mytdir, mysdir, 1, 2, 3
> mytdir, mysdir, 4, 5, 6
>
> ====================================
> My proposal:
>
> select * from `myfile.json`
> select * from `/mysdir/myfile.json`
> select * from `/mytdir/mysdir/myfile.json`
> ::all produce::
> a, b, c
> 1, 2, 3
> 4, 5, 6
>
> select dir0, a, b, c from `/mysdir/myfile.json`
>
> dir0 a, b, c
> mysdir, 1, 2, 3
> mysdir, 4, 5, 6
>
> select dir0, a, b, c from `/mytdir/mysdir/myfile.json`
>
> dir0 a, b, c
> mytdir, 1, 2, 3
> mytdir, 4, 5, 6
>
> On Thu, Apr 23, 2015 at 5:42 PM, Aman Sinha wrote:
>
> > Seems reasonable, as long as SELECT * also returns the dir# columns.
> >
> > On Thu, Apr 23, 2015 at 2:34 PM, Jacques Nadeau wrote:
> >
> > > Hey guys,
> > >
> > > I've been thinking that always showing dir# columns seems to alter data
> > > returned from Drill depending on how you select the directory. I'd propose
> > > that we make it so that we only return dir# columns when they are
> > > explicitly requested.
> > >
> > > Thoughts?
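
For reference, the two behaviors compared in this thread can be modeled in a few lines of Python. This is only an illustrative sketch, not Drill's implementation: the names `dir_columns`, `select`, and the `star_includes_dirs` flag are hypothetical, and the sketch assumes the dir# values come from the directory components of the path as written in the query, as in the examples quoted above.

```python
from pathlib import PurePosixPath


def dir_columns(table_path):
    """Map the directory components of the queried path to dir0, dir1, ...

    Hypothetical helper: it only looks at the path text used in the query.
    """
    parts = PurePosixPath(table_path).parts[:-1]  # drop the file name
    dirs = [p for p in parts if p != "/"]         # drop the leading root
    return {f"dir{i}": name for i, name in enumerate(dirs)}


def select(table_path, records, columns="*", star_includes_dirs=False):
    """Model a SELECT over in-memory records.

    star_includes_dirs=True models the current behavior (SELECT * adds dir#).
    star_includes_dirs=False models the proposal (dir# only when requested).
    """
    dirs = dir_columns(table_path)
    rows = []
    for rec in records:
        if columns == "*":
            row = {**dirs, **rec} if star_includes_dirs else dict(rec)
        else:
            merged = {**dirs, **rec}
            row = {name: merged.get(name) for name in columns}
        rows.append(row)
    return rows


records = [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}]

# Proposed behavior: SELECT * is the same however the file is addressed.
for path in ("myfile.json", "/mysdir/myfile.json", "/mytdir/mysdir/myfile.json"):
    print(path, select(path, records))

# dir0 still appears when it is named explicitly.
print(select("/mysdir/myfile.json", records, ["dir0", "a", "b", "c"]))
```

Under the proposal the three SELECT * calls return identical rows, while naming dir0 in the column list still yields the directory value.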