Date: Thu, 28 Mar 2013 22:48:05 -0700
Subject: Re: Optimizing hive queries
From: "Owen O'Malley"
To: user@hive.apache.org

Actually, Hive already has the ability to have different schemas for different partitions. (Although of course it would be nice to have the alter table be more flexible!)

The "versioned metadata" means that the ORC file's metadata is stored in ProtoBufs so that we can add (or remove) fields to the metadata. That means that for some changes to the ORC file format we can provide both forward and backward compatibility.

-- Owen

On Thu, Mar 28, 2013 at 10:25 PM, Jagat Singh wrote:

> Hello Nitin,
>
> Thanks for sharing.
>
> Do we have more details on the versioned metadata feature of ORC? Is it
> like handling varying schemas in Hive?
>
> Regards,
>
> Jagat Singh
>
> On Fri, Mar 29, 2013 at 4:16 PM, Nitin Pawar wrote:
>
>> Hi,
>>
>> Here is a nice presentation from Owen from Hortonworks on "Optimizing
>> hive queries"
>>
>> http://www.slideshare.net/oom65/optimize-hivequeriespptx
>>
>> Thanks,
>> Nitin Pawar