hive-user mailing list archives

From John Pullokkaran <jpullokka...@hortonworks.com>
Subject Re: Orc file and Hive Optimiser
Date Sun, 19 Apr 2015 19:53:24 GMT
If you wish to contribute to CBO, current development work is being done on a dedicated CBO
branch and is tracked by HIVE-9132 (https://issues.apache.org/jira/browse/HIVE-9132).

Looking forward to your contributions.

Thanks
John

From: Mich Talebzadeh <mich@peridale.co.uk>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Sunday, April 19, 2015 at 12:48 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: RE: Orc file and Hive Optimiser

Thanks John,

I have already registered my interest in development work for Hive, so hopefully I will be
able to contribute at some level.

Regards,


Mich Talebzadeh

http://talebzadehmich.wordpress.com

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.
co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4
Publications due shortly:
Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly

NOTE: The information in this email is proprietary and confidential. This message is for the
designated recipient only, if you are not the intended recipient, you should destroy it immediately.
Any information in this message shall not be understood as given or endorsed by Peridale Ltd,
its subsidiaries or their employees, unless expressly so stated. It is the responsibility
of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd,
its subsidiaries nor their employees accept any responsibility.

From: John Pullokkaran [mailto:jpullokkaran@hortonworks.com]
Sent: 19 April 2015 20:37
To: user@hive.apache.org
Subject: Re: Orc file and Hive Optimiser

ORC format is transparent to CBO.
Currently we are working on a new cost model which might reflect ORC’s performance advantages
in optimization decisions.

Thanks
John

From: Mich Talebzadeh <mich@peridale.co.uk>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Sunday, April 19, 2015 at 12:32 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Orc file and Hive Optimiser

My understanding is that the Optimized Row Columnar (ORC) file format provides a highly efficient
way to store Hive data.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC


In a nutshell, columnar storage allows efficient compression of columns, on par with
what data warehouse databases such as Sybase IQ provide. In short, if a normal Hive table is
a "row-based implementation of the relational model", then ORC is the equivalent
"columnar implementation of the relational model".
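
To illustrate why a columnar layout tends to compress well, here is a minimal Python sketch.
It is not ORC's actual encoder: the sample rows and the run_length_encode helper are
hypothetical, chosen only to show that grouping values by column produces longer runs of
repeated values than storing them row by row. ORC's real encodings are more sophisticated.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs (toy encoder)."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Hypothetical sample table with columns (country, status).
rows = [
    ("UK", "open"),
    ("UK", "open"),
    ("UK", "closed"),
    ("US", "closed"),
    ("US", "closed"),
    ("US", "closed"),
]

# Row-oriented layout: the values of each row are stored next to each other.
row_layout = [value for row in rows for value in row]

# Column-oriented layout: all values of one column are stored together.
column_layout = [value for column in zip(*rows) for value in column]

row_runs = run_length_encode(row_layout)
column_runs = run_length_encode(column_layout)

print("runs in row layout:   ", len(row_runs))
print("runs in column layout:", len(column_runs))
```

In this toy data the row layout never has two equal neighbouring values, so run-length
encoding gains nothing, while the column layout collapses into a handful of runs.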

I find the ORC file format pretty interesting, as it performs more efficiently than
other Hive file formats (I am currently testing it). My only question is whether the Cost Based
Optimiser (CBO) of Hive is aware of the ORC storage format and treats the table accordingly.

Finally, this is more of a speculative question. If we have ORC files that provide good functionality,
is there any reason why one should deploy a columnar database such as HBase or Cassandra if
Hive can do the job as well?

Thanks,


Mich Talebzadeh


