hive-user mailing list archives

From "Mich Talebzadeh" <>
Subject RE: reading ORC format on Spark-SQL
Date Wed, 10 Feb 2016 21:01:16 GMT


Are you encountering an issue with an ORC file in Spark-SQL, as opposed to reading the same
ORC file with Hive on the Spark engine?


The only difference would be Spark's optimizer (Catalyst) processing the ORC file, compared
with the Hive optimiser doing the same thing.


Please clarify the underlying issue you are facing.




Dr Mich Talebzadeh


LinkedIn   <>




NOTE: The information in this email is proprietary and confidential. This message is for the
designated recipient only; if you are not the intended recipient, you should destroy it immediately.
Any information in this message shall not be understood as given or endorsed by Peridale Technology
Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility
of the recipient to ensure that this email is virus free; therefore neither Peridale Technology
Ltd, its subsidiaries nor their employees accept any responsibility.



From: Philip Lee [] 
Sent: 10 February 2016 20:39
Subject: reading ORC format on Spark-SQL


What steps are involved when reading the ORC format in Spark-SQL?

I mean that reading a CSV file is usually just loading the dataset directly into memory.


But I have the impression that Spark-SQL takes extra steps when reading the ORC format.

For example, does it have to create a table and then insert the dataset into it? Are these
the steps Spark-SQL performs when reading ORC?
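[For reference: in Spark-SQL an ORC file can be read directly into a DataFrame, with no table
creation or INSERT step required. A minimal spark-shell sketch, assuming a Spark 1.x-era API
(the `sqlContext` value is provided by the shell; the path and table name below are made up
for illustration):

```scala
// Read the ORC file straight into a DataFrame, just as you would a CSV.
// "/data/sales.orc" is a hypothetical example path.
val df = sqlContext.read.format("orc").load("/data/sales.orc")

// Registering a temporary table is optional, and only needed if you
// want to query the data with SQL rather than the DataFrame API:
df.registerTempTable("sales")
sqlContext.sql("SELECT COUNT(*) FROM sales").show()
```

The difference from CSV is in how the data is scanned, not in extra table/insert steps: ORC is a
columnar format, so Spark can read only the columns a query needs and use the file's built-in
statistics to skip row groups.]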


