hive-user mailing list archives

From "Tzur Turkenitz" <>
Subject RE: Hive Architecture - Execution on nodes
Date Mon, 22 Jul 2013 14:30:50 GMT
Thank you Alan.

-----Original Message-----
From: Alan Gates [] 
Sent: Thursday, July 18, 2013 5:45 PM
Subject: Re: Hive Architecture - Execution on nodes

On Jul 18, 2013, at 1:40 PM, Tzur Turkenitz wrote:

> Hello,
> Just finished reading the Hive-Architecture pdf, and failed to find the
> answers I was hoping for. So here I am, hoping this community will shed some
> light.
> I think I know what the answers will be, I need that bolted down and [...]
> We are concerned about how data is transferred between data-nodes and Hive,
> especially when it comes to clusters where there's no SSL between nodes.
> And this is the use case:
> 1. Table employee is a Hive table, with SerDe
> 2. MapReduce job accesses the table Employees which holds Encrypted [...]
> 3. SerDe decrypts the data
> 4. Post-SerDe output is returned to the MapReduce job and saved to a
>    new Hive table using a new Encryption implementation
> The flow, as I think it currently is, should be:
> MapReduce Job --> Read table metadata --> SerDe creates map-reduce job
> --> distributes across nodes
> Which means that data is decrypted on the local nodes and then sent in
> clear-text back to the original map-reduce job to be saved in a new table.
> Is that correct? :(

No.  Data deserialization (which is what a serde does, not decryption) is
done as part of reading in the map reduce job.  Mainly, only query parsing,
validation, and planning are done on the client node.
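
To make that split concrete, here is a toy sketch in Python. It is not Hive
code and does not use any Hive API: client_plan, deserialize, map_task, the
node names, and the sample records are all hypothetical, chosen only to show
that the client produces a plan without touching row data, while
deserialization runs inside each map task on the node that reads the split.

```python
# Toy model only, NOT Hive code. Every name below is hypothetical.

def client_plan(query):
    """Client side: parse, validate, and plan. No row data is read here."""
    table = query.split("FROM")[1].split()[0]
    return {"table": table, "deserializer": "csv"}

def deserialize(raw_bytes):
    """Stand-in for a SerDe: turn stored bytes into a row of columns."""
    return raw_bytes.decode("utf-8").split(",")

def map_task(plan, split):
    """Runs on the node that reads the split; deserialization happens here."""
    return [deserialize(record) for record in split]

# Assumed sample data: raw records as stored on two nodes.
splits = {
    "node-1": [b"1,alice", b"2,bob"],
    "node-2": [b"3,carol"],
}

plan = client_plan("SELECT * FROM employee")
# Each node deserializes its own split as part of reading it.
results = {node: map_task(plan, split) for node, split in splits.items()}
```

In this model the client only ever holds the plan; the rows exist in
deserialized form only inside the map tasks.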

