hive-user mailing list archives

From "vikas srivastava" <vikas.srivast...@one97.net>
Subject RE: Tables not accessible after restarting Amazon EC2 instances
Date Thu, 13 Oct 2011 05:35:31 GMT
Hi 

 

Did you update /etc/hosts with the new hostnames on all the DataNodes and on
the Hive server?

 

With Regards
Vikas Srivastava

DWH & Analytics Team
M: +91 9560885900
P: + 91 120 4770102
Email: vikas.srivastava@one97.net
W: www.one97world.com

One97 | Let's get talking ! 

 

From: Steven Wong [mailto:swong@netflix.com] 
Sent: Thursday, October 13, 2011 5:30 AM
To: user@hive.apache.org
Subject: RE: Tables not accessible after restarting Amazon EC2 instances

 

Why not change the old hostnames in the metadata? You can't do that via Hive
DDL; you'd have to make the change in the metadata store directly.
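
A minimal sketch of what such a direct change might look like, under stated
assumptions: it assumes the table locations are kept as full HDFS URIs in a
LOCATION column of a metastore table named SDS, and it uses an in-memory
SQLite database as a stand-in for the real metastore database. The hostnames,
port, and the rewrite_locations helper are hypothetical. Back up the metastore
before trying anything like this against a real one.

```python
import sqlite3


def rewrite_locations(conn, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix at the start of each SDS.LOCATION value.

    Returns the number of rows that were changed.
    """
    cur = conn.execute(
        # substr() is used instead of LIKE so that characters such as '_' or '%'
        # in the prefix are matched literally.
        "UPDATE SDS SET LOCATION = ? || substr(LOCATION, ?) "
        "WHERE substr(LOCATION, 1, ?) = ?",
        (new_prefix, len(old_prefix) + 1, len(old_prefix), old_prefix),
    )
    conn.commit()
    return cur.rowcount


# Stand-in metastore: one table whose location still points at the old host.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT)")
conn.execute(
    "INSERT INTO SDS (LOCATION) VALUES (?)",
    ("hdfs://old-host.example:9000/user/hive/warehouse/mytable",),
)

changed = rewrite_locations(
    conn,
    "hdfs://old-host.example:9000",  # hypothetical old NameNode URI
    "hdfs://new-host.example:9000",  # hypothetical new NameNode URI
)
locations = [row[0] for row in conn.execute("SELECT LOCATION FROM SDS")]
print(changed, locations)
```

Only the scheme-and-host prefix is replaced, so the warehouse path after it is
left untouched.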

 

It'll be interesting to know if that fixes the rest of the problem.

 

 

From: Agarwal, Ravindra (ASG) [mailto:Ravindra_Agarwal@syntelinc.com] 
Sent: Tuesday, October 11, 2011 7:35 AM
To: user@hive.apache.org
Subject: Tables not accessible after restarting Amazon EC2 instances

 

Setup: I have set up a Hadoop cluster of 3 machines on Amazon EC2. Hive is
running on top of it.

 

Problem: Tables created on Hive are not accessible after Amazon EC2
instances are restarted (in other words - after the hosts in the cluster get
renamed).

 

Details:

 

A table is created on the above Hive setup and loaded with data. After that,
the Amazon EC2 instances are stopped at the end of the day. The next day, when
the instances are restarted and the table is queried (something like select *
from mytable), the query processor tries to connect to and retrieve data from
the HDFS nodes using the old hostnames (please note that Amazon instances
acquire a new hostname and IP address when they are restarted). Since the
hostname has now changed, it fails to connect and throws an error.

 

I update the masters and slaves files with the new hostnames, edit
mapred-site.xml and core-site.xml for the new hostnames, and then run the
start-all.sh script to start all the Hadoop processes. It seems Hive stores
the old hostname somewhere in the table metadata, so it tries to read from
that old hostname and throws an error when that hostname is not found.

 

Kindly let me know if there is any solution to this problem or any Hive
patch available to fix it. 

 

Regards,

Ravi

 

 

Confidential: This electronic message and all contents contain information
from Syntel, Inc. which may be privileged, confidential or otherwise
protected from disclosure. The information is intended to be for the
addressee only. If you are not the addressee, any disclosure, copy,
distribution or use of the contents of this message is prohibited. If you
have received this electronic message in error, please notify the sender
immediately and destroy the original message and all copies.

