From: "vikas srivastava"
Reply-To: user@hive.apache.org
Subject: RE: Tables not accessible after restarting Amazon EC2 instances
Date: Thu, 13 Oct 2011 11:05:31 +0530
In-Reply-To: <4F6B25AFFFCAFE44B6259A412D5F9B10395A6C29@ExchMBX104.netflix.com>
Message-ID: <4e9678dc.64d1440a.24bd.1500@mx.google.com>

Hi

 

Did you update /etc/hosts with the new hostnames on all the DataNodes and on the Hive server?
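One quick way to check this is sketched below. It is only a rough illustration: the function name `missing_hosts`, the sample addresses, and the hostnames are all hypothetical, and it works on hosts-file text passed in as a string rather than reading /etc/hosts itself.

```python
def missing_hosts(hosts_text, required):
    """Return the names in `required` that have no entry in the hosts-file text."""
    known = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = line.split()
        known.update(fields[1:])  # first field is the IP, the rest are names
    return [name for name in required if name not in known]


# Hypothetical hosts-file content for a 3-node cluster after a restart.
sample = """
127.0.0.1   localhost
10.0.0.11   namenode-new    # hypothetical master
10.0.0.12   datanode1-new
"""

print(missing_hosts(sample, ["namenode-new", "datanode1-new", "datanode2-new"]))
```

Running the same check on every node (with the real file contents) would show which machines still lack an entry for a renamed host.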

 

With Regards
Vikas Srivastava

DWH & Analytics Team
M: +91 9560885900
P: + 91 120 4770102
Email: vikas.srivastava@one97.net
W: www.one97world.com

One97 | Let's get talking!

 

From: Steven Wong [mailto:swong@netflix.com]
Sent: Thursday, October 13, 2011 5:30 AM
To: user@hive.apache.org
Subject: RE: Tables not accessible after restarting Amazon EC2 instances

 

Why not change the old hostnames in the metadata? You can’t do that via Hive DDL; you’d have to change them in the metadata store directly.

 

It’ll be interesting to know if that fixes the rest of the problem.
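A minimal sketch of what such a direct metadata edit might look like follows. It rests on several assumptions: the table name `SDS` and column `LOCATION` are modeled on a typical metastore schema and should be verified against the actual metastore database, an in-memory SQLite database stands in for the real store, and the hostnames, port, and path are hypothetical. Back up the metastore before trying anything like this on real data.

```python
import sqlite3
from urllib.parse import urlsplit, urlunsplit


def rewrite_host(location, old_host, new_host):
    """Swap old_host for new_host in a URI's authority, keeping port and path.

    Any user-info part of the URI is dropped; HDFS locations normally have none.
    """
    parts = urlsplit(location)
    if parts.hostname != old_host.lower():
        return location  # leave locations on other hosts untouched
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Stand-in for the metastore: one table holding storage locations (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT)")
conn.execute(
    "INSERT INTO SDS VALUES (1, 'hdfs://old-host:9000/user/hive/warehouse/mytable')"
)

# Rewrite every stored location that still points at the old host.
rows = conn.execute("SELECT SD_ID, LOCATION FROM SDS").fetchall()
for sd_id, location in rows:
    conn.execute(
        "UPDATE SDS SET LOCATION = ? WHERE SD_ID = ?",
        (rewrite_host(location, "old-host", "new-host"), sd_id),
    )
conn.commit()

print(conn.execute("SELECT LOCATION FROM SDS").fetchall())
```

Parsing the URI instead of doing a plain string replace avoids touching paths that merely contain the old hostname as a substring.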

 

 

From: Agarwal, Ravindra (ASG) [mailto:Ravindra_Agarwal@syntelinc.com]
Sent: Tuesday, October 11, 2011 7:35 AM
To: user@hive.apache.org
Subject: Tables not accessible after restarting Amazon EC2 instances

 

Setup: I have set up a Hadoop cluster of 3 machines on Amazon EC2. Hive is running on top of it.

 

Problem: Tables created on Hive are not accessible after Amazon EC2 instances are restarted (in other words, after the hosts in the cluster get renamed).

 

Details:

 

A table is created on the above Hive setup and loaded with data. After that, the Amazon EC2 instances are stopped at the end of the day. The next day, when the instances are restarted and the table is queried (something like select * from mytable), the query processor tries to connect to the HDFS nodes and retrieve data using the old hostnames (please note that Amazon instances acquire a new hostname and IP address when they are restarted). Since the hostnames have changed, it fails to connect and throws an error.

 

I update the masters and slaves files with the new hostnames, edit mapred-site.xml and core-site.xml for the new hostnames, and then run the start-all.sh script to start all the Hadoop processes. It seems Hive stores the old hostname somewhere in the table metadata, so it tries to read from that old hostname and throws an error when the hostname is not found.
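The config edit described above could be scripted roughly as follows. This is a sketch under assumptions: it supposes the NameNode address is set through the `fs.default.name` property in core-site.xml, the helper name `set_property` is made up for illustration, and the hostnames and port are hypothetical. It operates on XML text rather than on the file itself.

```python
import xml.etree.ElementTree as ET


def set_property(xml_text, name, value):
    """Return the Hadoop-style config XML with property `name` set to `value`."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:  # property present but without a <value>
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        raise KeyError(name)  # property not found in this file
    return ET.tostring(root, encoding="unicode")


# Hypothetical core-site.xml content pointing at the pre-restart hostname.
core_site = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://old-host:9000</value>
  </property>
</configuration>"""

updated = set_property(core_site, "fs.default.name", "hdfs://new-host:9000")
print(updated)
```

The same helper could be pointed at mapred-site.xml for whichever property there carries the old hostname.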

 

Kindly let me know if there is any solution to this problem or any Hive patch available to fix it.

 

Regards,

Ravi

 

 

Confidential: This electronic message and all contents contain information from Syntel, Inc. which may be privileged, confidential or otherwise protected from disclosure. The information is intended to be for the addressee only. If you are not the addressee, any disclosure, copy, distribution or use of the contents of this message is prohibited. If you have received this electronic message in error, please notify the sender immediately and destroy the original message and all copies.
