From: fab wol <darkwolli32@gmail.com>
Date: Thu, 27 Mar 2014 23:36:32 +0100
To: user@hive.apache.org
Subject: Re: MSCK REPAIR TABLE

Hey Stephen,

thanks for the advice, but as I wrote in my first post, I wanted to do that anyway. Thanks also for the explanation of why this is indeed the best way to go for a production system ...

Cheers
Wolli

On 27.03.2014 16:05, Stephen Sprague wrote:
FWIW, I would not have the REPAIR TABLE statement as part of a production job stream. That's kind of a poor man's way to employ dynamic partitioning off the back end.

Why not either use Hive's dynamic partitioning features or pre-declare your partitions? That way you are explicitly coding for your purpose, rather than running a general REPAIR TABLE on the back end knowing you "broke it" up front.
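Both alternatives can be sketched in HiveQL; the table, column, and partition names below are hypothetical placeholders, not from the original thread:

```sql
-- Option 1: pre-declare each partition explicitly before loading into it
-- (my_table / dt are hypothetical names)
ALTER TABLE my_table ADD IF NOT EXISTS PARTITION (dt='2014-03-27');

-- Option 2: let Hive create the partitions dynamically at insert time
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE my_table PARTITION (dt)
SELECT col_a, col_b, dt FROM staging_table;
```

Either way the metastore is updated as part of the load itself, so no repair pass is needed afterwards.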

just a suggestion!


On Thu, Mar 27, 2014 at 3:18 AM, fab wol <darkwolli32@gmail.com> wrote:
Hey Nitin and everyone else,

From memory, the Hive CLI error was much the same as the Beeline error and just as uninformative, so there would have been no uplift there.

I restarted the cluster (it is a cloud cluster provided by http://www.unbelievable-machine.net) to get the HiveServer2 log and to be sure that everything is set up correctly. During this, all TaskTrackers are deleted and newly set up (HDFS and storage are not touched at all, nor are the configs). After that, the MSCK REPAIR TABLE statement runs fine, and it is actually not as slow as I thought it might be (ca. 110 secs per table). I guess some logs/tmp/cache data had stacked up, and that might have caused the errors ...

Slightly confusing, but I will post here if I find out in the future what exactly was throwing the error ...

Cheers for the help
Wolli


2014-03-27 11:03 GMT+01:00 Nitin Pawar <nitinpawar432@gmail.com>:

Without an error stack, it is very hard to tell what's wrong.

Would it be possible for you to run it via the Hive CLI and grab some logs there?


On Thu, Mar 27, 2014 at 3:29 PM, fab wol <darkwolli32@gmail.com> wrote:
Hey Nitin,

The HiveServer2 log unfortunately says nothing:

Mon Mar 24 17:41:18 CET 2014 hiveserver2 stopped, pid 2540
Mon Mar 24 17:43:22 CET 2014 hiveserver2 started, pid 2554
Hive history file=/tmp/mapr/hive_job_log_97715747-63cd-4789-9b2e-a8b0d544cdf9_2102956370.txt
OK
Thu Mar 27 10:52:48 CET 2014 hiveserver2 stopped, pid 2554
Thu Mar 27 10:55:52 CET 2014 hiveserver2 started, pid 2597

Cheers
Wolli


2014-03-27 10:04 GMT+01:00 Nitin Pawar <nitinpawar432@gmail.com>:

Can you grab more logs from the HiveServer2 log file?


On Thu, Mar 27, 2014 at 2:31 PM, fab wol <darkwolli32@gmail.com> wrote:
Hey everyone,

I have a table with currently 5541 partitions, and 14 partitions are added daily. I will switch the metastore update from "msck repair table" to "alter table add partition", since it performs better, but that may sometimes fail, and then I need the "msck repair table" command. Unfortunately, it seems to no longer work at this table size:

0: jdbc:hive2://clusterXYZ-> use <DB_NAME>;
No rows affected (1.082 seconds)
0: jdbc:hive2://clusterXYZ-> set hive.metastore.client.socket.timeout=6000;
No rows affected (0.029 seconds)
0: jdbc:hive2://clusterXYZ-> MSCK REPAIR TABLE <TABLENAME>;
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)

Has anyone had luck getting this to work? As you can see, I already raised the time until the Thrift timeout kicks in, but this error happens even before that time runs out ...
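For reference, the explicit registration mentioned above could look like this; Hive allows several PARTITION clauses in a single statement, so all 14 daily partitions can be added in one call (table, partition key, and paths are hypothetical):

```sql
-- Register only the new partitions instead of re-scanning all 5541
ALTER TABLE my_table ADD IF NOT EXISTS
  PARTITION (dt='2014-03-27', part=1) LOCATION '/data/my_table/dt=2014-03-27/part=1'
  PARTITION (dt='2014-03-27', part=2) LOCATION '/data/my_table/dt=2014-03-27/part=2';
```

MSCK REPAIR TABLE, by contrast, has to walk the whole table location and compare it against the metastore, which is why it degrades as the partition count grows.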

Cheers
Wolli



--
Nitin Pawar




--
Nitin Pawar


