From: Ashutosh Chauhan
Date: Tue, 12 Apr 2016 14:48:59 -0700
Subject: Re: recent metastore failures in HiveQA
To: "dev@hive.apache.org"

I can reproduce the TestJdbcWithMiniHS2 hang locally,
and I also saw it hanging on recent QA runs:
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7556/failed/TestJdbcWithMiniHS2/

Wondering if you guys have seen this?

Thanks,
Ashutosh

On Tue, Apr 12, 2016 at 11:28 AM, Szehon Ho wrote:

> Thanks Thejas for this patch!
>
> I'm also going to restart PTest and force recreation of the test slaves
> based on a fresh image to see if it resolves the issue (in case the test
> slaves are getting too loaded and slow to start HMS in time). If not, then
> Thejas's patch should tell us a bit more.
>
> On Tue, Apr 12, 2016 at 12:35 AM, Thejas Nair wrote:
>
> > Created a patch that should hopefully help in figuring out what's going
> > on - https://issues.apache.org/jira/browse/HIVE-13491
> >
> > On Wed, Apr 6, 2016 at 1:56 PM, Szehon Ho wrote:
> > > Yea, thanks for pointing it out. I see it too and am not able to
> > > reproduce it locally. It points to an environment issue, but I'm not
> > > aware of anything that changed with the environment.
> > >
> > > Anyone have any ideas?
> > >
> > > On Wed, Apr 6, 2016 at 1:29 PM, Sergey Shelukhin <sergey@hortonworks.com>
> > > wrote:
> > >
> > >> Has anyone else noticed that many tests that involve metastore started
> > >> failing lately?
> > >> The failures are sporadic and happen both in the tests
> > >> that test metastore, and q files that use metastore…
> > >> The error is always something like
> > >> java.net.ConnectException: Connection refused
> > >>         at java.net.PlainSocketImpl.socketConnect(Native Method)
> > >>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> > >>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
> > >>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> > >>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> > >>         at java.net.Socket.connect(Socket.java:579)
> > >>         at org.apache.hadoop.hive.metastore.MetaStoreUtils.loopUntilHMSReady(MetaStoreUtils.java:1208)
> > >>         at org.apache.hadoop.hive.metastore.MetaStoreUtils.startMetaStore(MetaStoreUtils.java:1195)
> > >>         at org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.before(TestMetaStoreMetrics.java:54)
> > >>
> > >> I wonder if someone has insight on whether this is an environment issue,
> > >> or someone broke something recently, before we investigate more :)
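
For context on the trace above: the refused connection surfaces inside MetaStoreUtils.loopUntilHMSReady, called from startMetaStore, which suggests the test polls the metastore port until it accepts connections. Below is a minimal sketch of that kind of wait-until-ready loop. It is not Hive's actual implementation; PortWaiter, waitForPort, and all of its parameters are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not Hive code): polls a TCP port until it accepts connections. */
final class PortWaiter {
    private PortWaiter() {}

    /**
     * Tries to connect up to maxAttempts times, sleeping sleepMillis between
     * failed attempts. Returns true as soon as one connect succeeds, and
     * false if every attempt fails (for example with "Connection refused").
     */
    static boolean waitForPort(String host, int port, int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (Socket socket = new Socket()) {
                // 1000 ms connect timeout per attempt (an arbitrary choice for this sketch).
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true; // the server is listening
            } catch (IOException e) {
                // java.net.ConnectException is an IOException; the server is
                // probably still starting, so back off and try again.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        return false;
    }
}
```

With a loop like this, a heavily loaded test slave that starts the server slower than maxAttempts * sleepMillis would fail with exactly this kind of ConnectException, which would be consistent with the environment theory discussed in the thread. A caller could use something like PortWaiter.waitForPort("localhost", port, 50, 1000L) and fail the test setup with a clear message when it returns false.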