From: jeff saremi
To: hbase-user, "dev@hbase.apache.org"
Subject: Re: What is Dead Region Servers and how to clear them up?
Date: Fri, 26 May 2017 18:03:55 +0000

Thank you for the GFY answer.

And I guess to figure out how to fix these I can always go through the HBase source code.

________________________________
From: Dima Spivak
Sent: Friday, May 26, 2017 9:58:00 AM
To: hbase-user
Subject: Re: What is Dead Region Servers and how to clear them up?

Sending this back to the user mailing list.

RegionServers can die for many reasons. Looking at your RegionServer log
files should give hints as to why it's happening.

-Dima

On Fri, May 26, 2017 at 9:48 AM, jeff saremi wrote:

> I had posted this to the user mailing list and I have not got any direct
> answer to my question.
>
> Where do dead RS's come from and how can they be cleaned up? Someone in
> the midst of developers should know this.
>
> thanks
>
> Jeff
>
> ________________________________
> From: jeff saremi
> Sent: Thursday, May 25, 2017 10:23:17 AM
> To: user@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> I'm still looking to get hints on how to remove the dead regions. thanks
>
> ________________________________
> From: jeff saremi
> Sent: Wednesday, May 24, 2017 12:27:06 PM
> To: user@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> I'm trying to eliminate the dead region servers.
>
> ________________________________
> From: Ted Yu
> Sent: Wednesday, May 24, 2017 12:17:40 PM
> To: user@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> bq. running hbck (many times
>
> Can you describe the specific inconsistencies you were trying to resolve?
> Depending on the inconsistencies, advice can be given on the best known
> hbck command arguments to use.
>
> Feel free to pastebin master log if needed.
>
> On Wed, May 24, 2017 at 12:10 PM, jeff saremi
> wrote:
>
> > these are the things I have done so far:
> >
> >
> > - restarting master (few times)
> >
> > - running hbck (many times; this tool does not seem to be doing anything
> > at all)
> >
> > - checking the list of region servers in ZK (none of the dead ones are
> > listed here)
> >
> > - checking the WALs under /WALs. Out of 11 dead ones only 3
> > are listed here with "-splitting" at the end of their names and they
> > contain one single file like: 1493846660401..meta.1493922323600.meta
> >
> >
> >
> >
> > ________________________________
> > From: jeff saremi
> > Sent: Wednesday, May 24, 2017 9:04:11 AM
> > To: user@hbase.apache.org
> > Subject: What is Dead Region Servers and how to clear them up?
> >
> > Apparently having dead region servers is so common that a section of the
> > master console is dedicated to that?
> > How can we clean this up (preferably in an automated fashion)?
> > Why isn't this being done by HBase automatically?
> >
> >
> > thanks
> >
>
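
For readers repeating the WAL check described in the quoted message, here is a minimal Python sketch of how leftover "-splitting" entries could be picked out of a directory listing. It assumes WAL directory names follow the `host,port,startcode` form with an optional `-splitting` suffix; the function names, host names, and sample values are hypothetical and are not taken from the thread or from HBase itself. The sketch only classifies names and does not touch a cluster.

```python
# Hypothetical helper: all names and sample values here are illustrative only.
SPLITTING_SUFFIX = "-splitting"


def parse_wal_dir(name):
    """Split a WAL directory name into (host, port, startcode, is_splitting).

    Assumes the name looks like "host,port,startcode" with an optional
    "-splitting" suffix.
    """
    is_splitting = name.endswith(SPLITTING_SUFFIX)
    if is_splitting:
        name = name[: -len(SPLITTING_SUFFIX)]
    # Split from the right so a host name containing commas would not break parsing.
    host, port, startcode = name.rsplit(",", 2)
    return host, int(port), int(startcode), is_splitting


def leftover_splitting_dirs(wal_dirs, live_servers):
    """Return server names whose WAL directory is marked -splitting
    and which are not in the set of live servers."""
    leftovers = []
    for directory in wal_dirs:
        host, port, startcode, is_splitting = parse_wal_dir(directory)
        server = f"{host},{port},{startcode}"
        if is_splitting and server not in live_servers:
            leftovers.append(server)
    return leftovers


if __name__ == "__main__":
    # Made-up listing: one directory still marked as splitting, one live server.
    sample_dirs = [
        "rs1.example.com,16020,1493846660401-splitting",
        "rs2.example.com,16020,1493846660999",
    ]
    live = {"rs2.example.com,16020,1493846660999"}
    for server in leftover_splitting_dirs(sample_dirs, live):
        print("leftover splitting WAL dir for:", server)
```

Comparing such a list against the dead servers shown on the master console may help narrow down which RegionServer and master logs are worth reading first.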