Message-ID: <1363223561.19602.YahooMailNeo@web140606.mail.bf1.yahoo.com>
Date: Wed, 13 Mar 2013 18:12:41 -0700 (PDT)
From: lars hofhansl
Reply-To: lars hofhansl
Subject: Replication hosed after simple cluster restart
To: hbase-dev
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

We just ran into an interesting scenario. We restarted a cluster that was set up as a replication source.
The stop went cleanly.

Upon restart *all* regionservers aborted within a few seconds with variations of these errors:
http://pastebin.com/3iQVuBqS

This is scary!

-- Lars