Date: Mon, 18 Jun 2012 15:17:04 -0700
Subject: Re: RS unresponsive after series of deletes
From: Jean-Daniel Cryans
To: user@hbase.apache.org
Cc: Development

Mass deleting in HBase is equivalent to mass inserting; the only
difference is that deletes don't have to write values out (just keys).
Almost everything that applies to batch insert tuning applies to batch
deleting.

Now, the error you get comes from this:
https://issues.apache.org/jira/browse/HBASE-5190

What it means is that you have 1GB worth of _deletes_ sitting in the
region server call queue. That's way too much, something's wrong, and it
doesn't seem to be making progress.

Like Stack said in his reply, have you thread dumped the slow region
servers when this happens?

It would also help to see the log during that time. Try to capture a
good chunk of it and post it on pastebin like you did before.

Thx,

J-D

On Mon, Jun 18, 2012 at 3:08 PM, Ted Tuttle wrote:
> We had another of these delete-related RS hang ups. This time we are
> getting a different error on the client:
>
> java.io.IOException: Call queue is full, is
> ipc.server.max.callqueue.size too small?
>
> full stack here: http://pastebin.com/uq68Mvhm
>
> Looking at the RS log, it appears the RS was working on the batch delete
> for about 1hr. There are no errors in the RS log during this time.
> There are several "responseTooSlow" messages. Based on processingtimems
> values they all lead back to our big batch delete.
>
> Any theories on how a big batch of deletes could cause a RS to go
> unresponsive?
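
Since batch-insert tuning applies to deletes as well, one client-side option that may help keep a large delete from piling up in the call queue is to send it in bounded chunks rather than as one huge batch. The sketch below only illustrates the chunking: ChunkedDeletes and sendInChunks are hypothetical names, not HBase APIs, and the sink stands in for whatever code would turn each chunk of row keys into Delete objects and send them to the table. The right chunk size is an assumption that would have to be found by testing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of HBase: splits a large list of row keys
// into bounded chunks so that no single call carries the whole delete set.
final class ChunkedDeletes {
    private ChunkedDeletes() {}

    /**
     * Hands rowKeys to sink in chunks of at most chunkSize elements and
     * returns the number of chunks sent. In a real client the sink would
     * build the deletes for one chunk and send them to the table (assumption).
     */
    static <T> int sendInChunks(List<T> rowKeys, int chunkSize, Consumer<List<T>> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int start = 0; start < rowKeys.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, rowKeys.size());
            // Copy the slice so the sink can keep or modify it safely.
            sink.accept(new ArrayList<>(rowKeys.subList(start, end)));
            chunks++;
        }
        return chunks;
    }
}
```

Because each chunk is sent only after the previous call returns, the client cannot have more than one chunk's worth of deletes in flight per thread, which puts a rough ceiling on what it can add to the queue at once.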