Subject: Re: Bulk loading job failed when one region server went down in the cluster
From: "Kevin O'dell"
To: user@hbase.apache.org
Date: Fri, 30 Mar 2012 21:05:50 -0400

Anil,

Can you please attach the RS logs from the failure?

On Fri, Mar 30, 2012 at 7:05 PM, anil gupta wrote:
> Hi All,
>
> I am using cdh3u2 and I have 7 worker nodes (VMs spread across two
> machines), each running a Datanode, Tasktracker, and Region Server (1200
> MB heap size). I was loading data into HBase using the Bulk Loader with a
> custom mapper. I was loading around 34 million records, and I have loaded
> the same set of data in the same environment many times before without any
> problem. This time, while loading the data, one of the region servers
> failed (the DN and TT kept running on that node), and then, after numerous
> map-task failures, the loading job failed. Is there any
> setting/configuration which can make Bulk Loading fault-tolerant to the
> failure of region servers?
>
> --
> Thanks & Regards,
> Anil Gupta

--
Kevin O'Dell
Customer Operations Engineer, Cloudera