Subject: Re: HBase bulk load through co-processors
From: Sever Fundatureanu
To: user@hbase.apache.org
Date: Mon, 2 Jul 2012 20:58:42 +0200

I agree that increasing the timeout is not the best option; I will work on balancing the load better and maybe on doing the load in increments like you suggested. However, for now I want a quick fix to the problem.

Just to see if I understand this right: a ZooKeeper node redirects my client to a region server, and my client then talks directly to that region server; so the timeout happens on the client while it is talking to the RS, right? It expects some kind of confirmation and it times out. If this is the case, how can I increase this timeout? The only setting I found in the documentation is "zookeeper.session.timeout", which is the timeout between ZooKeeper and HBase.

Thanks,
Sever

On Mon, Jul 2, 2012 at 8:19 PM, Jean-Marc Spaggiari wrote:
> Hi Sever,
>
> It seems one of the nodes in your cluster is overwhelmed with the load
> you are giving it.
>
> So IMO, you have two options here:
> First, you can try to reduce the load. I mean, split the bulk into
> multiple smaller bulks and load them one by one, to give your cluster
> the time to dispatch each one correctly.
> Second, you can increase the timeout from 60s to 120s. But you might
> face the same issue with 120s, so I really recommend the first option.
>
> JM
>
> 2012/7/2, Sever Fundatureanu :
> > Can someone please help me with this?
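[Editor's note: to answer the question above, the 60000 ms socket timeout in the stack trace further down the thread is the client-side RPC timeout, not the ZooKeeper session timeout. In HBase of this era that is governed by the "hbase.rpc.timeout" property (default 60000 ms), set in hbase-site.xml on the client running the bulk load. A sketch, to be verified against the docs for your HBase version:]

```xml
<!-- hbase-site.xml on the client that runs LoadIncrementalHFiles.
     Assumption: the 60000 ms timeout in the stack trace is the client
     RPC timeout, controlled by hbase.rpc.timeout (default 60000 ms).
     Verify the property name against your HBase version's docs. -->
<property>
  <name>hbase.rpc.timeout</name>
  <value>120000</value> <!-- 120 s, per JM's suggestion below -->
</property>
```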
> >
> > Thanks,
> > Sever
> >
> > On Tue, Jun 26, 2012 at 8:14 PM, Sever Fundatureanu <
> > fundatureanu.sever@gmail.com> wrote:
> >
> >> My keys are built of 4 8-byte IDs. I am currently doing the load with MR
> >> but I get a timeout when doing the LoadIncrementalHFiles call:
> >>
> >> 12/06/24 21:29:01 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> >> unrecoverable error from region server
> >> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> >> attempts=10, exceptions:
> >> Sun Jun 24 21:29:01 CEST 2012,
> >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@4699ecf9,
> >> java.net.SocketTimeoutException: Call to das3002.cm.cluster/10.141.0.79:60020
> >> failed on socket timeout exception: java.net.SocketTimeoutException: 60000
> >> millis timeout while waiting for channel to be ready for read. ch :
> >> java.nio.channels.SocketChannel[connected local=/10.141.0.254:43240
> >> remote=das3002.cm.cluster/10.141.0.79:60020]
> >>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
> >>   at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:487)
> >>   at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:275)
> >>   at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:273)
> >>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >>   at java.lang.Thread.run(Thread.java:662)
> >> 12/06/24 21:30:52 ERROR mapreduce.LoadIncrementalHFiles: Encountered
> >> unrecoverable error from region server
> >>
> >> Is there a way in which I can increase the timeout period?
> >>
> >> Thank you,
> >>
> >> On Tue, Jun 26, 2012 at 7:05 PM, Andrew Purtell wrote:
> >>
> >>> On Tue, Jun 26, 2012 at 9:56 AM, Sever Fundatureanu wrote:
> >>> > I have to bulk load 6 tables which contain the same information but
> >>> > with a different order, to cover all possible access patterns. Would
> >>> > it be a good idea to do only one load and use co-processors to
> >>> > populate the other tables, instead of doing the traditional MR bulk
> >>> > load, which would require 6 separate jobs?
> >>>
> >>> Without knowing more than you've said, it seems better to use MR to
> >>> build all input.
> >>>
> >>> Best regards,
> >>>
> >>>   - Andy
> >>>
> >>> Problems worthy of attack prove their worth by hitting back. - Piet
> >>> Hein (via Tom White)
> >>

--
Sever Fundatureanu

Vrije Universiteit Amsterdam
E-mail: fundatureanu.sever@gmail.com
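[Editor's note: JM's first suggestion, splitting one large bulk load into several smaller ones, could look roughly like the sketch below. It is not tested against a cluster; the HDFS paths, chunk layout, and table name are all placeholders. The `hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <dir> <table>` invocation is the standard completebulkload entry point of this HBase era.]

```shell
# Sketch of the "smaller bulks" idea: have the MR job write its HFile
# output into several directories (chunk-0, chunk-1, ...) and bulk-load
# them one at a time, so no single region server has to absorb the
# whole load at once. All paths and the table name are placeholders.
for chunk in /user/sever/hfiles/chunk-*; do
  echo "Loading $chunk ..."
  hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
      "$chunk" my_table || exit 1   # stop on the first failure
done
```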