From: stack
Date: Fri, 25 Jan 2008 10:42:25 -0800
To: core-user@hadoop.apache.org
Subject: Re: Region offline issues

Marc Harris wrote:
> To Bryan's points:
> ...
> 2) There does not appear to be anything else significant in the logs. I
> can send them to you if you like but I think my previous comment may
> cause you to be less interested.

Send them to me if you don't mind. I'd look at them to see what was going on in the regionserver such that the client couldn't get an update in during a run of all the retries (I'd guess it has to do with HADOOP-2712 and HADOOP-2615).

> 3) About success running on a 13 node cluster. I think that's really the
> question. Should I expect this data load to work reasonably well on a
> single node cluster or not?

I don't know about 'reasonably well'. Single-node is sub-optimal but it should be possible to load it w/ a decent amount of data w/o failures.

> To stack's points:
>
> 4) Could you explain what you mean by "forever to load"? During the
> phases it was working I would get about 100 rows per second, which was
> sufficient for me. Also could you explain why setting up a mapreduce job
> would make things more efficient in a single server setup? Are things
> not limited by disk access either way?

Pardon me. I presumed multiple cores and was suggesting MR as one means of putting up multiple concurrent upload clients. Yeah, disk is a bottleneck.

> 5) When a regionserver judges itself overloaded and blocks updates, can
> another regionserver take up the load for all subsequent updates, or do
> certain updates (based on row key presumably) have to go to that
> regionserver?

The latter.

St.Ack
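
To illustrate the answer to 5), here is a rough sketch of how a row key maps to exactly one regionserver. It is not HBase's client code: the region table, the server names, and server_for_row are all made up for the example, and it assumes each region covers a contiguous, sorted range of row keys.

```python
from bisect import bisect_right

# Hypothetical region table: (start_key, server), sorted by start_key.
# The first region starts at the empty key so every row key lands in a region.
REGIONS = [
    ("", "regionserver-a"),
    ("g", "regionserver-b"),
    ("p", "regionserver-a"),
]


def server_for_row(row_key, regions=REGIONS):
    """Return the server hosting the region whose key range contains row_key."""
    start_keys = [start for start, _ in regions]
    # The owning region is the rightmost one whose start key is <= row_key.
    index = bisect_right(start_keys, row_key) - 1
    return regions[index][1]


if __name__ == "__main__":
    for key in ("apple", "g", "zebra"):
        print(key, "->", server_for_row(key))
```

Under these assumptions, an update for a given row always resolves to the same server, so if that server is blocking updates the client has to wait and retry; another regionserver cannot accept the write in its place.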