Date: Sun, 9 Oct 2011 08:44:17 -0700
Subject: Re: speeding up rowcount
From: Ted Yu
To: user@hbase.apache.org

Excellent question.

There seems to be a bug in RowCounter.

In TableInputFormat:

    if (conf.get(SCAN_CACHEDROWS) != null) {
      scan.setCaching(Integer.parseInt(conf.get(SCAN_CACHEDROWS)));
    }

But I don't see SCAN_CACHEDROWS in either TableMapReduceUtil or RowCounter.

Mind filing a bug?

On Sun, Oct 9, 2011 at 8:30 AM, Rita wrote:

> Thanks for the responses.
>
> Where do I set the high Scan cache values?
>
> On Sun, Oct 9, 2011 at 11:19 AM, Himanshu Vashishtha <hvashish@cs.ualberta.ca> wrote:
>
> > Since a MapReduce is a separate process, try with a high Scan cache value.
> >
> > http://hbase.apache.org/book.html#perf.hbase.client.caching
> >
> > Himanshu
> >
> > On Sun, Oct 9, 2011 at 9:09 AM, Ted Yu wrote:
> > > I guess your hbase.hregion.max.filesize is quite high.
> > > If possible, lower its value so that you have smaller regions.
> > >
> > > On Sun, Oct 9, 2011 at 7:50 AM, Rita wrote:
> > >
> > >> Hi,
> > >>
> > >> I have been doing a rowcount via mapreduce and it's taking about 4-5 hours to count 500 million rows in a table. I was wondering if there are any map reduce tunings I can do so it will go much faster.
> > >>
> > >> I have a 10-node cluster, each node with 8 CPUs and 64GB of memory. Any tuning advice would be much appreciated.
> > >>
> > >> --
> > >> --- Get your facts first, then you can distort them as you please.--
>
> --
> --- Get your facts first, then you can distort them as you please.--
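To make the effect of the TableInputFormat check quoted in the reply concrete, here is a rough, HBase-free sketch in plain Java. It models the same "only override caching when the key is set" branch and estimates how many scanner round trips a 500 million row count needs at different caching values. The class ScanCachingSketch, the Map standing in for the job configuration, and both helper methods are hypothetical. The key string "hbase.mapreduce.scan.cachedrows" and the fallback caching of 1 are assumptions about the HBase version in use and should be checked against your release.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: models the quoted configuration lookup without HBase on the classpath.
class ScanCachingSketch {
    // Assumed value of TableInputFormat.SCAN_CACHEDROWS; verify against your HBase version.
    static final String SCAN_CACHEDROWS = "hbase.mapreduce.scan.cachedrows";

    // Assumed scanner caching when nothing overrides it (one row per round trip).
    static final int DEFAULT_CACHING = 1;

    // Mirrors the quoted branch: caching only changes when the key is present.
    static int effectiveCaching(Map<String, String> conf) {
        String value = conf.get(SCAN_CACHEDROWS);
        return value != null ? Integer.parseInt(value) : DEFAULT_CACHING;
    }

    // Rough estimate of scanner round trips: ceil(rows / caching).
    // Ignores region boundaries and scanner open/close calls.
    static long roundTrips(long rows, int caching) {
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 500_000_000L; // table size from the original question

        // Key not set: the override branch is skipped and the default applies.
        Map<String, String> conf = new HashMap<>();
        int caching = effectiveCaching(conf);
        System.out.println("caching=" + caching + " roundTrips=" + roundTrips(rows, caching));

        // Key set: the scan would fetch 1000 rows per round trip.
        conf.put(SCAN_CACHEDROWS, "1000");
        caching = effectiveCaching(conf);
        System.out.println("caching=" + caching + " roundTrips=" + roundTrips(rows, caching));
    }
}
```

Under these assumptions the estimate drops from 500,000,000 round trips to 500,000. In a real job the corresponding knob is Scan.setCaching(int) on the Scan handed to the MapReduce job. Whether a configuration key alone reaches that Scan is exactly what the reply questions.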