From: Pavan Sudheendra
Date: Mon, 26 Aug 2013 14:50:06 +0530
Subject: Re: Input split for a HBase of 80,000 rows?
To: ashwanthkumar
Cc: user@hbase.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11c3092046c3a404e4d6444a

--001a11c3092046c3a404e4d6444a
Content-Type: text/plain; charset=ISO-8859-1

Furthermore, what can we do if a table has 25 online regions? Can we safely
set caching to a bigger number? Is a split necessary as well?


On Mon, Aug 26, 2013 at 2:42 PM, Pavan Sudheendra wrote:

> Hi Ashwanth, thanks for the reply..
>
> I went to the HBase Web UI and saw that my table had 1 Online Regions..
> Can you please guide me as to how to do the split on this table? I see the
> UI asking for a region key and a split button... How many splits can i make
> exactly? Can i give two different 'keys' and assume that the table is now
> split into 3? One from beginning to key1, key1 to key2 and key2 to the rest?
>
>
> On Mon, Aug 26, 2013 at 2:36 PM, Ashwanth Kumar <
> ashwanthkumar@googlemail.com> wrote:
>
>> setCaching is setting the value via API, other way is to set it in the
>> job configuration using the Key "hbase.client.scanner.caching".
>>
>> I just realized, given that you have just 1 region Caching wouldn't help
>> much in reducing the time. Splitting might be an ideal solution. Based on
>> your Heap space for every Mapper task try playing with that 1500 value.
>>
>> Word of caution, if you increase it too much, you might see
>> ScannerTimeoutException in your TT Logs.
>>
>>
>> On Mon, Aug 26, 2013 at 2:29 PM, Pavan Sudheendra wrote:
>>
>>> Hi Ashwanth,
>>> My caching is set to 1500 ..
>>>
>>> scan.setCaching(1500);
>>> scan.setCacheBlocks(false);
>>>
>>> Can i set the number of splits via an API?
>>>
>>>
>>> On Mon, Aug 26, 2013 at 2:22 PM, Ashwanth Kumar <
>>> ashwanthkumar@googlemail.com> wrote:
>>>
>>>> To answer your question - Go to HBase Web UI and you can initiate a
>>>> manual split on the table.
>>>>
>>>> But, before you do that. May be you can try increasing your client
>>>> caching value (hbase.client.scanner.caching) in your Job.
>>>>
>>>>
>>>> On Mon, Aug 26, 2013 at 2:09 PM, Pavan Sudheendra wrote:
>>>>
>>>> > What is the input split of the HBase Table in this job status?
>>>> >
>>>> > map() completion: 0.0
>>>> > reduce() completion: 0.0
>>>> > Counters: 24
>>>> >     File System Counters
>>>> >         FILE: Number of bytes read=0
>>>> >         FILE: Number of bytes written=216030
>>>> >         FILE: Number of read operations=0
>>>> >         FILE: Number of large read operations=0
>>>> >         FILE: Number of write operations=0
>>>> >         HDFS: Number of bytes read=116
>>>> >         HDFS: Number of bytes written=0
>>>> >         HDFS: Number of read operations=1
>>>> >         HDFS: Number of large read operations=0
>>>> >         HDFS: Number of write operations=0
>>>> >     Job Counters
>>>> >         Launched map tasks=1
>>>> >         Data-local map tasks=1
>>>> >         Total time spent by all maps in occupied slots (ms)=3332
>>>> >     Map-Reduce Framework
>>>> >         Map input records=45570
>>>> >         Map output records=45569
>>>> >         Map output bytes=4682237
>>>> >         Input split bytes=116
>>>> >         Combine input records=0
>>>> >         Combine output records=0
>>>> >         Spilled Records=0
>>>> >         CPU time spent (ms)=1142950
>>>> >         Physical memory (bytes) snapshot=475811840
>>>> >         Virtual memory (bytes) snapshot=1262202880
>>>> >         Total committed heap usage (bytes)=370343936
>>>> >
>>>> >
>>>> > My table has 80,000 rows..
>>>> > Is there any way to increase the number of input splits since it takes
>>>> > nearly 30 mins for the map tasks to complete.. very unusual.
>>>> >
>>>> > --
>>>> > Regards-
>>>> > Pavan
>>>> >
>>>>
>>>> --
>>>>
>>>> Ashwanth Kumar / ashwanthkumar.in
>>>>
>>>
>>> --
>>> Regards-
>>> Pavan
>>>
>>
>> --
>>
>> Ashwanth Kumar / ashwanthkumar.in
>>
>
> --
> Regards-
> Pavan
>

--
Regards-
Pavan

--001a11c3092046c3a404e4d6444a--
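
A rough sketch of the "two keys, three regions" question from the thread, in plain Java with no HBase dependency. It only models the bookkeeping: the class and method names (RegionSplitSketch, regionRanges, scannerRoundTrips) are hypothetical and are not part of the HBase API. It assumes one map task per region, which is how TableInputFormat behaves by default, and it treats each scanner round trip as returning at most `caching` rows, so the round-trip figure is approximate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: models how split keys partition a table's row-key
// space into regions. It does not call HBase; all names here are hypothetical.
class RegionSplitSketch {

    // Returns one "[start, end)" range per region for the given split keys.
    // An empty string stands for the open start or end of the table.
    // Note: String ordering matches HBase's byte ordering only for ASCII keys.
    static List<String> regionRanges(String... splitKeys) {
        String[] keys = splitKeys.clone();
        Arrays.sort(keys); // region boundaries are ordered by row key
        List<String> ranges = new ArrayList<>();
        String start = "";
        for (String key : keys) {
            ranges.add("[" + start + ", " + key + ")");
            start = key;
        }
        ranges.add("[" + start + ", )");
        return ranges;
    }

    // Approximate number of scanner round trips needed to read `rows` rows
    // when each round trip returns at most `caching` rows (ceiling division).
    static long scannerRoundTrips(long rows, int caching) {
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        // Two split keys give three regions, hence three map tasks by default.
        List<String> ranges = regionRanges("key1", "key2");
        System.out.println(ranges.size() + " regions: " + ranges);
        // 80,000 rows at caching=1500, the figures quoted in the thread.
        System.out.println(scannerRoundTrips(80000, 1500) + " round trips");
    }
}
```

Under these assumptions, two split keys yield three ranges (start to key1, key1 to key2, key2 to end), and 80,000 rows at a caching value of 1500 need about 54 round trips. With a single region all of those round trips are made by one mapper, which is why a higher caching value alone does not add parallelism.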