From: Ted Yu
To: user@hbase.apache.org
Date: Fri, 25 Jan 2013 16:14:02 -0800
Subject: Re: Rule of thumb: Size of data to send per RPC in a scan

Looks like HBASE-2214 'Do HBASE-1996 -- setting size to return in scan
rather than count of rows -- properly' may help you.
But that is only in 0.96.

Lars H presented some performance numbers in:
HBASE-7008 Set scanner caching to a better default, disable Nagles

where the default for "hbase.client.scanner.caching" changed to 100.

Cheers

On Fri, Jan 25, 2013 at 3:59 PM, David Koch wrote:

> Hello,
>
> Is there a rule to determine the best batch/caching combination for
> maximizing scan performance as a function of KV size and (average) number
> of columns per row key?
>
> I have 0.5kb per value (constant), an average of 10 values per row key -
> heavy tailed so some outliers have 100k KVs, around 100 million rows in the
> table. The cluster consists of 30 region servers, 24gb of RAM each, nodes
> are connecting with a 1gbit connection. I am running Map/Reduce jobs on the
> table, also with 30 task trackers.
>
> I tried:
> cache 1, no batching -> 14min
> cache 1000, batch 50 -> 11min
> cache 5000, batch 25 -> crash (timeouts)
> cache 2000, batch 25 -> 15min
>
> Job time can vary quite significantly according to whatever activity
> (compactions?) is going on in the background. Also, I cannot probe for the
> best combination indefinitely since there are actual production jobs
> queued. I did expect a larger speed-up with respect to no caching/batching
> at all - is this unjustified?
>
> In short, I am looking for some tips for making scans in a Map/Reduce
> context faster :-)
>
> Thank you,
>
> /David
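
To put rough numbers on the combinations tried above, here is a back-of-the-envelope sketch of how much data one scanner RPC would carry. It assumes that caching is the number of Results returned per RPC and that batch caps the number of KVs in each Result. It counts only the 0.5 KB values and ignores key overhead (row key, family, qualifier, timestamp). The function name `rpc_payload_bytes` and the constants are hypothetical, chosen for illustration only.

```python
# Figures taken from the question; everything else is an assumption.
KV_BYTES = 512          # 0.5 KB per value, key overhead ignored
AVG_KVS_PER_ROW = 10    # average number of values per row key
OUTLIER_KVS = 100_000   # heavy-tail rows mentioned in the question


def rpc_payload_bytes(caching, batch=None, kvs_per_row=AVG_KVS_PER_ROW,
                      kv_bytes=KV_BYTES):
    """Estimate the bytes returned by one scanner RPC.

    caching: number of Results fetched per RPC (assumed model).
    batch:   maximum KVs per Result, or None for whole rows.
    """
    # A Result never spans rows, so it holds at most one row's KVs.
    kvs_per_result = kvs_per_row if batch is None else min(batch, kvs_per_row)
    return caching * kvs_per_result * kv_bytes


# The four settings from the question: (caching, batch).
for caching, batch in [(1, None), (1000, 50), (5000, 25), (2000, 25)]:
    typical = rpc_payload_bytes(caching, batch)
    # Upper bound: every Result in the RPC comes from an outlier row.
    worst = rpc_payload_bytes(caching, batch, kvs_per_row=OUTLIER_KVS)
    print(f"caching={caching:5d} batch={batch}: "
          f"typical {typical / 1e6:.2f} MB, worst case {worst / 1e6:.2f} MB")
```

Under these assumptions, the 5000/25 setting asks for roughly 25.6 MB per RPC on typical rows and up to 64 MB when every Result is a full batch from an outlier row. With no batching, a single outlier row alone comes to about 51 MB.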