From: Eric Czech
Date: Tue, 7 Aug 2012 09:35:16 -0400
Subject: Ideal row size
To: user

Hello everyone,

I'm trying to store many small values in indexes created via MR jobs, and I was hoping to get some advice on how to structure my rows. Essentially, I have complete control over how large the rows should be, as the values are small, consistent in size, and can be grouped together in any way I'd like.

My question, then, is: what's the ideal size for a row in HBase, in bytes? I'm trying to determine how to group my values together into larger values, and I think having a target size to hit would make that a lot easier.

I know that fewer rows are generally better, since that avoids the repetitive storage of keys, column families, and qualifiers, provided that those rows still suit a given application. What I'm not sure about is at what point the scale tips in the other direction and I start to see undue memory pressure or compaction issues with rows that are too large.

Thanks in advance!
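
A rough sketch of the key-overhead trade-off described above, for illustration only. It assumes the classic KeyValue layout, in which every cell stores 20 fixed bytes (key length, value length, row length, family length, timestamp, type) plus the row key, family, and qualifier. It ignores tags, compression, and block encoding, and the key and value sizes used are hypothetical. It says nothing about where large rows start to hurt memory or compactions.

```python
# Fixed per-cell bytes in the classic KeyValue layout (assumption: no tags,
# no compression, no data block encoding):
#   4 (key length) + 4 (value length) + 2 (row length)
#   + 1 (family length) + 8 (timestamp) + 1 (type)
KV_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1  # 20 bytes


def cell_size(row_len, family_len, qualifier_len, value_len):
    """Approximate serialized size in bytes of a single cell."""
    return KV_FIXED_OVERHEAD + row_len + family_len + qualifier_len + value_len


def overhead_fraction(row_len, family_len, qualifier_len, value_size, values_per_cell):
    """Fraction of a cell's bytes that is not payload when
    `values_per_cell` small values are packed into one cell value."""
    payload = value_size * values_per_cell
    total = cell_size(row_len, family_len, qualifier_len, payload)
    return (total - payload) / total


if __name__ == "__main__":
    # Hypothetical sizes: 16-byte row key, 1-byte family,
    # 1-byte qualifier, 8-byte values.
    for n in (1, 10, 100, 1000):
        frac = overhead_fraction(16, 1, 1, 8, n)
        print(f"{n:>5} values per cell: {frac:.1%} key overhead")
```

Under these assumptions the key overhead falls quickly as more values are packed into a cell, so most of the storage saving is reached at modest group sizes. Any further growth in row size would then have to be justified by something other than storage.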