From: Mark
Date: Sun, 21 Aug 2011 08:59:11 -0700
To: user@hbase.apache.org
Subject: Number of tables

We are logging all user actions into HBase. These actions include searches, product views and clicks. We are currently storing them in one table with row keys like so: "#{type}/#{user}/#{time}", where type is one of click, search or view, and user is the currently logged-in user. Obviously this leads to region hot spotting, as the start of each key is fairly static.

This got me thinking about alternative ways I could model this type of data, and I was hoping I could get some suggestions from the community. Which would be more advisable?

1) Keep the current pattern described above, where all logs go to one table.

2) Keep the current one-table pattern, but switch the type and user fields, which would lead to more randomized keys and thus fewer hot spots.

3) Create a separate table for each type of log we are saving, i.e. a search table, a click table and a view table.

Our use case does not require searching across multiple types, so I'm leaning towards #3 now, but I was wondering if there are any cons to this method. Is it worse to have more tables than fewer?

Thanks for the help,
-M
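
To make the three layouts concrete, here is a minimal Python sketch of the keys each option would produce. It does not talk to HBase; it only builds the key strings and counts how many keys share the same leading field. The sample events, the function names and the "<type>_log" table names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample events as (type, user, time); the values are made up.
EVENTS = [
    ("click", "alice", 1313942351),
    ("search", "bob", 1313942352),
    ("view", "carol", 1313942353),
    ("click", "dave", 1313942354),
    ("search", "alice", 1313942355),
    ("click", "bob", 1313942356),
]


def type_first_key(event_type, user, time):
    """Option 1: the current layout, type/user/time."""
    return f"{event_type}/{user}/{time}"


def user_first_key(event_type, user, time):
    """Option 2: the same fields with type and user swapped."""
    return f"{user}/{event_type}/{time}"


def per_type_table_key(event_type, user, time):
    """Option 3: one table per type, so the key no longer carries the type.

    Returns (table_name, row_key). The '<type>_log' naming is an assumption.
    """
    return f"{event_type}_log", f"{user}/{time}"


def leading_field_counts(keys):
    """Count keys per leading field (the part before the first '/')."""
    return Counter(key.split("/", 1)[0] for key in keys)


option1 = [type_first_key(*event) for event in EVENTS]
option2 = [user_first_key(*event) for event in EVENTS]

# Option 1 can never have more than three distinct leading fields, one per type.
print(leading_field_counts(option1))
# Option 2 has one distinct leading field per user.
print(leading_field_counts(option2))
```

Because HBase stores rows sorted by key, keys that share a leading field sit next to each other. With option 1 there are only three such groups no matter how many users exist, while with option 2 the number of groups grows with the number of users.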