Date: Mon, 13 Jul 2009 16:06:37 +0530
Subject: help needed with base schema
From: Piyush Goel
To: hbase-user@hadoop.apache.org

Hi,

I am trying to design a high-scale key-value storage system. The HBase
table for it is outlined below:

{
  "userid1" : {
    "update" : {
      t3 : "some update1",
      t2 : "some update2",
      t1 : "some update3"
    },
    "sender" : {
      t3 : "sender3",
      t2 : "sender2",
      t1 : "sender1"
    }
  },

  "userid2" : {
    "update" : {
      t9 : "some update9",
      t6 : "some update534",
      t1 : "some update343"
    },
    "sender" : {
      t9 : "sender3",
      t6 : "sender2",
      t1 : "sender1"
    }
  }
}

The system is going to have around 15-20M users, with around 3-4M put
(write) operations per day (which rules out MySQL automatically). The
maximum number of entries in the "update" and "sender" columns will be
around 1000 per user (about one week's worth of updates).

My queries would look like: "For a given userid, return the top 20 updates
and their senders, based on timestamp." Is there a way to build a secondary
index on "userid, timestamp" that would help speed up my "get" calls? Or how
can I change my schema design to minimize the response time for get calls?

thanks,
piyush
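
P.S. To make the access pattern concrete, here is a minimal sketch of the query I have in mind. It is plain Python over an in-memory dictionary, not HBase client code; the function name `latest_updates`, the integer timestamps, and the reading of "top 20" as "20 most recent" are assumptions made only for illustration.

```python
import heapq

# In-memory stand-in for the table: userid -> column -> {timestamp: value}.
# Timestamps are plain integers here (an assumption for illustration).
table = {
    "userid1": {
        "update": {3: "some update1", 2: "some update2", 1: "some update3"},
        "sender": {3: "sender3", 2: "sender2", 1: "sender1"},
    },
    "userid2": {
        "update": {9: "some update9", 6: "some update534", 1: "some update343"},
        "sender": {9: "sender3", 6: "sender2", 1: "sender1"},
    },
}


def latest_updates(table, userid, n=20):
    """Return up to n (timestamp, update, sender) tuples, newest first."""
    row = table.get(userid, {})
    updates = row.get("update", {})
    senders = row.get("sender", {})
    # nlargest iterates over the dict keys, i.e. the timestamps.
    newest = heapq.nlargest(n, updates)
    return [(ts, updates[ts], senders.get(ts)) for ts in newest]


if __name__ == "__main__":
    for entry in latest_updates(table, "userid1", n=2):
        print(entry)
```

Each user holds at most about 1000 entries, so the per-user selection itself is cheap. What I want to minimize is the cost of fetching those entries from the store.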