Message-ID: <31a243e70807141257o5ae5f334ia0ff501d2f3daaeb@mail.gmail.com>
Date: Mon, 14 Jul 2008 15:57:50 -0400
From: "Jean-Daniel Cryans"
To: hbase-user@hadoop.apache.org
Subject: Re: Sorted columns
In-Reply-To: <7e536b1f0807141236i1c0d935bifb416175aa5f37b8@mail.gmail.com>

Marcus,

The one thing you misunderstood is that the row key is not a column. I
guess this comes from an RDBMS background ;)

The reason to store reversed URLs is to get a fast scanner. Reversed URLs
from the same domain sort next to each other, so a scan over them touches
adjacent rows. If instead you fetch 30 rows and they are distributed across
30 different machines, performance will suffer.

To search on column families, you have to build search tables using
MapReduce or use external indexes, which I guess are familiar to you.

Hope it helps,

J-D

On Mon, Jul 14, 2008 at 3:36 PM, Marcus Herou wrote:

> Hi guys.
>
> A simple question: Is only the row key sorted in HBase?
>
> What if you would like to obtain a scanner based on another column? I
> thought the "auto" sorted feature was one of the reasons you would like to
> store, for example, URLs in a reversed manner.
>
> Have I misunderstood something?
>
> We did choose HBase as our db for storage of a billion URLs, but not being
> able to search efficiently makes the choice harder...
>
> Kindly
>
> //Marcus
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> marcus.herou@tailsweep.com
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/
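
To illustrate the reversed-URL point from the reply, here is a minimal sketch in plain Python. It does not use any HBase client; a sorted list stands in for the sorted row keys. The names `reversed_url_key` and `scan_prefix`, the key layout, and the sample URLs are assumptions made up for this example, not anything taken from HBase itself.

```python
from bisect import bisect_left
from urllib.parse import urlsplit


def reversed_url_key(url: str) -> str:
    """Build a row key with the host labels reversed (hypothetical layout).

    Example: http://blogg.tailsweep.com/ -> com.tailsweep.blogg/
    """
    parts = urlsplit(url)
    host = ".".join(reversed(parts.hostname.split(".")))
    return host + parts.path


def scan_prefix(sorted_keys: list[str], prefix: str) -> list[str]:
    """Return the contiguous run of keys that start with `prefix`.

    This mimics a scanner that starts at `prefix` and stops at the first
    key that no longer matches; it relies on the keys being sorted.
    """
    start = bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]


# Sample URLs, chosen only for illustration.
urls = [
    "http://www.tailsweep.com/",
    "http://hadoop.apache.org/hbase",
    "http://blogg.tailsweep.com/",
    "http://www.apache.org/",
]

# Sorting the raw URLs interleaves the two domains.
plain_order = sorted(urls)

# Sorting the reversed keys keeps each domain in one contiguous block.
keyed_order = sorted(reversed_url_key(u) for u in urls)

# One short range scan now returns every page of a domain.
tailsweep_rows = scan_prefix(keyed_order, "com.tailsweep.")

if __name__ == "__main__":
    print(plain_order)
    print(keyed_order)
    print(tailsweep_rows)
```

With raw URLs as keys, the two tailsweep.com pages end up at opposite ends of the sorted list with the apache.org pages between them. With reversed keys they are neighbours, which is why a scan over one domain can read a single contiguous range of rows.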