Subject: Re: HBase - Column family
From: Bernd Fondermann
To: user@hbase.apache.org
Date: Sat, 23 Apr 2011 11:25:47 +0200

That's how I would do it:

What's nice in HBase is that you can store all the data for one of
your keywords in a single row.

Create a column family "doc_id". Now, for each word, you create one
row. In this row, for each matching document, you create one column
(that's the gotcha compared to an RDB design). The name of the column
is the doc id. The column's cell content is the weight.

So, following your example, you'd get:

row id | column-family:column ...
HELLO  | doc_id:2 | doc_id:3 | doc_id:4

and column values:

doc_id:2 | doc_id:3 | doc_id:4
12       | 45       | 36

HTH,

  Bernd

On Sat, Apr 23, 2011 at 09:56, JohnJohnGa wrote:
> Hi, I'm a beginner in HBase. I need to design my table. I want to play with the
> following information:
>
> At the date XX-XX-XXXX, the word 'HELLO' is in document 2,3,4 and the weight of
> each doc is 12,45,36 - My raw data: doc:D title:'i like potatoes',weight:W,date:D
>
> I created a table with, row: word, column:date, value:doc But I can't store
> multiple row with the same date, for the same word.
>
> Can we create multiple column families for a table? What can be the best way to
> design the schema?
>
> Thanks a lot
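
To make the one-row-per-word layout from the reply concrete, here is a minimal in-memory sketch in Python. It is not HBase client code: the nested dict only stands in for a table, and the helper names (put_weight, docs_for_word) are made up for illustration. It also leaves the date out, since the reply does not say where that should go.

```python
from collections import defaultdict

# Hypothetical stand-in for one HBase table: row key -> {column -> value}.
# Columns are written as "family:qualifier", as in the reply.
FAMILY = "doc_id"


def put_weight(table, word, doc_id, weight):
    """Store one (word, document) weight as a single cell in the word's row."""
    table[word][f"{FAMILY}:{doc_id}"] = weight


def docs_for_word(table, word):
    """Return {doc_id: weight} for every doc_id column in the word's row."""
    row = table.get(word, {})  # .get avoids creating an empty row on lookup
    prefix = FAMILY + ":"
    return {
        int(col[len(prefix):]): weight
        for col, weight in row.items()
        if col.startswith(prefix)
    }


table = defaultdict(dict)

# The example from the thread: 'HELLO' appears in documents 2, 3, 4
# with weights 12, 45, 36.
for doc_id, weight in [(2, 12), (3, 45), (4, 36)]:
    put_weight(table, "HELLO", doc_id, weight)

print(table["HELLO"])
print(docs_for_word(table, "HELLO"))
```

The point of the sketch is that a new matching document only adds one more column to the existing row for that word, and reading every document for a word is a lookup of a single row.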