Mailing-List: contact hbase-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hbase-user@hadoop.apache.org
MIME-Version: 1.0
In-Reply-To: <7c457ebe1001202209i320cfde6sfb0b6cc881aaaf8@mail.gmail.com>
References: <27252203.post@talk.nabble.com>
	 <7c457ebe1001202209i320cfde6sfb0b6cc881aaaf8@mail.gmail.com>
Date: Thu, 21 Jan 2010 11:17:35 -0500
Message-ID: <cbbf4b571001210817t178baaa0xcf6b5f4c21c58e4d@mail.gmail.com>
Subject: Re: learning hbase - schema design advice
From: Edward Capriolo <edlinuxguru@gmail.com>
To: hbase-user@hadoop.apache.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On Thu, Jan 21, 2010 at 1:09 AM, Dan Washusen <dan@reactive.org> wrote:
> Have you read the bigtable paper linked off the front page of HBase? It
> does a good job of explaining the concepts. Basically it's a distributed
> sorted map (think java.util.NavigableMap but split over many machines). If
> you know the key of the row you are looking for HBase can fetch it very
> quickly. If you don't know the key you'll have to resort to scanning all
> the rows to find the data you are interested in (just like a SQL query that
> can't take advantage of an index)...
>
> Do the queries need to immediately reflect any writes or is it sufficient
> for them to become eventually consistent? If you can live with eventual
> consistency then you could write some map reduce jobs that duplicate a
> master table into reporting tables (like you would for data
> warehousing/reporting on an RDBMS).
>
> I'm sure some of the more experienced users will have more insight but that
> might get you started...
>
> Cheers,
> Dan
>
> p.s. bold text doesn't seem to come through the mailing list...
>
> 2010/1/21 canucks <anhlon@gmail.com>
>
>>
>> Hi,
>>
>> i'm pretty interested in learning hbase. what i want to do is store
>> financial data for analytical/graphing/displaying purposes. there are
>> hundreds of millions of rows and of course, i want fast response when
>> retrieving the data.
>>
>> if i were to do it in an RDBMS it would be
>> REPORT, MARKET, OPERATING_DATE, OPERATING_INTERVAL, HOUR_ENDING
>> VALUE
>> where the bolded column names are the PK. if i were to store this in
>> hbase would it look like this?
>>
>> REPORT.MARKET.OPERATING_DATE.OPERATING_INTERVAL.HOUR_ENDING.TIMESTAMP{
>>         VALUE: 92.29
>> }
>>
>> so that i can do queries like below:
>> - give me all reports with the name of "ABC"
>> - give me all the values where OPERATING_DATE is from jan-01-2010 to
>> jan-10-2010
>> - give me all the values where OPERATING_DATE is from jan-01-2010 to
>> jan-10-2010 and HOUR_ENDING is between 5 and 10 (or simply 5 or variations
>> thereof)
>>
>> in short, is hbase the wrong way to go about it or would it yield better
>> performance? also, do you folks happen to know any good links/articles on
>> hbase table & schema?
>>
>> thanks
>> --
>> View this message in context:
>> http://old.nabble.com/learning-hbase---schema-design-advice-tp27252203p27252203.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
>
I went looking for a paper on "how to convert my RDBMS mindset to a
key-value store mindset". Here is something that got me started:

http://s-expressions.com/2009/03/08/hbase-on-designing-schemas-for-column-oriented-data-stores/
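
To make the "distributed sorted map" description quoted above concrete, here
is a rough sketch using a plain java.util.TreeMap on one machine. It is not
HBase client code. The key layout (REPORT.MARKET.DATE.HOUR, with
OPERATING_INTERVAL and TIMESTAMP left out for brevity), the "." separator,
the yyyy-MM-dd date format, the market name "EAST", the class and method
names, and every value except 92.29 are assumptions made up for illustration.
The point is only that a prefix or a range over the leading key fields is a
contiguous slice of a sorted map, while a filter on a trailing field
(HOUR_ENDING here) has to be applied to the rows the slice returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class RowKeySketch {
    // Hypothetical separator; assumes it never appears inside a key field.
    static final char SEP = '.';

    // Build a composite key. The hour is zero-padded so that string order
    // matches numeric order; dates are assumed to be yyyy-MM-dd.
    static String rowKey(String report, String market, String date, int hourEnding) {
        return report + SEP + market + SEP + date + SEP
                + String.format(Locale.ROOT, "%02d", hourEnding);
    }

    // All rows whose key starts with the prefix: one contiguous slice.
    static NavigableMap<String, Double> prefixScan(
            NavigableMap<String, Double> table, String prefix) {
        return table.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    // Rows for one report and market with fromDate <= date <= toDate.
    static NavigableMap<String, Double> dateRange(
            NavigableMap<String, Double> table, String report, String market,
            String fromDate, String toDate) {
        String base = report + SEP + market + SEP;
        return table.subMap(base + fromDate, true,
                base + toDate + Character.MAX_VALUE, false);
    }

    // HOUR_ENDING is the last key field, so it cannot narrow the slice;
    // it is checked row by row on whatever the slice returned.
    static List<Double> valuesWithHourBetween(
            NavigableMap<String, Double> rows, int lo, int hi) {
        List<Double> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : rows.entrySet()) {
            String key = e.getKey();
            int hour = Integer.parseInt(key.substring(key.lastIndexOf(SEP) + 1));
            if (hour >= lo && hour <= hi) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<String, Double> table = new TreeMap<>();
        // Sample data; only 92.29 comes from the original question.
        table.put(rowKey("ABC", "EAST", "2010-01-01", 5), 92.29);
        table.put(rowKey("ABC", "EAST", "2010-01-05", 12), 88.10);
        table.put(rowKey("ABC", "EAST", "2010-01-20", 7), 75.00);
        table.put(rowKey("XYZ", "EAST", "2010-01-03", 5), 60.50);

        // "give me all reports with the name of ABC"
        System.out.println(prefixScan(table, "ABC" + SEP));

        // "OPERATING_DATE from jan-01-2010 to jan-10-2010", for one report/market
        NavigableMap<String, Double> jan =
                dateRange(table, "ABC", "EAST", "2010-01-01", "2010-01-10");
        System.out.println(jan);

        // "... and HOUR_ENDING is between 5 and 10"
        System.out.println(valuesWithHourBetween(jan, 5, 10));
    }
}
```

With this layout, a date range across all reports is not a single slice,
because the date is not the leading field. That case would need either a
full scan or a second table keyed with the date first, along the lines of
the reporting tables suggested in the quoted reply.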