hbase-user mailing list archives

From Mikael Sitruk <mikael.sit...@gmail.com>
Subject Re: Hbase schema design help
Date Tue, 14 Feb 2012 14:33:52 GMT
Why don't you prefix the column qualifiers with the execution date (in
reverse order, so that the latest execution comes first)?
That is:
email id (row key) - (columns) appName:reportName,
appName:<executionDate>_startDate, appName:<executionDate>_endDate,
appName:<executionDate>_status

So all executions for a specific user are in the same row.
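
A minimal sketch of how such reversed qualifiers could be built, in plain
Python with no HBase client involved. The helper names (reversed_ts,
execution_columns) and the choice of subtracting the epoch milliseconds from
Long.MAX_VALUE are assumptions made for illustration; any fixed-width value
that decreases as time increases would sort the same way.

```python
# Upper bound used to reverse the timestamp (Java's Long.MAX_VALUE).
# This is an illustrative choice, not something the schema requires.
MAX_MILLIS = 2**63 - 1


def reversed_ts(epoch_millis):
    """Return a fixed-width string that sorts newest-first.

    Zero-padding to 19 digits keeps lexicographic order equal to
    numeric order, which is what matters for sorted qualifiers.
    """
    return f"{MAX_MILLIS - epoch_millis:019d}"


def execution_columns(epoch_millis, start_date, end_date, status):
    """Build the per-execution qualifiers for one report run.

    All three qualifiers share the same reversed-timestamp prefix,
    so the columns of one execution stay adjacent within the row.
    """
    prefix = reversed_ts(epoch_millis)
    return {
        f"{prefix}_startDate": start_date,
        f"{prefix}_endDate": end_date,
        f"{prefix}_status": status,
    }
```

Because the prefix shrinks as the execution time grows, the qualifiers of
the most recent run sort ahead of those of older runs in the same row.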

Or you can use a different pattern (growing in rows rather than in columns),
with a slightly different key:
<email id><executionDate> (row key) - (columns) appName:reportName,
appName:startDate, appName:endDate, appName:status

But in this case you need a scan (rather than a single get) to retrieve all
executions for a specific email id.
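
A rough sketch of what that scan amounts to, simulated over a sorted list of
keys in plain Python rather than against a real HBase table. The zero-byte
separator between the email id and the execution date is an assumption added
here (it keeps "a@x.com" from also matching "a@x.com.au"), and the helper
names are hypothetical.

```python
import bisect

# Hypothetical separator between email id and execution date.
# A zero byte sorts before any printable character in an email address.
SEP = b"\x00"


def row_key(email, execution_date):
    """Compose the row key <email id><separator><executionDate> as bytes."""
    return email.encode("utf-8") + SEP + execution_date.encode("ascii")


def prefix_range(prefix):
    """Return (start, stop) bounds covering every key that begins with prefix.

    start is inclusive, stop is exclusive. The stop bound is the prefix with
    its last byte incremented; this assumes the last byte is not 0xff, which
    holds here because the prefix always ends with SEP.
    """
    stop = prefix[:-1] + bytes([prefix[-1] + 1])
    return prefix, stop


def scan_user(sorted_keys, email):
    """Return all row keys for one email id from a sorted list of keys."""
    start, stop = prefix_range(email.encode("utf-8") + SEP)
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]
```

Since rows are kept sorted by key, all executions of one user are contiguous,
so the scan only touches the range between the two bounds.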

BTW: do you think your case is really a NoSQL problem (I mean in terms of
volume)?

Mikael.S

On Tue, Feb 14, 2012 at 2:17 PM, Monish r <monishsvce@gmail.com> wrote:

> Hi,
>
> You can set the max versions for that table to Integer.MAX_VALUE, so that
> the records are identified uniquely by the timestamp (in milliseconds) at
> which they were inserted. In HBase every cell in the table is indexed, so
> if you have a large number of columns, you can store them as a concatenated
> string (delimited, of course) in one column for better write performance.
>
> Just a thought.
>
> - Monish
>
> On Tue, Feb 14, 2012 at 2:12 PM, Raj N <objectlinks@gmail.com> wrote:
>
> > Hi All,
> >
> > I am new to the NoSQL world and need help designing an HBase schema for
> > the requirement below.
> >
> > It is a report generation application using Hadoop. I want to store a
> > particular user's report history in HBase. The user's email id will be
> > used to track the history of all reports he has previously run. So the
> > entities to be persisted are: email id, report name, start date, end date
> > and status. I am planning to create the schema as follows:
> >
> > email id (row key) - (columns) appName:reportName, appName:startDate,
> > appName:endDate, appName:status
> >
> > Queries will be performed using the email id (but I am OK with any other
> > options). The problem is that if the same user runs the same report again
> > with a different date range, it will overwrite the start date, end date
> > and status columns.
> >
> > What is the right way to design the schema in this situation? Any help
> > would be greatly appreciated.
> >
> > Thanks in advance.
> >
> > -Raj
> >
>



