hbase-user mailing list archives

From "Jean-Daniel Cryans" <jdcry...@gmail.com>
Subject Re: Table Updates with Map/Reduce
Date Sat, 19 Jul 2008 03:29:25 GMT
Brian (guessing it's your name from your email address),

Please be more specific about your table design. For example, "column" is a
vague word in HBase, since it may refer to a column family or to a column key
inside a column family. Also, what kind of load do you expect to have?

Answering these questions may also help you understand HBase.
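
To make the family/key distinction concrete, here is a minimal sketch in plain
Python (not HBase client code). It assumes the "family:qualifier" form of a full
column name; the helper split_column and the names info:created,
events:click_42 and events:view_7 are hypothetical, chosen only for
illustration.

```python
def split_column(name):
    """Split a full column name 'family:qualifier' into (family, qualifier)."""
    family, sep, qualifier = name.partition(":")
    if not sep or not family:
        raise ValueError("expected 'family:qualifier', got %r" % (name,))
    return family, qualifier


# One hypothetical row: a creation timestamp plus one column per event.
row = {
    "info:created": "2008-07-18T16:41:00",
    "events:click_42": "1",
    "events:view_7": "1",
}

# Group the cells by column family to show the two levels of naming.
by_family = {}
for name, value in row.items():
    family, qualifier = split_column(name)
    by_family.setdefault(family, {})[qualifier] = value
```

In this sketch the row has two column families (info and events) but three
column keys, so saying "10 to 1000 columns" could mean either level.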



On Fri, Jul 18, 2008 at 4:41 PM, imbmay <brian@media6degrees.com> wrote:

> I want to use hbase to maintain a very large dataset which needs to be
> updated pretty much continuously.  I'm creating a record for each entity
> and including a creation timestamp column as well as between 10 and 1000
> additional columns named for distinct events related to the record entity.
> Being new to hbase, the approach I've taken is to create a map/reduce app
> that, for each input record:
> - Does a lookup in the table using HTable get(row, column) on the timestamp
>   column to determine if there is an existing row for the entity.
> - If there is no existing record for the entity, adds the event history for
>   the entity to the table, with one column per unique event id.
> - If there is an existing record for the entity, adds just the most recent
>   event to the table.
> I'd like feedback as to whether this is a reasonable approach in terms of
> general performance and reliability or if there is a different pattern
> better suited to hbase with map/reduce or if I should even be using
> map/reduce for this.
> Thanks in advance.
> --
> View this message in context:
> http://www.nabble.com/Table-Updates-with-Map-Reduce-tp18537368p18537368.html
> Sent from the HBase User mailing list archive at Nabble.com.
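
For reference, the per-record flow described in the quoted message can be
sketched as below. This is plain Python with a dict standing in for the table,
not HBase or map/reduce code; the function apply_record, the column names
info:created and events:<id>, and the assumption that events are ordered oldest
to newest are all hypothetical.

```python
def apply_record(table, entity_id, created, events):
    """Apply one input record; return the number of cells written.

    table  : dict mapping row key -> {column name: value} (stand-in for a table)
    events : list of event ids, assumed ordered oldest to newest
    """
    row = table.get(entity_id)

    # Step 1: look up the timestamp column to see if the entity already exists.
    if row is None or "info:created" not in row:
        # Step 2: no existing record, so write the full event history,
        # one column per unique event id.
        row = {"info:created": created}
        for event_id in events:
            row["events:" + event_id] = "1"
        table[entity_id] = row
        return len(row)

    # Step 3: existing record, so add only the most recent event.
    if not events:
        return 0
    row["events:" + events[-1]] = "1"
    return 1


table = {}
first = apply_record(table, "entity-1", "2008-07-18", ["a", "b"])
second = apply_record(table, "entity-1", "2008-07-18", ["a", "b", "c"])
```

Note that every input record costs one read before its write in this sketch,
which is one reason the expected load matters.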
