hadoop-common-dev mailing list archives

From "Michael Cafarella" <michael.cafare...@gmail.com>
Subject Re: HBase Design Ideas, Part II
Date Mon, 05 Jun 2006 07:54:01 GMT
Arkady,

Sorry for the delay in responding to your message.

I think the proposal would handle your log-reporting problem quite
well.  My current plan for iterators that process a lot of rows (instead
of querying for them, or updating just a few) is to ask the user to
implement a MapReduce-like iterator.  The iterator is submitted
to the system, and its process() method is invoked for every tuple that
passes the query criteria.

Imagine the user implements an interface like this:

class UserIterator implements HDataIterator {
  public void process(String row, String col, String attr) {
    // called once for each tuple that matches the query
  }
}

and submits it using a query interface like this:

boolean submitIterator(HDataIterator it, HQuery query);

where HQuery describes the rows for which HDataIterator.process()
will be invoked.  The user class can store state across invocations,
making it quite easy to compute sums/averages for log summaries.  I
think this is a better way to do aggregates than a fancy query language,
at least until the system and requirements are better understood.
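
For the log case that could look something like the sketch below.  Only
HDataIterator, HQuery, and submitIterator come from the proposal above;
the client handle, the HQuery constructor arguments, and the column name
are made up for illustration.

class DailyTotalIterator implements HDataIterator {
  long total = 0;   // state kept across process() calls

  public void process(String row, String col, String attr) {
    // assuming the third argument carries the cell value as a string
    total += Long.parseLong(attr);
  }
}

// invoked once for every tuple matching the day's hit-count column
boolean ok = client.submitIterator(new DailyTotalIterator(),
                                   new HQuery("stats:hits-2006-06-04"));

How the accumulated state gets back to the client (a result row, a
callback, something else?) is still an open question.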

BTW, for a completely different look at how to do log processing,
you might take a look at the following quite interesting paper:

PADX: Querying Large-Scale Ad Hoc Data with XQuery.
M. Fernandez, K. Fisher, Y. Mandelbaum.  Submitted to PLAN-X, 2006.
http://www.cs.princeton.edu/~yitzhakm/publications/padx_planx.pdf

Best,
--Mike


On 5/19/06, Arkady Borkovsky <arkady@inktomi.com> wrote:
>
>
> Some potential applications of HBase-like functionality are
> batch-oriented: select (a large) subset of records, process a subset of
> their columns (often a small subset), and either add new columns or
> new versions of existing columns.
> Several applications like this may be running on the same "table"
> simultaneously.
>
> Think, for example, about query log processing, where the key is a
> search query, and the daily stats are represented as columns (several
> columns per day), plus columns for data aggregated by week, by month,
> etc.
>
> How well does the proposal address this kind of use?
>
> -- ab
>
> On May 15, 2006, at 9:54 PM, Michael Cafarella wrote:
>
> > Hi everyone,
> >
> > My previous mail mentioned a bunch of design ideas that were mainly
> > lifted from Jeff Dean's BigTable talk.  BigTable seems like a useful
> > way to do large-scale row storage, and their decisions largely seem
> > like the right ones.
> >
> > However, BigTable still leaves some things on the table.  Items to
> > improve include a query language and multi-row locking, among
> > other things.
> >
> > Dean said explicitly in his talk that they wanted to avoid multirow
> > locking because it's complicated, error-prone, and maybe not necessary.
> > He's right on at least the first two, and maybe the third.
> >
> > Multiple row locks are useful when you're making a change to
> > several rows that should be atomic; you want all the changes
> > or none of the changes.  It's also used in traditional databases
> > if you want to perform an expensive read operation (like a
> > multiway join) and you want to make sure the results don't
> > get modified while you're reading.
> >
> > Distributed lock acquisition is very hard to do.  It's bug-prone
> > and often has very weird performance ramifications.  It's
> > difficult to get working, difficult to tune, difficult to everything.
> >
> > Here are a few ideas on what to do:
> > 1)  Suck it up and have the client acquire locks on multiple
> > HRegionServers simultaneously.  All clients would have to
> > agree to acquire locks according to some global ordering to
> > avoid deadlock.  HRegions would not be allowed to migrate
> > to a new server if locked.
> >
> > If this is a rare circumstance, a better approach would be
> > to have a dedicated "lock acquirer" through which clients
> > make requests.  It doesn't help the theoretical problem here,
> > but it would make debugging an awful lot easier.
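> >
> > Just to make the ordering in (1) concrete, client-side acquisition
> > might look roughly like the following.  lockRow(), unlockRow(), and
> > regionServerFor() are invented names; nothing below exists yet:
> >
> > void lockRows(List<String> rows) throws IOException {
> >   Collections.sort(rows);          // same global order for every client
> >   List<String> held = new ArrayList<String>();
> >   try {
> >     for (String row : rows) {
> >       regionServerFor(row).lockRow(row);     // hypothetical RPC
> >       held.add(row);
> >     }
> >   } catch (IOException e) {
> >     Collections.reverse(held);               // back out in reverse order
> >     for (String row : held) {
> >       regionServerFor(row).unlockRow(row);   // hypothetical RPC
> >     }
> >     throw e;
> >   }
> > }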
> >
> > 2)  In the case of long-lasting read operations, we can
> > use versioning to guarantee consistency.  If each row is
> > annotated with an edit timestamp, and we know that there
> > is sufficient version history available, the long-lasting job
> > can run over a specific version only.
> >
> > Edits can continue to be made to the database while the
> > read-only job is ongoing.  The operation is performed over
> > the database as of the time the task was submitted.
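> >
> > In API terms that might be as simple as pinning a timestamp when the
> > job is submitted (a sketch; neither call exists yet):
> >
> > long snapshot = client.currentEditTimestamp();     // hypothetical call
> > // every read in the job then asks for values as of that timestamp
> > byte[] value = client.get(row, column, snapshot);  // hypothetical call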
> >
> > 3) In the case of multiple row updates, we may be able to
> > use different edit semantics to avoid locking.  For example,
> > consider that we want to add a single column/value pair to
> > multiple rows.  We want this to happen atomically, so that
> > all the rows get the value or none of them do.
> >
> > If it's just an add, then we don't need to lock the rows at
> > all; the add will always succeed, even if other writes
> > intervene.  Traditionally there's been no distinction among
> > data "updates", so they all require locking.  If we
> > can get a client to adjust the update semantics slightly,
> > then the locking can be much more relaxed.
> >
> > I'd say that "add" or "append" semantics are likely to be
> > at least as common as "edit" semantics.
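> >
> > Concretely, the two kinds of write might look like this (a sketch;
> > none of these calls exist yet):
> >
> > // "append" semantics: commutes with concurrent writers, no lock needed
> > client.append(rowA, "anchors:referrer", value);    // hypothetical call
> > client.append(rowB, "anchors:referrer", value);    // hypothetical call
> >
> > // "edit" semantics: read-modify-write, only safe while holding a lock
> > byte[] old = client.get(rowA, "stats:total");      // hypothetical call
> > client.put(rowA, "stats:total", recompute(old));   // recompute() is made up too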
> >
> > Can you think of the family of edit semantics you'd like
> > to see offered here?
> >
> > Also, how useful do you think a general-purpose query language
> > would be for HBase?  It would be fairly straightforward to implement,
> > for example, a poor man's version of SQL that has different locking
> > and update behavior (and which chucks out the more exotic elements).
> > This might be compiled into a piece of code that is executed
> > immediately, or it might be transformed into a long-lasting mapreduce
> > job.
> >
> > I have a few ideas for such a language, but I'm worried it's getting
> > a little far afield from what we're interested in for Hadoop.
> >
> > --Mike
>
>
