hbase-dev mailing list archives

From Jesse Yates <jesse.k.ya...@gmail.com>
Subject Re: Scheduling Map Reduce Jobs
Date Mon, 23 Apr 2012 15:14:17 GMT
Counter question: why do you want to run M/R jobs to do the aggregation? You
could do this in situ with a custom aggregation coprocessor (CP). Essentially,
you would set a time span over which to aggregate a row (or possibly
multiple rows, but then you have to be sure that they are in the same
region, which means using a custom split policy or pre-splitting and
turning splitting off altogether). If you apply the CP at scan, flush, and
compaction, you should get the same behavior without all the messy IO. We
don't really have a good guide for how to do this kind of thing, but the
concept here is similar to what Accumulo does with

But to answer your original question: I don't use anything other than cron
for that kind of stuff (that's what it's there for :).
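
To make the time-span aggregation idea above a bit more concrete, here is a
minimal sketch of just the roll-up logic: values for a row are summed within
fixed time windows, and each window collapses to a single cell. This is a toy
model only. It does not use the HBase coprocessor API, and the names
WindowAggregator, Cell, and aggregate are hypothetical. In a real coprocessor,
logic along these lines would presumably run inside the scan, flush, and
compaction hooks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of per-row, per-time-window aggregation; NOT the HBase coprocessor API. */
public class WindowAggregator {

    /** One timestamped value for a row (a hypothetical stand-in for an HBase cell). */
    public static final class Cell {
        public final String row;
        public final long timestamp;
        public final long value;

        public Cell(String row, long timestamp, long value) {
            this.row = row;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Sums the values of each row within fixed windows of spanMillis.
     * Each output cell carries the start of its window as its timestamp.
     */
    public static List<Cell> aggregate(List<Cell> cells, long spanMillis) {
        if (spanMillis <= 0) {
            throw new IllegalArgumentException("spanMillis must be positive");
        }
        // row -> (window start -> running sum); TreeMaps keep the output sorted.
        Map<String, TreeMap<Long, Long>> sums = new TreeMap<>();
        for (Cell c : cells) {
            // Round the timestamp down to the start of its window.
            long windowStart = Math.floorDiv(c.timestamp, spanMillis) * spanMillis;
            sums.computeIfAbsent(c.row, r -> new TreeMap<>())
                .merge(windowStart, c.value, Long::sum);
        }
        // Emit one aggregated cell per (row, window).
        List<Cell> out = new ArrayList<>();
        for (Map.Entry<String, TreeMap<Long, Long>> rowEntry : sums.entrySet()) {
            for (Map.Entry<Long, Long> window : rowEntry.getValue().entrySet()) {
                out.add(new Cell(rowEntry.getKey(), window.getKey(), window.getValue()));
            }
        }
        return out;
    }
}
```

Calling WindowAggregator.aggregate(cells, 3600000L) would roll the input up
into hourly sums per row. Summing is just one choice of aggregate; a min, max,
or count would slot into the same merge step.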


Jesse Yates

On Mon, Apr 23, 2012 at 1:34 AM, apatro <arati.patro@gmail.com> wrote:

> Hi,
> I'd like to know if there is some alternative to using cron for
> scheduling Map Reduce jobs, one in which one can incorporate one's own
> scheduling logic. For instance, to perform aggregation on table data at a
> particular hour of the day or on a particular day of the week, and the like.
> Thanks in advance :)
> Arati Patro
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Scheduling-Map-Reduce-Jobs-tp3931839p3931839.html
> Sent from the HBase - Developer mailing list archive at Nabble.com.
