
Scheduling Map Reduce Jobs

Scheduling Map Reduce Jobs

apatro
Hi,

I'd like to know if there is an alternative to cron for scheduling MapReduce jobs, one that lets me incorporate my own scheduling logic. For instance, to perform aggregation on table data at a particular hour of the day or on a particular day of the week, and the like.

Thanks in advance :)

Arati Patro
Re: Scheduling Map Reduce Jobs

Jesse Yates
Counter question: why do you want to run M/R jobs to do aggregation? You
could do this in situ with a custom aggregation coprocessor. Essentially,
you would set a time span over which you would aggregate a row (or possibly
multiple rows, but then you have to be sure that they are in the same
region, which means using a custom split policy or pre-splitting and
turning splitting off altogether). If you apply the CP (coprocessor) at
scan, flush and compaction you should get the same behavior without all the
messy IO. We don't really have a good guide for how to do this kind of
thing, but the concept here is similar to what Accumulo does with iterators
<http://accumulo.apache.org/1.4/examples/combiner.html>.
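
To make the "aggregate over a time span" idea concrete, here is a rough sketch of just the windowing step in plain Java. It deliberately uses no HBase classes: the class name WindowAggregator, the one-hour WINDOW_MS, and the (timestamp -> value) map standing in for the cells of one row are all assumptions for illustration. A real coprocessor would apply similar logic inside its scan, flush and compaction hooks.

```java
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch only: plain Java, not the HBase coprocessor API.
final class WindowAggregator {
    // Hypothetical aggregation span: one hour, in milliseconds.
    static final long WINDOW_MS = 60L * 60L * 1000L;

    // Start of the window that contains the given timestamp.
    static long windowStart(long timestampMs, long windowMs) {
        return Math.floorDiv(timestampMs, windowMs) * windowMs;
    }

    // Collapse the (timestamp -> value) cells of one row into per-window sums.
    static TreeMap<Long, Long> aggregate(Map<Long, Long> cells, long windowMs) {
        TreeMap<Long, Long> sums = new TreeMap<>();
        for (Map.Entry<Long, Long> cell : cells.entrySet()) {
            long window = windowStart(cell.getKey(), windowMs);
            sums.merge(window, cell.getValue(), Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        Map<Long, Long> cells = new TreeMap<>();
        cells.put(1_000L, 5L);          // falls in the first window
        cells.put(2_000L, 7L);          // same window as above
        cells.put(WINDOW_MS + 1L, 3L);  // falls in the second window
        System.out.println(aggregate(cells, WINDOW_MS));
    }
}
```

Summing is only one possible combiner here; a max, min or count would slot into the same merge call.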

But to answer your original question, I wouldn't use anything other than
cron for that kind of stuff (that's what it's there for :).
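
If the scheduling logic does need to live in application code rather than in a crontab entry, the core of it is computing the next run time. The following is a minimal sketch using only java.time; the class name NextRun and its method are made up for illustration and nothing here is Hadoop- or HBase-specific.

```java
import java.time.DayOfWeek;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Illustrative helper: finds the next "hour of day" / "day of week" slot.
final class NextRun {
    // Next time strictly after 'now' that falls on the given hour (minute 0),
    // optionally restricted to one day of the week (null means any day).
    static LocalDateTime next(LocalDateTime now, int hour, DayOfWeek day) {
        LocalDateTime candidate = now.truncatedTo(ChronoUnit.DAYS).withHour(hour);
        while (!candidate.isAfter(now)
                || (day != null && candidate.getDayOfWeek() != day)) {
            candidate = candidate.plusDays(1);
        }
        return candidate;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        LocalDateTime run = next(now, 3, DayOfWeek.SUNDAY);
        // The delay is what a scheduler would wait before launching the job.
        System.out.println("next run at " + run
                + ", in " + Duration.between(now, run).toMinutes() + " minutes");
    }
}
```

The resulting delay could be handed to something like ScheduledExecutorService.schedule, with the task submitting the job and then computing the following slot. Time zones and daylight-saving shifts are ignored in this sketch.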

-Jesse

-------------------
Jesse Yates
240-888-2200
@jesse_yates
jyates.github.com


On Mon, Apr 23, 2012 at 1:34 AM, apatro <[hidden email]> wrote:

> Hi,
>
> I'd like to know if there is some alternative to using crons while
> scheduling Map Reduce jobs wherein one can incorporate one's own scheduling
> logic. For instance, to perform aggregation on table data on a particular
> hour of the day or a particular day in a week and the sorts.
>
> Thanks in advance :)
>
> Arati Patro
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Scheduling-Map-Reduce-Jobs-tp3931839p3931839.html
> Sent from the HBase - Developer mailing list archive at Nabble.com.
>