hadoop-common-user mailing list archives

From edward choi <mp2...@gmail.com>
Subject Re: how to run jobs every 30 minutes?
Date Tue, 14 Dec 2010 05:26:55 GMT
Thanks for the tip. I took a look at it.
Looks similar to Cascading I guess...?
Anyway thanks for the info!!

Ed

2010/12/8 Alejandro Abdelnur <tucu@cloudera.com>

> Or, if you want to do it in a reliable way you could use an Oozie
> coordinator job.
>
> On Wed, Dec 8, 2010 at 1:53 PM, edward choi <mp2893@gmail.com> wrote:
> > My mistake. Come to think of it, you are right, I can just make an
> > infinite loop inside the Hadoop application.
> > Thanks for the reply.
> >
> > 2010/12/7 Harsh J <qwertymaniac@gmail.com>
> >
> >> Hi,
> >>
> >> On Tue, Dec 7, 2010 at 2:25 PM, edward choi <mp2893@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I'm planning to crawl a certain web site every 30 minutes.
> >> > How would I get it done in Hadoop?
> >> >
> >> > In pure Java, I used the Thread.sleep() method, but I guess this won't
> >> > work in Hadoop.
> >>
> >> Why wouldn't it? You need to manage your post-job logic mostly, but
> >> sleep and resubmission should work just fine.
> >>
> >> > Or if it could work, could anyone show me an example?
> >> >
> >> > Ed.
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >> www.harshj.com
> >>
> >
>
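
Editor's sketch of the "sleep and resubmission" approach Harsh describes
above. This is a minimal illustration under stated assumptions, not code
from the thread: the class name PeriodicJobRunner, the JobSubmitter
interface, and the runRepeatedly method are all hypothetical names
introduced here. The submitter is a placeholder for whatever driver code
configures one Hadoop job, submits it, and blocks until it finishes. The
loop subtracts the time each run took from the interval so that runs
start roughly every 30 minutes rather than 30 minutes after the previous
run ended, and it treats an exception from one run as a failed run
instead of stopping the loop.

```java
import java.util.concurrent.TimeUnit;

public class PeriodicJobRunner {

    /**
     * Hypothetical stand-in for the driver code that configures one
     * Hadoop job, submits it, and blocks until it completes.
     * Returns true if the run succeeded.
     */
    public interface JobSubmitter {
        boolean submitAndWait() throws Exception;
    }

    /**
     * Calls the submitter up to maxRuns times, aiming to start one run
     * every intervalMillis. Returns the number of successful runs.
     */
    public static int runRepeatedly(JobSubmitter submitter,
                                    long intervalMillis,
                                    int maxRuns) throws InterruptedException {
        int successes = 0;
        for (int run = 0; run < maxRuns; run++) {
            long started = System.currentTimeMillis();
            try {
                if (submitter.submitAndWait()) {
                    successes++;
                }
            } catch (Exception e) {
                // Post-job logic: one failed run should not end the loop.
                System.err.println("Run " + run + " failed: " + e);
            }
            // Sleep only for the part of the interval the run did not use,
            // and skip the sleep after the final run.
            long remaining = intervalMillis - (System.currentTimeMillis() - started);
            if (remaining > 0 && run < maxRuns - 1) {
                Thread.sleep(remaining);
            }
        }
        return successes;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder submitter; a real one would build and run the job.
        JobSubmitter placeholder = () -> {
            System.out.println("submitting job");
            return true;
        };
        // Short interval and run count so this demo terminates quickly.
        // For the 30-minute case, pass TimeUnit.MINUTES.toMillis(30).
        long demoInterval = TimeUnit.MILLISECONDS.toMillis(10);
        int ok = runRepeatedly(placeholder, demoInterval, 3);
        System.out.println(ok + " of 3 runs succeeded");
    }
}
```

A loop like this lives in a single client process, so it stops if that
process or its machine goes down. That is the reliability gap Alejandro's
Oozie coordinator suggestion is meant to address.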
