hadoop-hdfs-user mailing list archives

From Eric <eric.x...@gmail.com>
Subject Re: A way to monitor HDFS for a file to come live, and then kick off a job?
Date Tue, 29 Mar 2011 08:59:31 GMT
You can also use a FUSE mount and a cron job to check whether new files
have arrived. Make sure the script creates a pid file and checks for it, so
that it won't run again before the previous run has finished.
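
A minimal sketch of that cron-driven check, assuming HDFS is already
FUSE-mounted at some local directory. The names run_once, new_files,
launch_job, watch_dir and pid_path are hypothetical and chosen only for
illustration; nothing here is a Hadoop API.

```python
import os


def new_files(watch_dir, seen):
    """Return the names in watch_dir that are not in the `seen` set, sorted."""
    return sorted(name for name in os.listdir(watch_dir) if name not in seen)


def run_once(watch_dir, pid_path, seen, launch_job):
    """One cron tick. Returns None if a previous run still holds the pid file,
    otherwise the list of newly seen file names (possibly empty)."""
    try:
        # O_EXCL makes creation atomic: it fails if the pid file already exists.
        fd = os.open(pid_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None  # previous run has not finished yet; skip this tick
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        arrived = new_files(watch_dir, seen)
        for name in arrived:
            # launch_job is a caller-supplied callable (hypothetical).
            launch_job(os.path.join(watch_dir, name))
            seen.add(name)
        return arrived
    finally:
        # Always release the pid file, even if launch_job raised.
        os.remove(pid_path)
```

In a real cron job the `seen` set would have to be persisted between runs
(for example in a small state file), and a pid file left behind by a crashed
run would need separate stale-pid handling; both are left out of this sketch.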

2011/3/25 Allen Wittenauer <aw@apache.org>

> On Mar 24, 2011, at 10:09 AM, Jonathan Coveney wrote:
> > I am not sure if this is the right listserv, forgive me if it is not.
>         A better choice would likely be hdfs-user@, since this is really
> about watching files in HDFS.
> > My goal is this: monitor HDFS until a file is created, and then kick off
> > a job. Ideally I'd want to do this continuously, but the file would be
> > created hourly (with some sort of variance). I guess I could make a script
> > that would ping the server every 5 minutes or something, but I was
> > wondering if there might be a more elegant way?
>         Two ways off the top of my head:
>        1) Read/watch the edits stream
>        2) Read/watch the HDFS audit log
>        Given the latter is text built by log4j, that should be relatively
> simple to implement.
> There was a JIRA asking for this functionality to be built in recently, btw.
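
A rough sketch of option 2, watching the audit log for file creations. It
assumes audit lines carry key=value fields such as cmd=create and src=/path,
which is how the log4j audit output usually looks; the exact layout depends
on the Hadoop version and log4j configuration, so treat the pattern as an
assumption. parse_audit_line and created_paths are hypothetical helper names.

```python
import re

# Assumed field layout: whitespace-separated key=value pairs on each line.
AUDIT_FIELD = re.compile(r"(\w+)=(\S+)")


def parse_audit_line(line):
    """Return the key=value fields of one audit log line as a dict."""
    return dict(AUDIT_FIELD.findall(line))


def created_paths(lines, prefix):
    """Yield the src path of every create event whose path starts with prefix."""
    for line in lines:
        fields = parse_audit_line(line)
        if fields.get("cmd") == "create" and fields.get("src", "").startswith(prefix):
            yield fields["src"]
```

Feeding it the lines of the audit log (for example from a tail-style reader)
and a directory prefix gives the paths to react to. A create event most
likely marks the moment the file is opened for writing, so the job may still
need to wait until the writer has closed the file.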
