nutch-user mailing list archives

From ".: Abhishek :." <>
Subject Running crawls between a specified time interval
Date Wed, 09 Feb 2011 01:17:01 GMT
Hi all,

 I am trying to figure out whether there is a way to restrict Nutch crawls
to a time interval, say from 12:00 AM to 12:00 PM, and then start the
further processing (indexing and the other steps that follow the crawl)
once that window has closed.

 I think a Nutch job is tied to Hadoop's JobConf, but I am not sure how this
could be done. Alternatively, if I were to use an external shell script for
this, how do I chain the crawl process and trigger further processing after
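
One way to picture the external-script approach is a small driver that repeats a crawl step until the time window closes and only then runs the follow-up steps. The sketch below is a minimal illustration, not a Nutch feature: `run_crawl_window`, `round_cmd`, `post_cmds` and `window_seconds` are hypothetical names, and the actual crawl and indexing commands are left as placeholders to be filled in for a given installation. It assumes each crawl round is a separate command that runs to completion, so the window is checked between rounds rather than by interrupting a running job.

```python
import subprocess
import sys
import time


def run_crawl_window(round_cmd, post_cmds, window_seconds, clock=time.monotonic):
    """Repeat round_cmd until the window closes, then run post_cmds in order.

    round_cmd and each entry of post_cmds are argument lists for
    subprocess.run. Returns the number of crawl rounds that completed.
    """
    deadline = clock() + window_seconds
    rounds = 0
    while clock() < deadline:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so the follow-up steps never run on top of a failed crawl round.
        subprocess.run(round_cmd, check=True)
        rounds += 1
    # The window has closed: chain the follow-up steps one after another.
    for cmd in post_cmds:
        subprocess.run(cmd, check=True)
    return rounds


if __name__ == "__main__":
    # Placeholder command that does nothing; replace it with the real
    # crawl-round and indexing commands (assumption: both exist as CLI calls).
    noop = [sys.executable, "-c", "pass"]
    print(run_crawl_window(noop, [noop], window_seconds=0.2))
```

A round that starts just before the deadline is allowed to finish, so the window can overrun by up to one round. Starting the driver at the beginning of the window could be left to a scheduler such as cron.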

