nutch-user mailing list archives

From kevin chen <>
Subject Add new segments to existing
Date Thu, 10 Jan 2008 04:34:49 GMT

I maintain a crawl of a set of sites and keep discovering new relevant
URLs to add to the crawl.

Here is what I did:

Once I find new URLs, I crawl them separately for a few rounds until I
am satisfied. I then move the new segments into my existing segments
directory and run "updatedb" for each new segment. Finally, I remove the
existing indexes and re-index all the
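
For illustration only, the steps described above could be written down as a
small planning script. This is a sketch under stated assumptions, not a
confirmed recipe: it assumes a local filesystem layout of crawl/crawldb,
crawl/linkdb, crawl/segments and crawl/indexes, an already existing linkdb,
and the "bin/nutch updatedb <crawldb> <segment>" and
"bin/nutch index <indexes> <crawldb> <linkdb> <segment> ..." argument orders
from the Nutch 0.8/0.9 era, which should be checked against the usage output
of the installed version. The function name plan_commands and the example
segment names are hypothetical. The script only builds and prints the
commands; it does not run them.

```python
import posixpath
import shlex
from typing import List

NUTCH = "bin/nutch"  # assumed path to the Nutch launcher script


def plan_commands(crawl_dir: str,
                  new_segments: List[str],
                  existing_segments: List[str]) -> List[List[str]]:
    """Build (but do not run) the commands for the workflow in the message.

    crawl_dir          -- assumed layout: crawldb, linkdb, segments, indexes
    new_segments       -- paths of separately crawled segments to merge in
    existing_segments  -- names of segments already under crawl_dir/segments
    """
    crawldb = posixpath.join(crawl_dir, "crawldb")
    linkdb = posixpath.join(crawl_dir, "linkdb")      # assumed to exist already
    segments_dir = posixpath.join(crawl_dir, "segments")
    indexes = posixpath.join(crawl_dir, "indexes")

    commands: List[List[str]] = []
    moved: List[str] = []

    # Step 1: move each new segment next to the existing ones.
    for seg in new_segments:
        name = posixpath.basename(seg.rstrip("/"))
        target = posixpath.join(segments_dir, name)
        commands.append(["mv", seg, target])
        moved.append(target)

    # Step 2: run updatedb once per new segment.
    for target in moved:
        commands.append([NUTCH, "updatedb", crawldb, target])

    # Step 3: remove the old indexes.
    commands.append(["rm", "-r", indexes])

    # Step 4: re-index every segment, old and new.
    all_segments = [posixpath.join(segments_dir, s) for s in existing_segments]
    all_segments += moved
    commands.append([NUTCH, "index", indexes, crawldb, linkdb] + all_segments)

    return commands


if __name__ == "__main__":
    # Hypothetical segment names, for illustration only.
    for cmd in plan_commands("crawl",
                             ["newcrawl/segments/20080109120000"],
                             ["20080101000000"]):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the plan first makes it easy to review the commands before running
any of them by hand, since removing the indexes is not reversible.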

Is this the right way to do this? How does everybody work around this

