nutch-user mailing list archives

From kevin chen <kevinc...@bdsing.com>
Subject Add new segments to existing
Date Thu, 10 Jan 2008 04:34:49 GMT
Hi,

I maintain a crawl of a set of sites and keep discovering new relevant
URLs to add to the crawl.

Here is what I did:

Once I find new URLs, I crawl them separately for a few rounds until I
am satisfied. I then move the new segments into my existing segments
directory and run "updatedb" for each new segment. Finally, I remove
the existing indexes and re-index all the segments.
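
In shell terms, the steps above look roughly like the sketch below. The
directory layout (an existing crawl directory and a separate directory
for the new crawl) and the function names are hypothetical, and the
command syntax is what I believe the Nutch 0.9-era bin/nutch script
accepts, so treat it as an assumption rather than a reference. The
"invertlinks" step is not part of what I described; it is included only
on the assumption that "index" reads the linkdb. Commands are printed
rather than executed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the merge-and-reindex steps described above.
# Assumptions: directory layout, and Nutch 0.9-era command syntax.

NUTCH="${NUTCH:-bin/nutch}"   # path to the nutch launcher script (assumed)

# Print the command by default; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# merge_new_segments <existing_crawl_dir> <new_crawl_dir>  (hypothetical helper)
merge_new_segments() {
    crawl_dir="$1"   # existing crawl: crawldb, linkdb, segments, indexes
    new_dir="$2"     # separately crawled data with its own segments dir

    # Move each new segment next to the existing ones, then fold its
    # fetch results into the existing crawldb.
    for seg in "$new_dir"/segments/*; do
        [ -d "$seg" ] || continue
        name=$(basename "$seg")
        run mv "$seg" "$crawl_dir/segments/$name"
        run "$NUTCH" updatedb "$crawl_dir/crawldb" "$crawl_dir/segments/$name"
    done

    # Drop the old indexes and rebuild them over all segments.
    run rm -r "$crawl_dir/indexes"
    # Assumed extra step: refresh the linkdb before indexing.
    run "$NUTCH" invertlinks "$crawl_dir/linkdb" -dir "$crawl_dir/segments"
    # Note: in dry-run mode the glob below only matches segments that are
    # already in place, because the mv commands above were not executed.
    run "$NUTCH" index "$crawl_dir/indexes" "$crawl_dir/crawldb" \
        "$crawl_dir/linkdb" "$crawl_dir"/segments/*
}
```

For example, "merge_new_segments crawl newcrawl" prints the commands
for review, and "RUN=1 merge_new_segments crawl newcrawl" executes them.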

Is this the right way to do this? How does everyone else handle this
scenario?

Thanks

