nutch-dev mailing list archives

From Paul Tomblin <>
Subject My mistake
Date Thu, 13 Aug 2009 15:26:06 GMT
The patch I sent a few days ago doesn't work correctly. When it's
fetching something it has never seen before, datum.getFetchTime()
returns the *current* fetch time instead of the last fetch time.  When
it's fetching something that was fetched before, it returns the *last*
fetch time.  Obviously, if you ask the web server for something that's
been modified since *right*now*, it isn't going to return anything.

This whole problem would go away if datum.getModifiedTime() worked.
When I dump the CrawlDatum out of the segment file, the modified time
is definitely in there, but datum.getModifiedTime() seems to always
return 0.  If I find out why that's happening, I'll send a patch.
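
To make the failure mode concrete, here is a minimal sketch of how the
If-Modified-Since value could be chosen so that a never-seen page is
never asked about "right now".  It is not the patch itself and does not
use Nutch classes: the class name, the method name, and the
fetchedBefore flag are hypothetical, and the two long parameters only
stand in for what datum.getModifiedTime() and datum.getFetchTime()
would return (epoch milliseconds, with 0 meaning "not set").

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of Nutch.
class IfModifiedSinceSketch {

    // HTTP-date format with a two-digit day, always expressed in GMT.
    private static final DateTimeFormatter HTTP_DATE =
        DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                         .withZone(ZoneOffset.UTC);

    /**
     * Returns the If-Modified-Since header value, or empty if no
     * conditional request should be made.
     *
     * modifiedTimeMs: stand-in for datum.getModifiedTime(), 0 if unset
     * fetchTimeMs:    stand-in for datum.getFetchTime()
     * fetchedBefore:  assumed flag saying the URL has a previous fetch
     */
    static Optional<String> ifModifiedSinceHeader(long modifiedTimeMs,
                                                  long fetchTimeMs,
                                                  boolean fetchedBefore) {
        long since;
        if (modifiedTimeMs > 0) {
            // Best case: the stored modification time is usable.
            since = modifiedTimeMs;
        } else if (fetchedBefore && fetchTimeMs > 0) {
            // Fallback: the fetch time really is the *last* fetch time.
            since = fetchTimeMs;
        } else {
            // Never seen before: the fetch time is "now", so send no header.
            return Optional.empty();
        }
        return Optional.of(HTTP_DATE.format(Instant.ofEpochMilli(since)));
    }

    public static void main(String[] args) {
        long t = Instant.parse("2009-08-13T15:26:06Z").toEpochMilli();
        System.out.println(ifModifiedSinceHeader(0L, t, false)); // new URL
        System.out.println(ifModifiedSinceHeader(0L, t, true));  // refetch
    }
}
```

With a helper like this, a first-time fetch goes out as an
unconditional request, and the modified time takes over from the fetch
time as soon as it stops coming back as 0.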

