nutch-user mailing list archives

From YourSoft
Subject Number of searchable pages
Date Sat, 14 May 2005 06:42:35 GMT
Dear List,

I counted the pages in the segments:
  bin/nutch segread -fix -list -dir segments
The results sum to 11 million pages; 'dedup' removes 2 million of them, 
leaving 9 million pages.

When I search in the frontend for "http", I get 6 million results. How can 
I find the missing 3 million pages?
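
To spell out the arithmetic, here is a small shell sketch. The variable 
names are only illustrative, and the figures are the rounded ones quoted 
in this message, not the output of any Nutch command:

```shell
# Rounded figures from this message (assumed values, not tool output)
segment_total=11000000   # sum of the per-segment page counts
dedup_removed=2000000    # pages removed by 'dedup'
search_hits=6000000      # frontend result count for the query "http"

# Pages expected to be searchable after deduplication
expected=$((segment_total - dedup_removed))

# Pages that are unaccounted for in the search result
missing=$((expected - search_hits))

echo "expected searchable pages: $expected"
echo "missing pages: $missing"
```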

How can I count the total number of searchable pages in the search 

Best Regards,
