nutch-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject Re: Which outlinks on a webpage are stored in the segment?
Date Sun, 04 Dec 2011 13:24:12 GMT
Only those that pass the filters. Check the code in ParseOutputFormat; it
throws away URLs before it adds them to ParseData.
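
To illustrate the idea, here is a minimal, self-contained sketch of filtering outlinks before they are stored. It is not the Nutch code: the class OutlinkFilterSketch and its methods are hypothetical names, and the rule handling (a '+' or '-' prefix followed by a regex, first matching rule wins, no match means the URL is dropped) is my assumption about how regex-urlfilter.txt style rules behave. Check ParseOutputFormat and the urlfilter-regex plugin for the real logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of Nutch: drops outlinks that fail
// regex-urlfilter style rules before they would be stored.
class OutlinkFilterSketch {

    // One rule: '+' means accept, '-' means reject (assumed rule format).
    static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    // Parses rule lines, skipping blanks, comments, and lines without a sign.
    static List<Rule> parseRules(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            char sign = trimmed.charAt(0);
            if (sign != '+' && sign != '-') {
                continue;
            }
            rules.add(new Rule(sign == '+', Pattern.compile(trimmed.substring(1))));
        }
        return rules;
    }

    // Assumption: the first rule whose regex is found in the URL decides;
    // a URL that matches no rule is rejected.
    static boolean passes(String url, List<Rule> rules) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    // Returns only the outlinks that pass; the rest are thrown away.
    static List<String> keptOutlinks(List<String> outlinks, List<Rule> rules) {
        List<String> kept = new ArrayList<>();
        for (String url : outlinks) {
            if (passes(url, rules)) {
                kept.add(url);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Rule> rules = parseRules(Arrays.asList(
                "# skip images",
                "-\\.(gif|jpg)$",
                "+^https?://"));
        List<String> outlinks = Arrays.asList(
                "http://example.com/a.html",
                "http://example.com/logo.gif",
                "ftp://example.com/file");
        System.out.println(keptOutlinks(outlinks, rules));
    }
}
```

In this sketch the image link and the ftp link never reach the kept list, which mirrors the point above: rejected URLs are discarded before anything is written.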


> Hello everybody,
> 
> just a short question:
> 
> Are all outlinks saved in the segment dir or only the ones that pass
> the regex-urlfilter?
> 
> thanks
