nutch-user mailing list archives

From Cam Bazz <>
Subject parser warnings
Date Mon, 18 Jul 2011 23:04:21 GMT
What does the following log message mean?

2011-07-19 01:00:07,034 WARN  parse.ParserFactory -
ParserFactory:Plugin: org.apache.nutch.parse.html.HtmlParser mapped to
contentType application/xhtml+xml via parse-plugins.xml, but its
plugin.xml file does not claim to support contentType:

Does that mean my HTML parser is not getting part of the crawled data?

