nutch-user mailing list archives

From Cube Agen <agen....@gmail.com>
Subject Why doesn't Nutch 1.4 set the original HTML content field in solrindexer?
Date Wed, 28 Dec 2011 14:29:07 GMT
When I use the solrindex command to post the crawled content, I can find the
content field, which holds the parsed text. Why is there no field containing
the raw (original HTML) content?
