The Apache Nutch PMC are extremely pleased to announce the immediate release of Apache Nutch v2.2.
Apache Nutch is an open source web-search software project. Stemming from Apache Lucene, it now builds on Apache Solr, adding web-specifics such as a crawler, a link-graph database and parsing support handled by Apache Tika for HTML and an array of other document formats.
This release includes over 30 bug fixes and over 25 improvements, representing the third release of the increasingly popular 2.x Nutch series. This release features the inclusion of Crawler-Commons, which Nutch now utilizes for improved robots.txt parsing, and library upgrades to Apache Hadoop 1.1.1, Apache Gora 0.3, Apache Tika 1.2 and Automaton 1.11-8. Please see the list of changes or the release report made in this version for a full breakdown.
As usual in the 2.x series, this release is made available only as source, but is also available within Maven Central.
The release is available here.
Have a great weekend.