lucene-java-user mailing list archives

From "Clemens Marschner" <c...@lanlab.de>
Subject Re: Lucene Site Updated
Date Wed, 30 Oct 2002 19:41:15 GMT
Thanks Peter for the update,

for those of you who read the doc on the LARM crawler: I added some new
sections, reflecting the changed state of the CVS version of the crawler:

- command line options were extended so that a list of URLs can now be
passed, not just one.
- URL normalization was added. See the new section on that.
- the HostResolver was separated from the HostManager (in order to make it
more reusable) and can now be configured from the command line
The crawler is in fact still a framework with a "reference implementation"
in the class FetcherMain. Many options can only be changed by editing this
main class. It also contains an example of how Lucene can be used as storage
for the crawler.

--Clemens


----- Original Message -----
From: "Peter Carlson" <carlson@bookandhammer.com>
To: "Lucene Developers List" <lucene-dev@jakarta.apache.org>; "List Lucene
Users" <lucene-user@jakarta.apache.org>
Sent: Wednesday, October 30, 2002 8:12 PM
Subject: Lucene Site Updated


> The Lucene Website has been updated with some new content.
>
> A Lucene File Format that was originally just an attachment from Doug
> Cutting was transformed to xml/html by Otis Gospodnetic. This document
> is available in the left hand nav area.
>
> Also, the LARM project now has html documentation. Originally by
> Clemens Marschner in PDF, .doc format, it is now in xml/html format
> by Otis Gospodnetic and is available under lucene-sandbox/larm in the
> left hand nav area.
>
> Please let us know if there are any issues with the site.
>
> Thanks
>
> --Peter
>
>
> --
> To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
>


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

