lucene-java-user mailing list archives

From Ian Lea <ian....@blackwell.co.uk>
Subject Re: Indexing a Database && Spanish
Date Thu, 08 Nov 2001 16:32:59 GMT
> ...
> -I've tried a sample that index a Web Site (all the html files) but, now I
> would like to mix in the same index, information from a directory and
> information from a database. Is it possible??? Is there a DatabaseDocument
> like a HTMLDocument??? Does anyone have a sample? Does anyone tried?

It is possible. Lucene neither knows nor cares where the information
comes from in the first place.

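How about:
  // writer is your IndexWriter; both getters are methods you write yourself
  Document htmld = getDocFromHtml();     // built from an HTML file
  writer.addDocument(htmld);
  Document dbd = getDocumentFromDB();    // built from a database row
  writer.addDocument(dbd);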

where getDocumentFromDB() will read whatever info you want
from your database and load it into a Lucene Document.
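
To make the database half a bit more concrete, here is a rough sketch of
the mapping step only. It deliberately uses no Lucene or JDBC classes so
it runs on its own: the row is a plain Map standing in for one ResultSet
row, and the returned name/value pairs stand in for the fields you would
add to a Lucene Document. RowMapper, toFields and the "source" field are
names I made up for illustration, not part of Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Sketch only: turns one database row into the name/value pairs that
 * would become the fields of one Lucene Document.
 * RowMapper, toFields and the "source" field are hypothetical names.
 */
class RowMapper {

    /**
     * @param source label recording where the record came from, e.g. "db" or "html"
     * @param row    column name -> value, standing in for one JDBC ResultSet row
     * @return field name -> text, in insertion order; null columns are skipped
     */
    static Map<String, String> toFields(String source, Map<String, Object> row) {
        Map<String, String> fields = new LinkedHashMap<>();
        // A "source" field lets a search tell database records apart
        // from HTML pages once both live in the same index.
        fields.put("source", source);
        for (Map.Entry<String, Object> column : row.entrySet()) {
            Object value = column.getValue();
            if (value == null) {
                continue; // nothing to index for SQL NULL
            }
            // In real code each pair would be added to the Document as a field.
            String name = column.getKey().toLowerCase(Locale.ROOT);
            fields.put(name, String.valueOf(value));
        }
        return fields;
    }
}
```

A real getDocumentFromDB() would loop over these pairs and add each one
to the Document before handing it to writer.addDocument(). The HTML side
can emit the same field names, so one query covers both kinds of record.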


> -I would like to index spanish information, is it optimized with the
> StandardAnalyzer?? Have I to create an org.apache.lucene.analysis.es?? like
> org.apache.lucene.analysis.de for german???? Is there anyone in spanish?

StandardAnalyzer uses StandardTokenizer, and the javadocs for that say
"This should be a good tokenizer for most European-language documents",
but I've no personal experience of using it for any languages other
than English.



--
Ian.
ian.lea@blackwell.co.uk


