Mailing-List: contact lucene-user-help@jakarta.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Message-ID: <20040223114943.46562.qmail@web25010.mail.ukl.yahoo.com>
Date: Mon, 23 Feb 2004 11:49:43 +0000 (GMT)
From: =?iso-8859-1?q?Clandes=20Tino?= <clandestino_bgd@yahoo.co.uk>
Subject: Multilanguage and wildcard support
To: lucene-user@jakarta.apache.org
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

Hi, all.
I would like to describe a dilemma I have about text
analysis.

2. Multilanguage and wildcard support
In Lucene 1.3 Final I found the very useful class
PerFieldAnalyzerWrapper, which lets me specify a
separate analyzer for each field.
But the full-text content (obtained after parsing Word,
Excel, XML or other kinds of documents) should be
searchable using stemming and should also support
wildcard queries.
I implemented this solution:
- The full content is indexed in two separate fields,
because wildcard queries do not pass through the
analyzer, as I have read in this mailing list archive.
Field1 ("stemmingbody") - the matching Snowball analyzer
is used.
Field2 ("plainbody") - the Whitespace analyzer is used.
So, when a user searches for some term in an item's
content, I parse the query, and if it contains a
wildcard character, the search is performed in
"plainbody"; otherwise I search in "stemmingbody",
expecting better search results that way.
Is there a better way to do this, e.g. indexing the
full content in only one field instead of two (I
tokenize and index it, but do not store it)?
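
To make the routing step concrete, here is a minimal
sketch of the field selection described above. It is
plain Java and does not call Lucene. The class and
method names (FieldRouter, hasWildcard, chooseField)
are made up for illustration, and it assumes that only
unescaped '*' and '?' count as wildcard characters.

```java
// Sketch only: FieldRouter, hasWildcard and chooseField are
// hypothetical names for illustration, not part of the Lucene API.
final class FieldRouter {
    // Field names taken from the description above.
    static final String STEMMED_FIELD = "stemmingbody";
    static final String PLAIN_FIELD = "plainbody";

    private FieldRouter() {}

    // Returns true if the query contains an unescaped '*' or '?'.
    // Assumption: a backslash escapes the character that follows it.
    static boolean hasWildcard(String query) {
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
                continue;
            }
            if (c == '*' || c == '?') {
                return true;
            }
        }
        return false;
    }

    // Wildcard queries go to the unstemmed field; everything else
    // goes to the stemmed field.
    static String chooseField(String query) {
        return hasWildcard(query) ? PLAIN_FIELD : STEMMED_FIELD;
    }
}
```

The returned field name would then be used as the
field for the query, so that a query like "connect*"
is matched against the unstemmed terms in "plainbody".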

Thanks for any opinion or suggestion in advance!
Best regards
Milan Agatonovic 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org