lucene-java-user mailing list archives

From SBS <jturn...@uow.edu.au>
Subject Overriding default handling of '/' and '-'
Date Tue, 16 Aug 2011 22:15:49 GMT
Our document base includes terms that are in fact codes, which may contain
dashes and slashes, such as "M1234/5" and "12345-00".  Presently Lucene
appears to be breaking up these codes at the slashes and dashes, and
searches are therefore not working properly.  Instead of matching only the
exact code "12345-00", Lucene matches any text containing either "12345" or
"00", which is not desirable.

Is there a way to change this default behaviour (a filter, perhaps)?  The
situation is complicated by the fact that the content also includes normal
text, where splitting on slashes and dashes in this manner is probably
expected and desirable.  I guess that if I turn off this default behaviour I
will lose it for normal words as well, but that is probably acceptable and
unavoidable.
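
To illustrate what I mean, here is a minimal plain-Java sketch.  It does not
use Lucene's analyzers at all; the class and method names are made up for
illustration, and the first method only mimics the splitting I seem to be
seeing, while the second shows the tokens I would like to end up with:

```java
import java.util.ArrayList;
import java.util.List;

public class CodeTokenDemo {

    // Mimics the behaviour described above (an assumption, not Lucene's
    // actual tokenizer): split on anything that is not a letter or digit.
    static List<String> splitOnPunctuation(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^A-Za-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Desired behaviour: split on whitespace only, so that '/' and '-'
    // stay inside the token and the code survives intact.
    static List<String> splitOnWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "part M1234/5 replaces 12345-00";
        System.out.println(splitOnPunctuation(text)); // [part, M1234, 5, replaces, 12345, 00]
        System.out.println(splitOnWhitespace(text));  // [part, M1234/5, replaces, 12345-00]
    }
}
```

The second method also shows the trade-off mentioned above: an ordinary
hyphenated word such as "well-known" would be kept as a single token too.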

Thanks,

-sbs

