lucene-dev mailing list archives

From Anshum <anshum.gu...@naukri.com>
Subject Re: Search in non-linguistic text
Date Thu, 16 Jul 2009 12:58:29 GMT
Hi Jes,

This is the Lucene developer mailing list, not the java-user mailing
list. Perhaps you should send this question to the java-user list instead.


On Thu, Jul 16, 2009 at 06:20:57PM +0530, JesL wrote:
> 
> Hello,
> Are there any suggestions / best practices for using Lucene for searching
> non-linguistic text?  What I mean by non-linguistic is that it's not English
> or any other language, but rather product codes.  This is presenting some
> interesting challenges.  Among them are the need for pretty lax wildcard
> searches.  For example, ABC should match on ABCD, but so should BCD.  Also,
> it needs to be agnostic to special characters.  So, ABC/D should match ABCD
> as well as ABC-D or "ABC D".
> 
> As I write an analyzer to handle these cases, I seem to be pretty quickly
> degrading into a "like '%blah%'" search, with rules to treat all special
> characters as single-character, optional wildcards.  I'm concerned that the
> performance of this will be disappointing, though.
> 
> Any help would be much appreciated.  Thanks!
> 
> - Jes
> -- 
> View this message in context: http://www.nabble.com/Search-in-non-linguistic-text-tp24515712p24515712.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
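
One way to picture the requirements quoted above (substring matches such as BCD finding ABCD, with special characters ignored) without leading wildcards is to normalize each code and index its character n-grams. The sketch below is plain Java, not Lucene API: the class name ProductCodeIndex, its methods, and the trigram size are all assumptions made for illustration. Lucene's n-gram tokenizers apply a similar idea at the analyzer level.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene.
class ProductCodeIndex {
    // Gram length is an assumption; shorter queries fall back to a scan.
    private static final int N = 3;

    // Original code -> normalized form (special characters removed).
    private final Map<String, String> normalizedByCode = new HashMap<>();
    // Trigram -> original codes whose normalized form contains it.
    private final Map<String, Set<String>> codesByGram = new HashMap<>();

    // Drop everything except ASCII letters and digits, then upper-case,
    // so "ABC/D", "ABC-D" and "ABC D" all become "ABCD".
    static String normalize(String text) {
        return text.replaceAll("[^A-Za-z0-9]", "").toUpperCase(Locale.ROOT);
    }

    // All contiguous substrings of length N (empty if the input is shorter).
    static List<String> grams(String normalized) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + N <= normalized.length(); i++) {
            out.add(normalized.substring(i, i + N));
        }
        return out;
    }

    public void add(String code) {
        String norm = normalize(code);
        normalizedByCode.put(code, norm);
        for (String gram : grams(norm)) {
            codesByGram.computeIfAbsent(gram, k -> new HashSet<>()).add(code);
        }
    }

    public SortedSet<String> search(String query) {
        String q = normalize(query);
        SortedSet<String> hits = new TreeSet<>();
        if (q.isEmpty()) {
            return hits;
        }
        List<String> queryGrams = grams(q);
        Set<String> candidates;
        if (queryGrams.isEmpty()) {
            // Query shorter than N: no gram to look up, so check every code.
            candidates = normalizedByCode.keySet();
        } else {
            // Intersect the posting sets of all query grams.
            candidates = new HashSet<>(codesByGram.getOrDefault(
                    queryGrams.get(0), Collections.<String>emptySet()));
            for (String gram : queryGrams) {
                candidates.retainAll(codesByGram.getOrDefault(
                        gram, Collections.<String>emptySet()));
            }
        }
        // Sharing all grams does not guarantee contiguity, so verify.
        for (String code : candidates) {
            if (normalizedByCode.get(code).contains(q)) {
                hits.add(code);
            }
        }
        return hits;
    }
}
```

With ABCD, ABC-D, ABC/D and "ABC D" indexed, searching for ABC, BCD or ABC/D returns all four variants, because both the indexed codes and the query pass through the same normalization. The gram lookup narrows the candidate set before the final substring check, which is what avoids a '%blah%'-style scan for queries of three or more characters.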

-- 
Anshum 
--
question = ( to ) ? be : ! be;
		-- Wm. Shakespeare


