lucene-java-user mailing list archives

From "Paul Smith" <PSm...@tenfold.com>
Subject Hungarian notation analyzer and phrase queries
Date Tue, 12 Apr 2005 23:42:23 GMT
I am writing a document management system for my company, and many of
our feature names are in what I am calling Hungarian notation, that is,
camel-case compounds (PowerQuery, TransactionManager, etc.). A default
analyzer indexes each such name as a single token, so searching for the
individual words does not find it.

I'd like to be able to index text like "Use PowerQuery for advanced
searches" and be able to find it with "use power query for advanced
searches". Note the space between power and query.

I have written a custom analyzer to tokenize PowerQuery into 'power',
'query', and 'powerquery' and change the position increment to 0, but I
don't quite get the desired behavior. The phrase query "use power query
for advanced searches" does not match, but "use query for advanced
searches" and "use power for advanced searches" do.
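
To make the symptom concrete, here is a minimal sketch of the position
arithmetic in plain Java, with no Lucene dependency. It assumes the
first token of each word advances the position by 1 and every further
token from the same word has increment 0, so 'power', 'query', and
'powerquery' all share one position. The class and method names are
hypothetical, and stop-word removal is not modelled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class StackedTokenDemo {
    // Hypothetical splitter: the parts of a camel-case word, then the whole
    // word, all lowercased. A word with no case boundary yields one token.
    static List<String> split(String word) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < word.length(); i++) {
            if (Character.isLowerCase(word.charAt(i - 1))
                    && Character.isUpperCase(word.charAt(i))) {
                parts.add(word.substring(start, i).toLowerCase(Locale.ROOT));
                start = i;
            }
        }
        parts.add(word.substring(start).toLowerCase(Locale.ROOT));
        if (parts.size() > 1) {
            parts.add(word.toLowerCase(Locale.ROOT));
        }
        return parts;
    }

    // Records token positions: each word advances the position by 1, and all
    // tokens produced from that word are stacked at that same position
    // (the assumed "increment 0" behavior).
    static Map<String, Set<Integer>> index(String text) {
        Map<String, Set<Integer>> positions = new HashMap<>();
        int pos = -1;
        for (String word : text.trim().split("\\s+")) {
            pos++;
            for (String token : split(word)) {
                positions.computeIfAbsent(token, k -> new TreeSet<>()).add(pos);
            }
        }
        return positions;
    }

    // Exact phrase match with no slop: term i must occur at position start + i.
    static boolean matchesPhrase(Map<String, Set<Integer>> positions, String phrase) {
        String[] terms = phrase.toLowerCase(Locale.ROOT).trim().split("\\s+");
        Set<Integer> starts = positions.getOrDefault(terms[0], Collections.emptySet());
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                ok = positions.getOrDefault(terms[i], Collections.emptySet())
                        .contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> idx = index("Use PowerQuery for advanced searches");
        System.out.println(matchesPhrase(idx, "use power query for advanced searches")); // false
        System.out.println(matchesPhrase(idx, "use query for advanced searches"));       // true
        System.out.println(matchesPhrase(idx, "use power for advanced searches"));       // true
    }
}
```

Under that assumption the indexed text has 'use' at position 0, the
three stacked tokens at position 1, and 'for' at position 2. The
six-word phrase needs 'query' at position 2 and 'for' at position 3, so
it cannot line up, while either five-word phrase does.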

Any ideas?

I noticed that Dave at tropo.com has written a JavaDocAnalyzer that has
the same problem. Go to searchmorph.com and search for "An instance of
HashMap has two parameters" and "An instance of Hash Map has two
parameters" to compare the results.

I realize that with my custom analyzer I can find the text without
using a phrase query, but it would be nice if phrase queries worked too.
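
For comparison, the non-phrase case can be sketched as a plain AND over
terms that ignores positions entirely, which is roughly what a Lucene
BooleanQuery with every term required does. This is plain Java with no
Lucene dependency; the class and method names are hypothetical, and the
camel-case splitting rule is an assumption about the analyzer.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class AllTermsDemo {
    // Collects the lowercased terms of a text: each whitespace-separated word,
    // plus its camel-case parts (split where a lower-case letter is followed
    // by an upper-case letter).
    static Set<String> terms(String text) {
        Set<String> out = new HashSet<>();
        for (String word : text.trim().split("\\s+")) {
            for (String part : word.split("(?<=\\p{Ll})(?=\\p{Lu})")) {
                out.add(part.toLowerCase(Locale.ROOT));
            }
            out.add(word.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // True when every query term occurs somewhere in the document,
    // regardless of order or position.
    static boolean matchesAllTerms(Set<String> docTerms, String query) {
        return docTerms.containsAll(terms(query));
    }

    public static void main(String[] args) {
        Set<String> doc = terms("Use PowerQuery for advanced searches");
        System.out.println(matchesAllTerms(doc, "use power query for advanced searches"));
        System.out.println(matchesAllTerms(doc, "searches advanced query power"));
        System.out.println(matchesAllTerms(doc, "use power tools"));
    }
}
```

The first two queries match because every term is present, even with
the words reordered. That also shows what is lost without a phrase
query: word order no longer matters.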

Thanks,
Paul

http://www.tenfold.com
