lucene-java-user mailing list archives

From Mike Klaas <mike.kl...@gmail.com>
Subject Re: thoughts/suggestions for analyzing/tokenizing class names
Date Mon, 17 Dec 2007 17:28:49 GMT
On 15-Dec-07, at 3:14 PM, Beyer,Nathan wrote:

> I have a few fields that use package names and class names and I've been
> looking for some suggestions for analyzing these fields.
>
> A few examples -
>
> Text (class name)
> - "org.apache.lucene.document.Document"
> Queries that would match
> - "org.apache" , "org.apache.lucene.document"
>
> Text (class name + method signature)
> -- "org.apache.lucene.document.Document#add(Fieldable)"
> Queries that would match
> -- "org.apache.lucene", "org.apache.lucene.document.Document#add"
>
> Any thoughts on how to approach tokenizing these types of texts?

Perhaps it would help to include some examples of queries you _don't_  
want to match.  For all the examples above, simply tokenizing  
alphanumeric components would suffice.
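
As a rough illustration of what "tokenizing alphanumeric components" could look like, here is a minimal sketch in plain Java. It does not use any Lucene API; the class name ClassNameTokens and its methods are hypothetical, and the in-order, contiguous token check is only an assumption about how a phrase-style query would behave against such tokens.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: splits text into runs of
// letters and digits, so '.', '#', '(' and ')' all act as separators.
public class ClassNameTokens {
    private static final Pattern ALNUM = Pattern.compile("[A-Za-z0-9]+");

    // Returns the alphanumeric components in order, case preserved.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = ALNUM.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Assumed phrase-style match: the query's tokens must appear as a
    // contiguous, in-order run within the indexed text's tokens.
    public static boolean phraseMatches(String indexed, String query) {
        List<String> q = tokenize(query);
        return !q.isEmpty()
                && Collections.indexOfSubList(tokenize(indexed), q) >= 0;
    }

    public static void main(String[] args) {
        String text = "org.apache.lucene.document.Document#add(Fieldable)";
        System.out.println(tokenize(text));
        System.out.println(phraseMatches(text, "org.apache.lucene"));
        System.out.println(phraseMatches(text, "org.apache.lucene.document.Document#add"));
    }
}
```

Note that this sketch would also match a query such as "lucene.document", which does not start at the package root. Whether that is acceptable is exactly the kind of thing a list of non-matching examples would settle.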

-Mike


