lucene-java-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: Payload Matching Query
Date Sun, 23 Jun 2013 18:49:31 GMT
On 6/21/13 11:18 AM, Uwe Schindler wrote:
> You may also be interested in this talk @ BerlinBuzzwords2013: http://intrafind.de/tl_files/documents/INTRAFIND_BerlinBuzzwords2013_The-Typed-Index.pdf
>
> Unfortunately the slides are not available.
>
> Uwe
>
I've been wondering why we seem to handle case- and 
diacritic-normalization (among other things, like stemming) using 
multiple fields when really it would be more compact to index normalized 
terms in the same position as their base term in a single field.  The 
missing piece of course is how to exclude the normalized terms when you 
want to.  That is, it would be great to have a single text field with terms 
reflecting a variety of different analysis options, plus the ability to 
search the terms selectively (by type) at query time, so that you could 
do (say) a case-sensitive, unstemmed query using the same field as a 
case-insensitive stemmed query, and even intermingle such query terms in 
a single query with a positional (NEAR, or phrase) relationship.  
Wouldn't that be nice?
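
To make the idea concrete, here is a toy sketch in plain Java. It is not Lucene API and not the approach from the talk: it is an in-memory model in which every token position holds one term per analysis type, and each query term names the type it wants to match. The names TypedIndexSketch, TermType, QueryTerm, exact() and lower() are all made up for illustration, and only case folding is modelled (stemming and diacritics would be further types).

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only: one "field" whose positions each carry several typed terms.
class TypedIndexSketch {
    // Hypothetical term types; a real index would need to store these somewhere.
    enum TermType { EXACT, LOWERCASE }

    // A query term: the text to look for and the type of term it must match.
    static final class QueryTerm {
        final String text;
        final TermType type;
        QueryTerm(String text, TermType type) { this.text = text; this.type = type; }
    }

    // Case-sensitive query term.
    static QueryTerm exact(String text) {
        return new QueryTerm(text, TermType.EXACT);
    }

    // Case-insensitive query term; the query text is folded like the index.
    static QueryTerm lower(String text) {
        return new QueryTerm(text.toLowerCase(Locale.ROOT), TermType.LOWERCASE);
    }

    // positions.get(i) maps each type to the term indexed at token position i.
    private final List<Map<TermType, String>> positions = new ArrayList<>();

    // Index every whitespace-separated token under all types at one position.
    void add(String text) {
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) continue;
            Map<TermType, String> variants = new EnumMap<>(TermType.class);
            variants.put(TermType.EXACT, token);
            variants.put(TermType.LOWERCASE, token.toLowerCase(Locale.ROOT));
            positions.add(variants);
        }
    }

    // Phrase match over adjacent positions; each term selects its own type.
    boolean containsPhrase(List<QueryTerm> phrase) {
        if (phrase.isEmpty()) return false;
        for (int start = 0; start + phrase.size() <= positions.size(); start++) {
            boolean all = true;
            for (int i = 0; i < phrase.size() && all; i++) {
                QueryTerm q = phrase.get(i);
                all = q.text.equals(positions.get(start + i).get(q.type));
            }
            if (all) return true;
        }
        return false;
    }
}
```

With "Apple pie recipe" indexed, the phrase exact("Apple") followed by lower("PIE") matches, while exact("apple") followed by lower("pie") does not, so case-sensitive and case-insensitive terms can be mixed in one positional query against the same field.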

It sounds like that might be the topic of that paper?  I would be 
interested in the proposed solution, but perhaps it is proprietary? I 
guess payloads are the only place such type information can be stored, 
although I'm fuzzy on that. I wonder if anyone has contributed such a 
thing to Lucene?

-Mike
