lucene-java-user mailing list archives

From Mufaddal Khumri
Subject Re: exact match ..
Date Mon, 20 Feb 2006 18:22:59 GMT

I just realized that the various fields I have are part of the same 
document. But in order to leverage the KeywordAnalyzer, I would now 
have to maintain two sets of documents:
One document with the fields: title, content <--- analyzed by custom analyzer
Other document with the fields: categoryNames <--- analyzed by keyword analyzer

Is there a way to have a single document object in which some fields 
are analyzed by my custom analyzer and the one field "categoryNames" 
is analyzed by the keyword analyzer?
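
To make the question concrete, here is a minimal sketch of the per-field 
idea in plain Java. It does not use Lucene's classes; the class name 
PerFieldSketch, its methods, and the whitespace-splitting stand-in for my 
custom analyzer are all assumptions made up for illustration. It only 
models the behaviour being asked for: one "document", with the analyzer 
chosen by field name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of per-field analysis; not Lucene's API.
class PerFieldSketch {

    // Keyword-style analysis: the whole field value becomes a single token.
    static List<String> keywordAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(text);
        return tokens;
    }

    // Stand-in for a custom analyzer (assumption): lowercase, then split
    // on whitespace. No stemming is modelled here.
    static List<String> tokenizingAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Fields listed here override the default analyzer.
    private static final Map<String, Function<String, List<String>>> OVERRIDES =
            new HashMap<>();

    static {
        OVERRIDES.put("categoryNames", PerFieldSketch::keywordAnalyze);
    }

    // Pick the analyzer by field name, falling back to the tokenizing one.
    static List<String> analyze(String field, String text) {
        return OVERRIDES
                .getOrDefault(field, PerFieldSketch::tokenizingAnalyze)
                .apply(text);
    }

    public static void main(String[] args) {
        System.out.println(analyze("title", "Digital Cameras"));         // [digital, cameras]
        System.out.println(analyze("categoryNames", "digital cameras")); // [digital cameras]
    }
}
```

Under these assumptions, a categoryNames value of "digital camera 
batteries" is kept as one token, which is not equal to the token 
"digital cameras", so an exact-term comparison on that field would not 
match it, while title and content are still split into separate words.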


Mufaddal Khumri wrote:

> Hi Steve,
> If I understand you right, I could use something like the 
> KeywordAnalyzer to tokenize the entire stream as a single token and 
> store that in the index. I could definitely use the keyword analyzer 
> while indexing this particular field "categoryNames".
> Now my question is how to search and boost this, since it is part 
> of a bigger boolean query in my case.
> My typical query actually looks like:
> +(+content:digit +content:camera) +entity:product +(title:"digit 
> camera"~2^40.0 ((title:digit title:camera)^10.0) content:"digit 
> camera"~2^20.0 (content:digit content:camera) categoryNames:"digit 
> camera"^80.0)
> As you can see, I was trying to do a phrase query on the categoryNames 
> field and boosting it by 80.0.
> Also, I am using the Porter stemming filter to stem while searching (I 
> do this while indexing as well). If I go with the KeywordAnalyzer 
> approach, I can index the categoryNames field using this analyzer.
> Would I be using the QueryParser to create my query and specify the 
> keyword analyzer to it while searching on categoryNames ? (and then 
> make that query part of my global boolean query?)
> -Thanks.
> Steven Rowe wrote:
>> Mufaddal Khumri wrote:
>>> Let's say I do this while indexing:
>>> doc.add(Field.Text("categoryNames", categoryNames));
>>> Now while searching categoryNames, I do a search for "digital 
>>> cameras". I only want to match documents that have exactly the 
>>> phrase "digital cameras" in the categoryNames field. I do not want 
>>> documents containing "digital camera batteries" to be part of the 
>>> result.
>>> What's the best way to accomplish this?
>> Hi Mufaddal,
>> One way to do this is to use the KeywordAnalyzer (in the Lucene 
>> Subversion trunk, but not in v1.4.3; will be in forthcoming v1.9) for 
>> the "categoryNames" field.  This analyzer does not tokenize field 
>> contents, so "digital cameras" would be a single token, and the only 
>> thing that would match it would be the exact same single token.  Be 
>> careful when you search to construct the search tokens similarly.
>> If you have other fields you want to search, and you want to tokenize 
>> their contents when you index them, you could use the 
>> PerFieldAnalyzerWrapper, so that the KeywordAnalyzer is only used for 
>> the "categoryNames" field.
>> Steve
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
