lucene-java-user mailing list archives

From Dmitry Serebrennikov <dmit...@earthlink.net>
Subject Re: multiple fields to be indexed
Date Thu, 17 Jun 2004 17:36:48 GMT
Jitender,

I hope you don't mind if I move this thread to the user list.

A "term" in Lucene specifies not only the text string but also the field
in which it occurs. In the query parser, terms are written as <fieldname>
":" <text>, so "name:blue" means the string 'blue' in field 'name',
whereas "desc:blue" means the same string in the 'desc' field. So
if you type the following into the query parser, you will
get a query that matches occurrences of 'blue' in either field:
    name:blue desc:blue
If you want to match only documents that have 'blue' in both fields, the
query would look like this:
    +name:blue +desc:blue

There are other notations that may also be useful. A good explanation of 
the query parser can be found here:
http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

In the API, the same thing can be achieved with two TermQuery objects 
joined together with a BooleanQuery. For example:
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.TermQuery;

    TermQuery tq1 = new TermQuery(new Term("name", "blue"));
    TermQuery tq2 = new TermQuery(new Term("desc", "blue"));
    BooleanQuery q = new BooleanQuery();
    // add(query, required, prohibited): neither flag set means "optional"
    q.add(tq1, false, false);
    q.add(tq2, false, false);

    // At this point, query q will match documents that have the word
    // 'blue' in either the 'name' or the 'desc' field.
    // So this is essentially an "or" query. If you want an "and" query,
    // change the last two lines as follows:
    //   q.add(tq1, true, false);
    //   q.add(tq2, true, false);


Does this make more sense now?
The boolean logic at the API level is a bit unusual if you are used to
relational databases. In this respect Lucene is more like Google than
an SQL database. A boolean query component can be required,
prohibited, or neither. If it is required (+ prefix in the query
parser), it MUST match for the entire query to match. If it is
prohibited (- prefix in the query parser), it MUST NOT match for the
query to match. If it is neither (no prefix in the query parser), it
does not have to match, provided some other component of the query
does. This last kind may seem useless, but when such a component does
match, it boosts the document's score. So documents that match this
component of the query will be returned ahead of those that do not.
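
To make the three rules concrete, here is a small self-contained Java
sketch. It is a toy model, not Lucene code: ClauseDemo, Clause, Kind,
doc() and score() are names made up for illustration, a "document" is
just a map from field name to a set of words, and the "score" is simply
the number of non-prohibited components that matched, which is only a
crude stand-in for real scoring.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of required / prohibited / optional query components.
// This is NOT Lucene code; every name here is hypothetical.
public class ClauseDemo {

    enum Kind { REQUIRED, PROHIBITED, OPTIONAL }

    // One query component: a field, a text string, and its kind.
    static final class Clause {
        final String field;
        final String text;
        final Kind kind;

        Clause(String field, String text, Kind kind) {
            this.field = field;
            this.text = text;
            this.kind = kind;
        }

        // True if the document has this text in this field.
        boolean hits(Map<String, Set<String>> doc) {
            Set<String> words = doc.get(field);
            return words != null && words.contains(text);
        }
    }

    // Builds a two-field "document" by splitting each value on whitespace.
    static Map<String, Set<String>> doc(String name, String desc) {
        Map<String, Set<String>> d = new HashMap<>();
        d.put("name", new HashSet<>(Arrays.asList(name.split("\\s+"))));
        d.put("desc", new HashSet<>(Arrays.asList(desc.split("\\s+"))));
        return d;
    }

    // Returns -1 if the document does not match the query; otherwise the
    // number of required or optional components that matched.
    static int score(List<Clause> query, Map<String, Set<String>> doc) {
        int matched = 0;
        for (Clause c : query) {
            boolean hit = c.hits(doc);
            if (c.kind == Kind.PROHIBITED && hit) return -1;  // MUST NOT match
            if (c.kind == Kind.REQUIRED && !hit) return -1;   // MUST match
            if (c.kind != Kind.PROHIBITED && hit) matched++;  // counts toward score
        }
        // At least one non-prohibited component has to match.
        return matched > 0 ? matched : -1;
    }

    public static void main(String[] args) {
        // Like "name:blue desc:blue" in the query parser
        List<Clause> either = Arrays.asList(
            new Clause("name", "blue", Kind.OPTIONAL),
            new Clause("desc", "blue", Kind.OPTIONAL));
        // Like "+name:blue +desc:blue" in the query parser
        List<Clause> both = Arrays.asList(
            new Clause("name", "blue", Kind.REQUIRED),
            new Clause("desc", "blue", Kind.REQUIRED));

        Map<String, Set<String>> d1 = doc("blue car", "fast");
        Map<String, Set<String>> d2 = doc("blue car", "blue paint");

        System.out.println(score(either, d1)); // 1
        System.out.println(score(either, d2)); // 2
        System.out.println(score(both, d1));   // -1
        System.out.println(score(both, d2));   // 2
    }
}
```

With the "either" query both documents match, but the one that has
'blue' in both fields gets the higher number, so it would be listed
first. With the "both" query only that document matches at all.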

Good luck!
Dmitry


jitender ahuja wrote:

>Hi,
>    Can you please clarify a bit more how to use the BooleanQuery class? I am
>still clueless.
>As far as I can gather, it deals with multiple query terms and not with
>searching for a given (possibly single) query term in multiple (indexed) fields.
>
>Regards,
>Jitender
>----- Original Message ----- 
>From: "Dmitry Serebrennikov" <dmitrys@earthlink.net>
>To: "Lucene Developers List" <lucene-dev@jakarta.apache.org>
>Sent: Monday, June 14, 2004 11:02 PM
>Subject: Re: multiple fields to be indexed
>
>>jitender ahuja wrote:
>>
>>>Hi all,
>>>
>>>I am trying to do a search on multiple fields of which some are indexed.
>>>Now, a query can be posed to search in the indexed fields but the Hits class
>>>object can have only one query object and the query class similarly can have
>>>only one field on which the query is performed.
>>
>>Take a look at BooleanQuery. That will allow you to combine multiple
>>TermQueries, such that you can create a single query on different
>>fields. That should do the trick.
>>
>>Good luck!
>>Dmitry.
>>
>>>Now, to have multiple queries I need to obtain multiple Hits class
>>>objects that are then merged together in one array. But I do not know how
>>>to maintain the scorings obtained by different Hits class objects. If
>>>somebody has tried such a thing or a different mechanism to feed a query to
>>>different query objects, then please respond.
>>>
>>>Actually I have not tried the above option of merging the results in a
>>>final array but I feel that it should work.
>>>
>>>Regards
>>>Jitender
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

