lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Field boosting Was: Indexing multiple instances of the same field for each document
Date Sat, 28 Feb 2004 03:47:51 GMT
On Feb 27, 2004, at 6:26 PM, Stephane James Vaucher wrote:
> Slightly off topic to this thread, but how would adding different fields
> with the same name deal with boosts? I've looked at the javadoc and FAQ,
> but I think it's not a common use of this feature, any insight?

There is only one boost per field name.  Interestingly, though, the
effective boost is the product of the boosts of every field added under
that name.  So, in your example below, the boost of "fieldName" is
1 * 2 = 2.
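
The arithmetic can be sketched in plain Java.  This is only a model of
the multiplication described above, not Lucene code: the class
FieldBoostSketch and its methods are hypothetical names made up for
illustration, and it assumes a field with no explicit boost counts as 1.0.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of "one boost per field name"; it does not call Lucene.
final class FieldBoostSketch {
    // Effective boost per field name, in insertion order.
    private final Map<String, Float> boostByName = new LinkedHashMap<>();

    // Record one field instance; instances sharing a name multiply their boosts.
    void add(String fieldName, float boost) {
        boostByName.merge(fieldName, boost, (a, b) -> a * b);
    }

    // Effective boost for a field name; 1.0f (assumed neutral) if never added.
    float effectiveBoost(String fieldName) {
        return boostByName.getOrDefault(fieldName, 1.0f);
    }

    public static void main(String[] args) {
        FieldBoostSketch doc = new FieldBoostSketch();
        doc.add("fieldName", 1.0f); // f1 in the quoted example
        doc.add("fieldName", 2.0f); // f2 in the quoted example
        System.out.println(doc.effectiveBoost("fieldName")); // product of 1.0 and 2.0
    }
}
```

Under this model, a third field added under the same name with a boost
of 3 would bring the combined boost to 6, so the boosts compound rather
than the last one winning.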

	Erik

>
> E.G.
> Document doc = new Document();
> Field f1 = Field.Keyword("fieldName", "foo");
> f1.setBoost(1);
> doc.add(f1);
>
> Field f2 = Field.Keyword("fieldName", "bar");
> f2.setBoost(2);
> doc.add(f2);
>
> Cheers,
> sv
>
> On Fri, 27 Feb 2004, Doug Cutting wrote:
>
>> I think it's document.add().  Fields are pushed onto the front, rather
>> than added to the end.
>>
>> Doug
>>
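
To illustrate the point about add() pushing fields onto the front, here
is a plain-Java model.  It is not Lucene code: PrependOrderSketch and
its methods are hypothetical, and it assumes token positions simply
follow the order in which the stored fields are walked.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Plain-Java model of prepend-on-add; it does not call Lucene.
final class PrependOrderSketch {
    // Each add() pushes the value onto the front of the list.
    private final LinkedList<String> fields = new LinkedList<>();

    void add(String value) {
        fields.addFirst(value);
    }

    // Token order seen when the fields are walked front to back.
    List<String> indexedOrder() {
        return new ArrayList<>(fields);
    }

    // True if `second` sits at the position immediately after `first`.
    boolean matchesPhrase(String first, String second) {
        List<String> tokens = indexedOrder();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(first) && tokens.get(i + 1).equals(second)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PrependOrderSketch doc = new PrependOrderSketch();
        for (String w : new String[] {"the", "quick", "brown", "fox"}) {
            doc.add(w);
        }
        System.out.println(doc.indexedOrder());                  // reversed word order
        System.out.println(doc.matchesPhrase("quick", "brown")); // original phrase
        System.out.println(doc.matchesPhrase("brown", "quick")); // reversed phrase
    }
}
```

In this model only the reversed phrase matches, which is consistent
with Roy's observation that reversing the words makes the other
document match.
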
>> Roy Klein wrote:
>>> I think it's got something to do with Document.invertDocument().
>>>
>>> When I reverse the words in the phrase, the other document matches the
>>> phrase query.
>>>
>>>     Roy
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
>>> Sent: Friday, February 27, 2004 4:34 PM
>>> To: Lucene Users List
>>> Subject: Re: Indexing multiple instances of the same field for each
>>> document
>>>
>>>
>>> On Feb 27, 2004, at 4:10 PM, Roy Klein wrote:
>>>
>>>> Hi Erik,
>>>>
>>>> While you might be right in this example (using Field.Keyword), I can
>>>> see how this would still be a problem in other cases. For instance, if
>>>> I were adding more than one word at a time in the example I attached.
>>>
>>> I concur that it appears to be a bug.  It is unlikely folks use Lucene
>>> like this too much though - there probably are not too many scenarios
>>> where combining things into a single String or Reader is a burden.
>>>
>>> I'm interested to know where in the code this oddity occurs so I can
>>> understand it more.  I did a brief bit of troubleshooting but haven't
>>> figured it out yet.  Something in DocumentWriter I presume.
>>>
>>> 	Erik
>>>
>>>
>>>
>>>
>>>>    Roy
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
>>>> Sent: Friday, February 27, 2004 2:12 PM
>>>> To: Lucene Users List
>>>> Subject: Re: Indexing multiple instances of the same field for each
>>>> document
>>>>
>>>>
>>>> Roy,
>>>>
>>>> On Feb 27, 2004, at 12:12 PM, Roy Klein wrote:
>>>>
>>>>>        Document doc = new Document();
>>>>>        doc.add(Field.Text("contents", "the"));
>>>>
>>>> Changing these to Field.Keyword gets it to work.  I'm delving a little
>>>> bit to understand why, but it seems if you are adding words
>>>> individually anyway you'd want them to be untokenized, right?
>>>>
>>>> 	Erik
>>>>
>>>>
>>>>
>>>>>        doc.add(Field.Text("contents", "quick"));
>>>>>        doc.add(Field.Text("contents", "brown"));
>>>>>        doc.add(Field.Text("contents", "fox"));
>>>>>        doc.add(Field.Text("contents", "jumped"));
>>>>>        doc.add(Field.Text("contents", "over"));
>>>>>        doc.add(Field.Text("contents", "the"));
>>>>>        doc.add(Field.Text("contents", "lazy"));
>>>>>        doc.add(Field.Text("contents", "dogs"));
>>>>>        doc.add(Field.Keyword("docnumber", "1"));
>>>>>        writer.addDocument(doc);
>>>>>        doc = new Document();
>>>>>        doc.add(Field.Text("contents", "the quick brown fox jumped over the lazy dogs"));
>>>>>        doc.add(Field.Keyword("docnumber", "2"));
>>>>>        writer.addDocument(doc);
>>>>>        writer.close();
>>>>>    }
>>>>>
>>>>>    public static void query(File indexDir) throws IOException
>>>>>    {
>>>>>        Query query = null;
>>>>>        PhraseQuery pquery = new PhraseQuery();
>>>>>        Hits hits = null;
>>>>>
>>>>>        try {
>>>>>            query = QueryParser.parse("quick brown", "contents", new StandardAnalyzer());
>>>>>        } catch (Exception qe) {System.out.println(qe.toString());}
>>>>>        if (query == null) return;
>>>>>        System.out.println("Query: " + query.toString());
>>>>>        IndexReader reader = IndexReader.open(indexDir);
>>>>>        IndexSearcher searcher = new IndexSearcher(reader);
>>>>>
>>>>>        hits = searcher.search(query);
>>>>>        System.out.println("Hits: " + hits.length());
>>>>>
>>>>>        for (int i = 0; i < hits.length(); i++)
>>>>>        {
>>>>>            System.out.println( hits.doc(i).get("docnumber") + " ");
>>>>>        }
>>>>>
>>>>>
>>>>>        pquery.add(new Term("contents", "quick"));
>>>>>        pquery.add(new Term("contents", "brown"));
>>>>>        System.out.println("PQuery: " + pquery.toString());
>>>>>        hits = searcher.search(pquery);
>>>>>        System.out.println("Phrase Hits: " + hits.length());
>>>>>        for (int i = 0; i < hits.length(); i++)
>>>>>        {
>>>>>            System.out.println( hits.doc(i).get("docnumber") + " ");
>>>>>        }
>>>>>
>>>>>        searcher.close();
>>>>>        reader.close();
>>>>>
>>>>>    }
>>>>>    public static void main(String[] args) throws Exception {
>>>>>        if (args.length != 1) {
>>>>>            throw new Exception("Usage: " + test.class.getName() + " <index dir>");
>>>>>        }
>>>>>        File indexDir = new File(args[0]);
>>>>>        test(indexDir);
>>>>>        query(indexDir);
>>>>>    }
>>>>> }
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> My results:
>>>>> Query: contents:quick contents:brown
>>>>> Hits: 2
>>>>> 1
>>>>> 2
>>>>> PQuery: contents:"quick brown"
>>>>> Phrase Hits: 1
>>>>> 2
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>>>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org