lucene-java-user mailing list archives

From: Stephane James Vaucher <vauch...@cirano.qc.ca>
Subject: Field boosting Was: Indexing multiple instances of the same field for each document
Date: Fri, 27 Feb 2004 23:26:55 GMT

Slightly off topic for this thread, but how does adding several fields 
with the same name interact with boosts? I've looked at the Javadoc and 
the FAQ, but I don't think this is a common use of the feature. Any insight?

E.g.:
Document doc = new Document();
Field f1 = Field.Keyword("fieldName", "foo");
f1.setBoost(1);
doc.add(f1);

Field f2 = Field.Keyword("fieldName", "bar");
f2.setBoost(2);
doc.add(f2);

Cheers,
sv

On Fri, 27 Feb 2004, Doug Cutting wrote:

> I think it's document.add().  Fields are pushed onto the front, rather 
> than added to the end.
> 
> Doug
> 
> Roy Klein wrote:
> > I think it's got something to do with Document.invertDocument().
> > 
> > When I reverse the words in the phrase, the other document matches the
> > phrase query.
> > 
> >     Roy
> > 
> >    
> > 
> > -----Original Message-----
> > From: Erik Hatcher [mailto:erik@ehatchersolutions.com] 
> > Sent: Friday, February 27, 2004 4:34 PM
> > To: Lucene Users List
> > Subject: Re: Indexing multiple instances of the same field for each
> > document
> > 
> > 
> > On Feb 27, 2004, at 4:10 PM, Roy Klein wrote:
> > 
> >>Hi Erik,
> >>
> >>While you might be right in this example (using Field.Keyword), I can 
> >>see how this would still be a problem in other cases. For instance, if
> >>I were adding more than one word at a time in the example I attached.
> > 
> > 
> > I concur that it appears to be a bug.  It is unlikely folks use Lucene 
> > like this too much though - there probably are not too many scenarios 
> > where combining things into a single String or Reader is a burden.
> > 
> > I'm interested to know where in the code this oddity occurs so I can 
> > understand it more.  I did a brief bit of troubleshooting but haven't 
> > figured it out yet.  Something in DocumentWriter I presume.
> > 
> > 	Erik
> > 
> > 
> > 
> > 
> >>    Roy
> >>
> >>
> >>-----Original Message-----
> >>From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> >>Sent: Friday, February 27, 2004 2:12 PM
> >>To: Lucene Users List
> >>Subject: Re: Indexing multiple instances of the same field for each 
> >>document
> >>
> >>
> >>Roy,
> >>
> >>On Feb 27, 2004, at 12:12 PM, Roy Klein wrote:
> >>
> >>>        Document doc = new Document();
> >>>        doc.add(Field.Text("contents", "the"));
> >>
> >>Changing these to Field.Keyword gets it to work.  I'm delving a little
> >>bit to understand why, but it seems if you are adding words 
> >>individually anyway you'd want them to be untokenized, right?
> >>
> >>	Erik
> >>
> >>
> >>
> >>>        doc.add(Field.Text("contents", "quick"));
> >>>        doc.add(Field.Text("contents", "brown"));
> >>>        doc.add(Field.Text("contents", "fox"));
> >>>        doc.add(Field.Text("contents", "jumped"));
> >>>        doc.add(Field.Text("contents", "over"));
> >>>        doc.add(Field.Text("contents", "the"));
> >>>        doc.add(Field.Text("contents", "lazy"));
> >>>        doc.add(Field.Text("contents", "dogs"));
> >>>        doc.add(Field.Keyword("docnumber", "1"));
> >>>        writer.addDocument(doc);
> >>>        doc = new Document();
> >>>        doc.add(Field.Text("contents", "the quick brown fox jumped over the lazy dogs"));
> >>>        doc.add(Field.Keyword("docnumber", "2"));
> >>>        writer.addDocument(doc);
> >>>        writer.close();
> >>>    }
> >>>
> >>>    public static void query(File indexDir) throws IOException
> >>>    {
> >>>        Query query = null;
> >>>        PhraseQuery pquery = new PhraseQuery();
> >>>        Hits hits = null;
> >>>
> >>>        try {
> >>>            query = QueryParser.parse("quick brown", "contents", new StandardAnalyzer());
> >>>        } catch (Exception qe) {System.out.println(qe.toString());}
> >>>        if (query == null) return;
> >>>        System.out.println("Query: " + query.toString());
> >>>        IndexReader reader = IndexReader.open(indexDir);
> >>>        IndexSearcher searcher = new IndexSearcher(reader);
> >>>
> >>>        hits = searcher.search(query);
> >>>        System.out.println("Hits: " + hits.length());
> >>>
> >>>        for (int i = 0; i < hits.length(); i++)
> >>>        {
> >>>            System.out.println( hits.doc(i).get("docnumber") + " ");
> >>>        }
> >>>
> >>>
> >>>        pquery.add(new Term("contents", "quick"));
> >>>        pquery.add(new Term("contents", "brown"));
> >>>        System.out.println("PQuery: " + pquery.toString());
> >>>        hits = searcher.search(pquery);
> >>>        System.out.println("Phrase Hits: " + hits.length());
> >>>        for (int i = 0; i < hits.length(); i++)
> >>>        {
> >>>            System.out.println( hits.doc(i).get("docnumber") + " ");
> >>>        }
> >>>
> >>>        searcher.close();
> >>>        reader.close();
> >>>
> >>>    }
> >>>    public static void main(String[] args) throws Exception {
> >>>        if (args.length != 1) {
> >>>            throw new Exception("Usage: " + test.class.getName() + " <index dir>");
> >>>        }
> >>>        File indexDir = new File(args[0]);
> >>>        test(indexDir);
> >>>        query(indexDir);
> >>>    }
> >>>}
> >>>
> >>>---------------------------------------------------------------------
> >>>My results:
> >>>Query: contents:quick contents:brown
> >>>Hits: 2
> >>>1
> >>>2
> >>>PQuery:
> >>>contents:"quick brown"
> >>>Phrase Hits: 1
> >>>2
> >>>
> >>>
> >>>---------------------------------------------------------------------
> >>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >>
> > 
> > 
> > 
> > 
> 
> 
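
Doug's explanation quoted above (fields pushed onto the front rather than 
added to the end) can be sketched with a toy model. This is plain Java, not 
Lucene code: FrontPushSketch, addAll and phraseMatches are made-up names, 
and the assumption that fields are later walked front to back is taken from 
Doug's remark, not checked against the Lucene source. Under that assumption 
the words end up in reverse order, which would fit Roy's observation that 
only the reversed phrase matches.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Toy model only, NOT Lucene code. It mimics a document whose add()
// pushes each new field onto the front of its field list.
class FrontPushSketch {

    // Hypothetical stand-in for Document.add(): the newest field goes first.
    static LinkedList<String> addAll(String... values) {
        LinkedList<String> fields = new LinkedList<>();
        for (String v : values) {
            fields.addFirst(v);
        }
        return fields;
    }

    // Assumption: token positions follow the order in which fields are walked,
    // so a two-word phrase matches only if the words are adjacent in that order.
    static boolean phraseMatches(List<String> tokens, String first, String second) {
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(first) && tokens.get(i + 1).equals(second)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Document 1: one word per field, as in Roy's first document.
        List<String> doc1 = addAll("the", "quick", "brown", "fox");
        // Document 2: all words in one field, so their order is untouched.
        List<String> doc2 = Arrays.asList("the", "quick", "brown", "fox");

        System.out.println("doc1 order: " + doc1);
        System.out.println("doc1 quick brown: " + phraseMatches(doc1, "quick", "brown"));
        System.out.println("doc1 brown quick: " + phraseMatches(doc1, "brown", "quick"));
        System.out.println("doc2 quick brown: " + phraseMatches(doc2, "quick", "brown"));
    }
}
```

In this model document 1 matches "brown quick" but not "quick brown", while 
document 2 matches "quick brown", which mirrors the phrase hits in Roy's 
results.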



