lucene-java-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Indexing multiple instances of the same field for each document
Date Fri, 27 Feb 2004 23:17:45 GMT
I think it's document.add().  Fields are pushed onto the front, rather 
than added to the end.
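
To illustrate the effect, here is a minimal sketch. It assumes the fields are kept in a singly linked list and that each add() pushes onto the head. The Node class and both helper methods are hypothetical stand-ins for illustration only, not Lucene code. Walking such a list from the head yields the values in reverse insertion order, which would explain why "quick brown" fails as a phrase while the reversed phrase matches.

```java
import java.util.ArrayList;
import java.util.List;

public class PrependOrderSketch {

    // Hypothetical stand-in for a singly linked field list; not Lucene's class.
    static final class Node {
        final String value;
        final Node next;

        Node(String value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Pushes each value onto the front of the list, then walks the list from
    // the head and returns the values in the order a consumer would see them.
    static List<String> orderSeenAfterPrepending(String... values) {
        Node head = null;
        for (String v : values) {
            head = new Node(v, head); // the new node becomes the head
        }
        List<String> seen = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) {
            seen.add(n.value);
        }
        return seen;
    }

    // True if `second` appears at the position immediately after `first`.
    static boolean adjacentInOrder(List<String> tokens, String first, String second) {
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(first) && tokens.get(i + 1).equals(second)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> seen = orderSeenAfterPrepending("the", "quick", "brown", "fox");
        System.out.println(seen);                                    // [fox, brown, quick, the]
        System.out.println(adjacentInOrder(seen, "quick", "brown")); // false
        System.out.println(adjacentInOrder(seen, "brown", "quick")); // true
    }
}
```

If that assumption holds, adding the whole sentence as a single field value sidesteps the problem, since the token order inside one value is not affected by the order of the field list.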

Doug

Roy Klein wrote:
> I think it's got something to do with Document.invertDocument().
> 
> When I reverse the words in the phrase, the other document matches the
> phrase query.
> 
>     Roy
> 
>    
> 
> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com] 
> Sent: Friday, February 27, 2004 4:34 PM
> To: Lucene Users List
> Subject: Re: Indexing multiple instances of the same field for each
> document
> 
> 
> On Feb 27, 2004, at 4:10 PM, Roy Klein wrote:
> 
>>Hi Erik,
>>
>>While you might be right in this example (using Field.Keyword), I can 
>>see how this would still be a problem in other cases. For instance, if
>>I were adding more than one word at a time in the example I attached.
> 
> I concur that it appears to be a bug.  It is unlikely folks use Lucene 
> like this too much though - there probably are not too many scenarios 
> where combining things into a single String or Reader is a burden.
> 
> I'm interested to know where in the code this oddity occurs so I can 
> understand it more.  I did a brief bit of troubleshooting but haven't 
> figured it out yet.  Something in DocumentWriter I presume.
> 
> 	Erik
> 
> 
> 
> 
>>    Roy
>>
>>
>>-----Original Message-----
>>From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
>>Sent: Friday, February 27, 2004 2:12 PM
>>To: Lucene Users List
>>Subject: Re: Indexing multiple instances of the same field for each 
>>document
>>
>>
>>Roy,
>>
>>On Feb 27, 2004, at 12:12 PM, Roy Klein wrote:
>>
>>>        Document doc = new Document();
>>>        doc.add(Field.Text("contents", "the"));
>>
>>Changing these to Field.Keyword gets it to work.  I'm delving a little
>>bit to understand why, but it seems if you are adding words 
>>individually anyway you'd want them to be untokenized, right?
>>
>>	Erik
>>
>>
>>
>>>        doc.add(Field.Text("contents", "quick"));
>>>        doc.add(Field.Text("contents", "brown"));
>>>        doc.add(Field.Text("contents", "fox"));
>>>        doc.add(Field.Text("contents", "jumped"));
>>>        doc.add(Field.Text("contents", "over"));
>>>        doc.add(Field.Text("contents", "the"));
>>>        doc.add(Field.Text("contents", "lazy"));
>>>        doc.add(Field.Text("contents", "dogs"));
>>>        doc.add(Field.Keyword("docnumber", "1"));
>>>        writer.addDocument(doc);
>>>        doc = new Document();
>>>        doc.add(Field.Text("contents",
>>>            "the quick brown fox jumped over the lazy dogs"));
>>>        doc.add(Field.Keyword("docnumber", "2"));
>>>        writer.addDocument(doc);
>>>        writer.close();
>>>    }
>>>
>>>    public static void query(File indexDir) throws IOException
>>>    {
>>>        Query query = null;
>>>        PhraseQuery pquery = new PhraseQuery();
>>>        Hits hits = null;
>>>
>>>        try {
>>>            query = QueryParser.parse("quick brown", "contents",
>>>                new StandardAnalyzer());
>>>        } catch (Exception qe) {System.out.println(qe.toString());}
>>>        if (query == null) return;
>>>        System.out.println("Query: " + query.toString());
>>>        IndexReader reader = IndexReader.open(indexDir);
>>>        IndexSearcher searcher = new IndexSearcher(reader);
>>>
>>>        hits = searcher.search(query);
>>>        System.out.println("Hits: " + hits.length());
>>>
>>>        for (int i = 0; i < hits.length(); i++)
>>>        {
>>>            System.out.println( hits.doc(i).get("docnumber") + " ");
>>>        }
>>>
>>>
>>>        pquery.add(new Term("contents", "quick"));
>>>        pquery.add(new Term("contents", "brown"));
>>>        System.out.println("PQuery: " + pquery.toString());
>>>        hits = searcher.search(pquery);
>>>        System.out.println("Phrase Hits: " + hits.length());
>>>        for (int i = 0; i < hits.length(); i++)
>>>        {
>>>            System.out.println( hits.doc(i).get("docnumber") + " ");
>>>        }
>>>
>>>        searcher.close();
>>>        reader.close();
>>>
>>>    }
>>>    public static void main(String[] args) throws Exception {
>>>        if (args.length != 1) {
>>>            throw new Exception("Usage: " + test.class.getName()
>>>                + " <index dir>");
>>>        }
>>>        File indexDir = new File(args[0]);
>>>        test(indexDir);
>>>        query(indexDir);
>>>    }
>>>}
>>>
>>>---------------------------------------------------------------------
>>>My results:
>>>Query: contents:quick contents:brown
>>>Hits: 2
>>>1
>>>2
>>>PQuery: contents:"quick brown"
>>>Phrase Hits: 1
>>>2
>>>
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
> 
> 
> 


