lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: positional token info
Date Wed, 22 Oct 2003 09:05:46 GMT
On Tuesday, October 21, 2003, at 07:31  PM, Otis Gospodnetic wrote:
> So "phone boy" would match documents containing "phone the boy"?  That
> doesn't sound right to me, as it assumes what the user is trying to do.

That is correct: currently a match would be found.  Here's a little
test case I'm working with:

     Directory directory = new RAMDirectory();
     IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), true);
     Document doc = new Document();
     doc.add(Field.Text("contents", "The quick brown fox jumped over the lazy dogs"));
     writer.addDocument(doc);
     writer.close();

     IndexSearcher searcher = new IndexSearcher(directory);
     QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
     Query query = parser.parse("\"over lazy\"");

     Hits hits = searcher.search(query);
     assertEquals(1, hits.length());

which currently passes, although I don't think it should.

>  Wouldn't it be better to allow the user to decide what he wants?
> (i.e. "phone boy" returns documents with that _exact_ phrase.  "phone
> boy"~2 also returns documents containing "phone the boy").

I concur.  StopFilter just removes terms; it does not adjust the
position of the term that follows to account for the missing stop
words.
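
To make the position issue concrete, here is a rough standalone sketch.
It is plain Java, not Lucene code: the class StopWordPositions, its
methods, the preserveGaps flag and the three-word stop list are all
made up for illustration. It only shows how recording a gap for each
removed stop word would change exact phrase matching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordPositions {
    // Hypothetical stop list, for this sketch only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an"));

    /** A kept term together with the position recorded for it. */
    static final class Token {
        final String term;
        final int position;

        Token(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /**
     * Splits on whitespace, lowercases, and drops stop words.
     * When preserveGaps is true, each dropped stop word still consumes
     * a position; when false, the following term closes the gap.
     */
    public static List<Token> analyze(String text, boolean preserveGaps) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        for (String word : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            if (STOP_WORDS.contains(word)) {
                if (preserveGaps) {
                    position++; // leave a hole where the stop word was
                }
                continue;
            }
            tokens.add(new Token(word, position));
            position++;
        }
        return tokens;
    }

    private static boolean hasTokenAt(List<Token> tokens, String term, int position) {
        for (Token t : tokens) {
            if (t.position == position && t.term.equals(term)) {
                return true;
            }
        }
        return false;
    }

    /** Exact (zero slop) phrase match using the recorded positions. */
    public static boolean exactPhraseMatch(String doc, String phrase, boolean preserveGaps) {
        List<Token> docTokens = analyze(doc, preserveGaps);
        List<Token> phraseTokens = analyze(phrase, preserveGaps);
        if (phraseTokens.isEmpty()) {
            return false;
        }
        Token first = phraseTokens.get(0);
        for (Token start : docTokens) {
            if (!start.term.equals(first.term)) {
                continue;
            }
            boolean allFound = true;
            for (Token p : phraseTokens) {
                int expected = start.position + (p.position - first.position);
                if (!hasTokenAt(docTokens, p.term, expected)) {
                    allFound = false;
                    break;
                }
            }
            if (allFound) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "The quick brown fox jumped over the lazy dogs";
        System.out.println(exactPhraseMatch(doc, "over lazy", false));
        System.out.println(exactPhraseMatch(doc, "over lazy", true));
        System.out.println(exactPhraseMatch(doc, "over the lazy", true));
    }
}
```

With the gaps closed up, "over lazy" matches, which is the behavior
the test case above shows. With the gaps kept, "over lazy" no longer
matches as an exact phrase, while "over the lazy" still does.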

	Erik

