lucene-java-user mailing list archives

From Steven A Rowe <sar...@syr.edu>
Subject RE: Search results include results with excluded terms
Date Mon, 16 Aug 2010 16:29:14 GMT
Hi Christoph,

There could be several things going on, but it's difficult to tell without more information.
 

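Since excluded terms require a non-empty set from which to remove documents at the same boolean
clause level, you could try something like "-title:Datei* avl", or "+title:(*:* -Datei*) +avl".
In the nested form the group has to be required: with OR as the default operator, an optional
group would still let documents match through "avl" alone.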
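
To make the "same boolean clause level" point concrete, here is a toy model in plain Java. It is not Lucene code and it ignores scoring; it only assumes that one clause level works as "intersect the required clauses, or else union the optional ones, then remove the excluded ones". The class name, the document ids and the three sets are made up for illustration, and the "+title:(*:* -Datei*) +avl" form is my own variant with both clauses marked as required.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model of one boolean clause level (hypothetical, not the Lucene API).
final class ClauseLevelSketch {
    static final List<Set<Integer>> NONE = Collections.emptyList();

    // Hypothetical index: four documents, identified by number.
    static final Set<Integer> ALL = docs(1, 2, 3, 4);   // stands in for *:*
    static final Set<Integer> DATEI = docs(2, 4);       // title starts with "Datei"
    static final Set<Integer> AVL = docs(1, 2, 3);      // contents contain "avl"

    static Set<Integer> docs(Integer... ids) {
        return new TreeSet<>(Arrays.asList(ids));
    }

    // Required clauses are intersected; without any, optional clauses are
    // unioned. Excluded clauses are then removed from that set. With no
    // positive clause at all there is nothing to remove from.
    static Set<Integer> eval(List<Set<Integer>> must,
                             List<Set<Integer>> should,
                             List<Set<Integer>> mustNot) {
        Set<Integer> result = new TreeSet<>();
        if (!must.isEmpty()) {
            result.addAll(must.get(0));
            for (Set<Integer> m : must) result.retainAll(m);
        } else {
            for (Set<Integer> s : should) result.addAll(s);
        }
        for (Set<Integer> n : mustNot) result.removeAll(n);
        return result;
    }

    // title:(-Datei*) avl : the group has only an exclusion, so it is empty.
    static Set<Integer> original() {
        Set<Integer> group = eval(NONE, NONE, Arrays.asList(DATEI));
        return eval(NONE, Arrays.asList(group, AVL), NONE);
    }

    // -title:Datei* avl : the exclusion sits at the same level as "avl".
    static Set<Integer> sameLevel() {
        return eval(NONE, Arrays.asList(AVL), Arrays.asList(DATEI));
    }

    // +title:(*:* -Datei*) +avl : non-empty group, and both clauses required.
    static Set<Integer> nestedRequired() {
        Set<Integer> group = eval(NONE, Arrays.asList(ALL), Arrays.asList(DATEI));
        return eval(Arrays.asList(group, AVL), NONE, NONE);
    }

    public static void main(String[] args) {
        System.out.println(original());        // [1, 2, 3]  (doc 2 is still there)
        System.out.println(sameLevel());       // [1, 3]
        System.out.println(nestedRequired());  // [1, 3]
    }
}
```

In this model document 2, whose title starts with "Datei", only disappears once the exclusion has a non-empty set to subtract from and that set actually constrains the result.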

Another possible problem is case.  If you downcase indexed terms, "Datei*" will only match
them if the prefix is downcased as well, since no analysis is carried out on wildcard terms.
QueryParser has an instance method, setLowercaseExpandedTerms(), that turns automatic
pre-expansion query term downcasing on or off, so check that it is enabled:

<http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/queryParser/QueryParser.html#setLowercaseExpandedTerms%28boolean%29>
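
The case issue can be illustrated with another small plain-Java sketch. Again this is not Lucene code: the class, the method and the term list are hypothetical, and prefix expansion is reduced to a simple startsWith() over terms that an analyzer is assumed to have lowercased at index time. The boolean flag plays the role that setLowercaseExpandedTerms() plays on a QueryParser instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

// Toy prefix expansion against already-lowercased index terms (hypothetical).
final class PrefixCaseSketch {
    // Terms as a lowercasing analyzer might have indexed them (made up).
    static final List<String> INDEXED = Arrays.asList("datei", "dateien", "foo");

    // Returns the indexed terms that start with the prefix. The prefix is
    // compared as typed unless lowercasePrefix is set, because no analysis
    // is applied to it.
    static List<String> expand(Collection<String> indexedTerms,
                               String prefix,
                               boolean lowercasePrefix) {
        String p = lowercasePrefix ? prefix.toLowerCase(Locale.ROOT) : prefix;
        List<String> hits = new ArrayList<>();
        for (String term : indexedTerms) {
            if (term.startsWith(p)) hits.add(term);
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(expand(INDEXED, "Datei", false)); // []
        System.out.println(expand(INDEXED, "Datei", true));  // [datei, dateien]
    }
}
```

If the prefix expands to no terms, an exclusion built from it has nothing to exclude, which would look exactly like the exclusion being ignored.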

Steve

> -----Original Message-----
> From: Christoph Hermann [mailto:hermann@informatik.uni-freiburg.de]
> Sent: Monday, August 16, 2010 9:32 AM
> To: java-user@lucene.apache.org
> Subject: Search results include results with excluded terms
> 
> Hi,
> 
> I've built a local index of the German Wikipedia (works fine so far).
> 
> Now when I'm searching this index with Luke (or my own code) using a query
> like "title:(-Datei*) avl" I still get results with documents where the
> title contains "Datei:foo".
> 
> The title field is created like this:
> Field fieldTitle = new Field(Metadata.TITLE, title, Field.Store.YES, Field.Index.ANALYZED);
> 
> Can someone explain to me why I still get these results?
> 
> If I click on "explain" in Luke, it tells me that the score basically came
> from the contents field, where "avl" is included.
> 
> So the question is, how do I *exclude* documents? I.e. score the exclusion
> very low, so that these results won't appear at all?
> 
> regards
> Christoph
> 
> --
> Christoph Hermann
> Institut für Informatik
> Tel: +49 761-203-8171 Fax: +49 761-203-8162
> e-mail: hermann@informatik.uni-freiburg.de
