lucene-java-user mailing list archives

From "Vanlerberghe, Luc" <>
Subject RE: Document visible by Term, but not search
Date Thu, 25 Aug 2005 08:27:30 GMT
Is your Analyzer aware that that particular field does not need to be
tokenized?

During indexing, if a field is passed with tokenize=false, the analyzer
won't be called, so the string will be stored as-is.

During searching, the queryparser doesn't know which fields should be
tokenized or not and passes them all to your analyzer.
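
To make the mismatch concrete, here is a small self-contained sketch in
plain Java. It does not use Lucene at all: the map-based index, the toy
analyzer (lowercase, split on non-alphanumerics) and the value "AB-727680"
are all assumptions made up for illustration. It only shows why a term that
was indexed as-is can be missed once the query text has been analyzed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class KeywordMismatchDemo {
    // Toy inverted index (hypothetical): term -> ids of documents containing it.
    static final Map<String, Set<Integer>> index = new HashMap<>();

    // Toy analyzer (an assumption, not a Lucene class):
    // lowercases the text and splits it on non-alphanumeric characters.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Untokenized field: the analyzer is bypassed and the whole
    // value is indexed as a single term.
    static void indexKeyword(int docId, String value) {
        index.computeIfAbsent(value, k -> new HashSet<>()).add(docId);
    }

    // Query side: the text always goes through the analyzer,
    // and every resulting token must match.
    static Set<Integer> searchAnalyzed(String queryText) {
        Set<Integer> hits = new HashSet<>();
        boolean first = true;
        for (String token : analyze(queryText)) {
            Set<Integer> docs =
                index.getOrDefault(token, Collections.<Integer>emptySet());
            if (first) {
                hits.addAll(docs);
                first = false;
            } else {
                hits.retainAll(docs);
            }
        }
        return hits;
    }

    // Exact term lookup with no analysis, like browsing by term.
    static Set<Integer> searchExactTerm(String term) {
        return index.getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        indexKeyword(1, "AB-727680");
        System.out.println(searchExactTerm("AB-727680")); // prints [1]
        System.out.println(searchAnalyzed("AB-727680"));  // prints []
    }
}
```

The exact lookup finds the document because the stored term is the full
string. The analyzed search looks for "ab" and "727680", neither of which
exists as a term in the toy index.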

Your analyzer should return a KeywordTokenizer when asked for a
TokenStream for that field. It passes the entire string as one token.

It's in the contrib area on svn:


Something like:
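import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
// KeywordTokenizer: import it from wherever it lives in your Lucene version.

public class MyAnalyzer extends Analyzer {
  public TokenStream tokenStream(String fieldName,
                                 final Reader reader) {
    // equals(), not ==: names from the query parser may not be interned.
    if ("myKeywordField".equals(fieldName)) {
      return new KeywordTokenizer(reader);
    } else {
      // original analyzer code... (must return a TokenStream)
    }
  }
}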

That should also solve the issue in Luke...
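
A note on comparing field names in an analyzer like the one above:
reference comparison with == only works when both strings are the same
interned instance, whereas equals() compares contents. The plain-Java check
below (no Lucene involved; the class and method names are made up for
illustration) shows the difference for a name cut out of a query string at
runtime. Whether a given Lucene version hands the analyzer an interned
field name is something I have not verified, so treat that part as an
assumption.

```java
public class FieldNameCompareDemo {
    static final String KEYWORD_FIELD = "myKeywordField";

    // Reference comparison: true only if both are the same String object.
    static boolean matchesByReference(String fieldName) {
        return fieldName == KEYWORD_FIELD;
    }

    // Content comparison: true whenever the characters are equal.
    static boolean matchesByEquals(String fieldName) {
        return KEYWORD_FIELD.equals(fieldName);
    }

    public static void main(String[] args) {
        // A field name built at runtime, e.g. parsed out of "field:value".
        String query = "myKeywordField:727680";
        String parsed = query.substring(0, query.indexOf(':'));

        System.out.println(matchesByReference(parsed));          // prints false
        System.out.println(matchesByEquals(parsed));             // prints true
        System.out.println(matchesByReference(parsed.intern())); // prints true
    }
}
```

So == is only safe if every caller interns the name first; equals() works
in both cases at negligible cost.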


-----Original Message-----
From: Fred Toth [] 
Sent: Thursday, 25 August 2005 4:18
Subject: Re: Document visible by Term, but not search

Hi Dan,

What's the term? Could this be an analyzer problem? Are you using
the same analyzer for indexing and searching?


At 09:06 PM 8/24/2005, you wrote:
>I have the following strange behavior for an index. The index has been
>optimized and has no deletions. It's in compound file format.
>Using Luke 0.6 I can browse by Term and find my term (ItemId:727680).
>It's a Keyword field. It shows that the docfreq of this term is 1. It also
>shows all document fields, including the correct ItemId value. If I build a
>query and search for the term I get no results. Similarly, if I click on the
>All Docs button in Luke, I get no results.
>Is my index corrupted? Is there some state or some way of doing a
>search that is making both Luke and my direct query fail?
>One thing that makes me suspicious is that the behavior seems to apply to
>the 4 highest lucene docids (each with their own unique term), but not
>earlier docs (as far as I can tell). There are 14,337 docs in this index.
>Any ideas on what could cause this or how I could construct a search that
>finds this document?
>To unsubscribe, e-mail:
>For additional commands, e-mail:
