lucene-java-user mailing list archives

From "Vanlerberghe, Luc" <Luc.Vanlerber...@bvdep.com>
Subject RE: Document visible by Term, but not search
Date Thu, 25 Aug 2005 08:27:30 GMT
Is your Analyzer aware that that particular field does not need to be
tokenized?

During indexing, if a field is added with tokenize=false, the analyzer
won't be called, so the string will be indexed as-is, as a single term.

During searching, the QueryParser doesn't know which fields are
tokenized and which are not, so it passes them all to your analyzer.
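
To make the mismatch concrete, here is a toy model in plain Java. It
is only a sketch and does not use Lucene: the class
KeywordMismatchDemo, its methods, and the sample value "AB-727680" are
all made up for illustration (the value in the original report was
727680, which this toy analyzer would happen to leave unchanged). The
value is indexed as one untouched term, while the parser-style search
analyzes the query text first and so looks up terms that were never
indexed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are Lucene classes or methods.
class KeywordMismatchDemo {
  // Inverted index: "field:term" -> ids of the documents containing it.
  private final Map<String, Set<Integer>> postings = new HashMap<>();

  // Stand-in for an analyzer: lowercases and splits on non-alphanumerics.
  static List<String> analyze(String text) {
    List<String> tokens = new ArrayList<>();
    for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
      if (!t.isEmpty()) tokens.add(t);
    }
    return tokens;
  }

  // Untokenized field: the whole value becomes a single term, unchanged.
  void addKeyword(int docId, String field, String value) {
    postings.computeIfAbsent(field + ":" + value, k -> new HashSet<>()).add(docId);
  }

  // Exact term lookup, comparable in spirit to a term query.
  Set<Integer> lookup(String field, String term) {
    return postings.getOrDefault(field + ":" + term, new HashSet<>());
  }

  // Parser-style search: analyzes the query text, then looks up each token.
  Set<Integer> analyzedSearch(String field, String queryText) {
    Set<Integer> hits = new HashSet<>();
    for (String token : analyze(queryText)) {
      hits.addAll(lookup(field, token));
    }
    return hits;
  }

  public static void main(String[] args) {
    KeywordMismatchDemo index = new KeywordMismatchDemo();
    index.addKeyword(7, "ItemId", "AB-727680"); // hypothetical value
    System.out.println("exact term:     " + index.lookup("ItemId", "AB-727680"));
    System.out.println("analyzed query: " + index.analyzedSearch("ItemId", "AB-727680"));
  }
}
```

The exact lookup finds the document because it uses the same string
that was indexed. The analyzed search looks for "ab" and "727680",
neither of which exists as a term in that field.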

Your analyzer should return a KeywordTokenizer when asked for a
TokenStream for that field. It emits the entire string as one token.

KeywordTokenizer is in the contrib area on svn. See
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/analyzers/src/java/org/apache/lucene/analysis/KeywordTokenizer.java

and
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/analyzers/src/java/org/apache/lucene/analysis/KeywordAnalyzer.java

Something like:
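import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.KeywordTokenizer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class MyAnalyzer extends Analyzer {
  // Stand-in for your original analyzer; StandardAnalyzer is an assumption.
  private final Analyzer delegate = new StandardAnalyzer();

  public TokenStream tokenStream(String fieldName, Reader reader) {
    // Field names are "intern"ed, so == could be used here, but equals()
    // also works if the name ever arrives as a non-interned String.
    if ("myKeywordField".equals(fieldName)) {
      return new KeywordTokenizer(reader); // whole value as one token
    }
    // original analyzer code...
    return delegate.tokenStream(fieldName, reader);
  }
}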

That should also solve the issue in Luke...

Luc


-----Original Message-----
From: Fred Toth [mailto:ftoth@synernet.com] 
Sent: Thursday, 25 August 2005 4:18
To: java-user@lucene.apache.org
Subject: Re: Document visible by Term, but not search

Hi Dan,

What's the term? Could this be an analyzer problem? Are you using
the same analyzer for indexing and searching?

Fred

At 09:06 PM 8/24/2005, you wrote:
>I have the following strange behavior for an index. The index has been
>optimized and has no deletions. It's in compound file format.
>
>Using Luke 0.6 I can browse by Term and find my term (ItemId:727680). It's a
>Keyword field. It shows that the docfreq of this term is 1. It also shows all
>the document fields, including the correct ItemId value. If I build a
>TermQuery and search for the term I get no results. Similarly, if I click on
>the Show All Docs button in Luke, I get no results.
>Is my index corrupted? Is there some state or some way of doing a TermQuery
>search that is making both Luke and my direct query fail?
>
>One thing that makes me suspicious is that the behavior seems to apply to
>the 4 highest lucene docids (each with their own unique term), but not
>earlier docs (as far as I can tell). There are 14,337 docs in this index.
>
>Any ideas on what could cause this or how I could construct a search that
>finds this document?
>
>Thanks,
>Dan
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org


