lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Marks <a85533...@gmail.com>
Subject indexing but not tokenizing
Date Thu, 05 Mar 2009 15:17:37 GMT
Hi all,

I'm not able to see what's wrong in the following sample code.
I'm indexing a document with 5 fields, using five different indexing strategies.
I'm fine the the results for 4 of them, but field B is causing me some
trouble in understanding what's going on.

The value of field B is X (uppercase).
The analyzer is a SimpleAnalyzer, which I use on the QueryParser as well.
But when I search for X (uppercase) on field B, the X is converted to lowercase.
Now, I know that SimpleAnalyzer converts to lowercase, but I was
expecting it not to do so on field B, which is NOT_ANALYZED.

How should I fix my code?

Thank you in advance!
-John



--- code ---


package test;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocCollector;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.queryParser.QueryParser;



public class Test
{
  public static void main(String[] args)
  {
    try
    {
      RAMDirectory idx = new RAMDirectory();
      SimpleAnalyzer analyzer = new SimpleAnalyzer();

      IndexWriter writer = new IndexWriter(idx, analyzer, true,
          IndexWriter.MaxFieldLength.LIMITED);

      Document doc = new Document();
      doc.add(new Field("A", "X",
          Field.Store.YES, Field.Index.NO));
      doc.add(new Field("B", "X",
          Field.Store.YES, Field.Index.NOT_ANALYZED));
      doc.add(new Field("C", "X",
          Field.Store.YES, Field.Index.ANALYZED));
      doc.add(new Field("D", "x",
          Field.Store.NO, Field.Index.NOT_ANALYZED));
      doc.add(new Field("E", "X",
          Field.Store.NO, Field.Index.ANALYZED));
      writer.addDocument(doc);
      writer.close();

      IndexSearcher searcher = new IndexSearcher(idx);
      String field = "B";
      QueryParser parser = new QueryParser(field, analyzer);
      Query query = parser.parse("X");
      System.out.println("Query: " + query.toString());

      TopDocCollector collector = new TopDocCollector(1);
      searcher.search(query, collector);
      int numHits = collector.getTotalHits();
      System.out.println(numHits + " total matching documents");

      if ( numHits > 0)
      {
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        doc = searcher.doc(hits[0].doc);
        System.out.println("A: " + doc.get("A"));
        System.out.println("B: " + doc.get("B"));
        System.out.println("C: " + doc.get("C"));
        System.out.println("D: " + doc.get("D"));
        System.out.println("E: " + doc.get("E"));
      }
    }
    catch (Exception e)
    {
      System.out.println(" caught a " + e.getClass() + "\n with message: "
          + e.getMessage());
    }
  }

}

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message