lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Satuluri, Venu_Madhav" <Venu.Madhav.Satul...@deshaw.com>
Subject The fields in my document aren't getting analyzed
Date Thu, 08 Jun 2006 16:38:01 GMT
Hi all,

It seems to me my Fields aren't getting analyzed before they are stored
in the index. I am sure I am overlooking some obvious point here, but
cant figure out what that is. I recently migrated to Lucene2.0 from
Lucene 1.4.3, and my fields used to get indexed earlier, so maybe I am
missing something.

Code I use for making a new field:
new Field(desc, value, Field.Store.YES, Field.Index.TOKENIZED );

Code I use for opening the IndexWriter:
writer = new IndexWriter(indexLocation,new MyAnalyzer(), false);

I know the Analyzer is being invoked because I have my own Analyzer
(which is sort of like the PerFieldAnalyzerWrapper) and I have debug
statements in its tokenStream(fieldName, reader) method that do get
printed. When I test the Analyzer separately, it analyzes as expected.

Here is the code for MyAnalyzer.tokenStream();
    public TokenStream tokenStream(String fieldName, Reader reader)
    {
    	if ( ContactIndexer.getKeyWordFields().contains( fieldName ) )
        {
    	    LOG.debug( "fieldName:" + fieldName + " is a keyword" );
            return new LowerCaseFilter( new KeywordTokenizer( reader )
);
        }
        
        LOG.debug( "fieldName:" + fieldName + " is not a keyword" );
        return new PorterStemFilter(
    	         new StopFilter(
                    new LowerCaseFilter(
                        new StandardFilter(
                           new StandardTokenizer( reader ) ) ) 
                                 , StandardAnalyzer.STOP_WORDS ) );
        
    } 

Any help hugely appreciated.

Venu




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message