lucene-dev mailing list archives

From Rajive Dave <hardtopm...@yahoo.com>
Subject Re: Observations: profiling indexing process
Date Wed, 20 Nov 2002 16:52:47 GMT
The tokenizer we have is a pretty straightforward
implementation specific to our grammar. It is just a
tweaked version of the CharStream.java already checked in.
I don't think it's of any general use, and hence it's
not contributable.

Anyway, the point is: yes, JavaCC has substantial
overhead.
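
For readers of the archive, here is a minimal sketch of what a hand-written tokenizer of this general kind can look like. It is not the tokenizer described in this message (that code was never posted), it does not follow StandardTokenizer's grammar, and it does not use any Lucene API. The class name SimpleTokenizer, the 4096-char buffer, and the "letters and digits, lowercased" rule are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written tokenizer: reads from a Reader through a
// char buffer and emits lowercased runs of letters and digits.
class SimpleTokenizer {
    private final Reader in;
    private final char[] buf = new char[4096]; // buffer size is an arbitrary choice
    private int pos = 0;
    private int len = 0;

    SimpleTokenizer(Reader in) {
        this.in = in;
    }

    // Returns the next char, refilling the buffer as needed, or -1 at end of input.
    private int read() throws IOException {
        if (pos >= len) {
            len = in.read(buf, 0, buf.length);
            pos = 0;
            if (len <= 0) {
                len = 0;
                return -1;
            }
        }
        return buf[pos++];
    }

    // Returns the next token, or null when the input is exhausted.
    String next() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = read()) != -1) {
            if (Character.isLetterOrDigit((char) c)) {
                sb.append(Character.toLowerCase((char) c));
            } else if (sb.length() > 0) {
                break; // a delimiter ends the current token
            }
            // delimiters before the first token char are skipped
        }
        return sb.length() > 0 ? sb.toString() : null;
    }

    // Convenience helper: tokenize a whole string into a list.
    static List<String> tokenize(String text) throws IOException {
        SimpleTokenizer t = new SimpleTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        for (String tok = t.next(); tok != null; tok = t.next()) {
            out.add(tok);
        }
        return out;
    }
}
```

A single loop over a char buffer like this avoids the generated state machine and per-token bookkeeping of a JavaCC parser, at the cost of handling far fewer token types. Whether it gives a speedup comparable to the one reported in this thread would have to be measured on your own documents.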

Rajive

--- Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> Non-contributable?
> The impl. is just Java, no other alternative parser tools like ANTLR or
> some such?
> 
> Otis
> 
> --- Rajive Dave <hardtopmerc@yahoo.com> wrote:
> > Yep, we replaced JavaCC with our home-grown tokenizer.
> > I think we gained almost 100% in indexing speed because
> > our document size is rather large.
> > 
> > Rajive
> > 
> > --- Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> > > Hello,
> > > 
> > > I decided to run a little Lucene app that does some indexing under a
> > > profiler. (I used JMP, http://www.khelekore.org/jmp/, a rather simple
> > > one).
> > > 
> > > The app uses StandardAnalyzer.
> > > I've noticed that a lot of time is spent in
> > > StandardTokenizer and
> > > various JavaCC-generated methods.
> > > I am wondering if anyone tried replacing
> > > StandardTokenizer.jj with
> > > something more efficient?
> > > 
> > > Also, StopFilter is using a Hashtable to store the list of stop words.
> > > Has anyone tried using HashMap instead?
> > > 
> > > Thanks,
> > > Otis
> > > 
> > > 
> > > __________________________________________________
> > > Do you Yahoo!?
> > > Yahoo! Web Hosting - Let the expert host your site
> > > http://webhosting.yahoo.com
> > > 
> > > --
> > > To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> > > For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
> > > 
> > 
> > 
> > 
> 
> 
> 
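
On the Hashtable question raised in the original message of this thread: java.util.Hashtable synchronizes every call, including lookups, while HashMap and HashSet do not. The sketch below shows a stop-word lookup done with an unsynchronized HashSet. It is an illustration only, not StopFilter's actual code; the names StopWordSketch, makeStopSet, and filter are hypothetical, and it uses generics, which were not available in Java at the time of this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stop-word filtering using an unsynchronized HashSet.
class StopWordSketch {
    // Builds the stop set once; it is only read afterwards.
    static Set<String> makeStopSet(String... words) {
        return new HashSet<>(Arrays.asList(words));
    }

    // Returns the tokens that are not in the stop set, preserving order.
    static List<String> filter(List<String> tokens, Set<String> stopSet) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopSet.contains(token)) { // no lock taken on lookup
                kept.add(token);
            }
        }
        return kept;
    }
}
```

A set that is filled once and then only read needs no locking for lookups, so the synchronization in Hashtable buys nothing here. Nobody in this thread reported measuring the difference, so treat any speedup as something to confirm under a profiler.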

