lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From luc...@gobstopper.demon.co.uk
Subject Re: NLucene up to date ?
Date Thu, 31 Jul 2003 15:41:55 GMT
Replies to Erik and Scott inline.


scott.ganyo@etapestry.com wrote:
> Do these implementations maintain file compatibility with the Java version?
> 
> Scott

Yes and no, explanation will help me explain.

The field ordering functionality required additional files to be created at index time if
the Document.Field property indicates so.
At search time, the entire contents of the 'field sorting' files are read in.  As the IndexReader
is shared for all client calls (for a pre-defined period of time as the index has been implemented
'incremental' style) this cost is only incurred once.

Code-wise, the technique follows the pattern for the Normalisation byte writing and reading,
the difference being an Int being written.  Yes, there is a memory usage hit, but the performance
and functionality offered offsets this.

All other file formats remain identical.
I have coded LuceNET (!) so that it gracefully continues if the index segments do not have
these additional 'sorting' files (naming convention like the normalisation files).


> Erik Hatcher wrote:
> 
> > I'd love to see there be quality implementations of the Lucene API in 
> > other languages, that are up to date with the latest Java codebase.
> >
> > I'm embarking on a Ruby port, which I'm hosting at rubyforge.org.  
> > There is a Python version called Lupy.
> >
> > A related question I have is what about performance comparisons 
> > between the different language implementations?  Will Java be the 
> > fastest?  Is there a test suite already available that can demonstrate 
> > the performance characteristics of a particular implementation?  I'd 
> > love to see the numbers and see if even the Java version can be beat.
> >
> >     Erik


Performance wise, queries typically run in hundreths of seconds.
Including term position in the scoring impacted the timings as expected.

Indexing takes time, but then this wasn't really part of the design goals.

As far as comparing to the java implementation in terms in performance, I haven't tried as
this workplace is a MS shop.

Java vs c# all over ?  Just kidding =)


> >
> >
> > On Thursday, July 31, 2003, at 08:43  AM, 
> > lucene@gobstopper.demon.co.uk wrote:
> >
> >> Hi all,
> >>
> >> http://sourceforge.net/projects/nlucene/ has a version numbered 1.2b2.
> >> Does anyone know if this source is still being maintained to be 
> >> closer to the java developments ?
> >> Was this an external project to Apache Jakarta ?
> >>
> >> I (we) have just successfully released a search engine using a c# 
> >> implmentation of Lucene.  Code had to be brought up to date in line 
> >> with recent java builds, and enhanced with additional features (eg 
> >> field sorting, term position score factoring, etc).
> >>
> >> Any other c# users who would like to see NLucene kept in line with 
> >> the java version ?
> >>
> >> Maybe I'm just being lazy with having to maintain my own version of 
> >> Lucene =).
> >> Surely there are others out there who are c# users and follow the 
> >> mailing lists (I remember a Brian somewhere !) but seldom post.
> >>
> >> Brendon
> >
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message