lucene-dev mailing list archives

From "Ning Li" <ning.li...@gmail.com>
Subject Re: Ferret's changes
Date Tue, 10 Oct 2006 16:35:30 GMT
On 10/10/06, Yonik Seeley <yonik@apache.org> wrote:
> On 10/10/06, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> > Hi,
> >
> > Maybe I missed it, but I was surprised that nobody here wondered about the
> > algorithm and data structure changes that Dave Balmain made in Ferret, to
> > make it go faster (than Java Lucene).
>
> Not using single doc segments for buffered docs has come up
> http://www.nabble.com/-jira--Created%3A-%28LUCENE-565%29-Supporting-deleteDocuments-in-IndexWriter-%28Code-and-Performance-Results-Provided%29-tf1580652.html#a6177808

After reading the interview article, I thought not using single doc
segments contributed most of the indexing performance improvement. But
in the mailing list discussion on "Global field semantics", Dave
Balmain mentioned that most of the indexing performance benefits come
from having constant field numbers, which greatly optimizes the merging
of term vectors and stored fields.
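
As a rough illustration of why constant field numbers could make merging
cheaper, here is a minimal sketch. It is not Ferret's or Lucene's actual
merge code: the class FieldNumberMergeSketch, its methods, and the idea of
modelling stored-field entries as a plain int[] of field numbers are all
assumptions made for the example. With per-segment numbering, every entry
has to be translated through a remap table during a merge. With constant
numbering, the entries can be copied in bulk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Lucene's or Ferret's real merge code.
final class FieldNumberMergeSketch {

    // Per-segment numbering: build a table mapping each segment-local field
    // number to its number in the merged segment, extending the merged field
    // name list when a name is seen for the first time.
    static int[] buildRemap(String[] segmentFieldNames, List<String> mergedFieldNames) {
        int[] remap = new int[segmentFieldNames.length];
        for (int local = 0; local < segmentFieldNames.length; local++) {
            int merged = mergedFieldNames.indexOf(segmentFieldNames[local]);
            if (merged < 0) {
                merged = mergedFieldNames.size();
                mergedFieldNames.add(segmentFieldNames[local]);
            }
            remap[local] = merged;
        }
        return remap;
    }

    // Per-segment numbering: every stored entry is rewritten one by one.
    static int[] mergeWithRemap(int[] a, int[] remapA, int[] b, int[] remapB) {
        int[] out = new int[a.length + b.length];
        int pos = 0;
        for (int n : a) {
            out[pos++] = remapA[n];
        }
        for (int n : b) {
            out[pos++] = remapB[n];
        }
        return out;
    }

    // Constant numbering: field numbers already agree across segments,
    // so the entries can be concatenated with bulk copies.
    static int[] mergeConstant(int[] a, int[] b) {
        int[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) {
        // Segment A saw "title" then "body"; segment B saw "body" then "title".
        List<String> mergedNames = new ArrayList<>();
        int[] remapA = buildRemap(new String[] {"title", "body"}, mergedNames);
        int[] remapB = buildRemap(new String[] {"body", "title"}, mergedNames);

        // Per-segment numbering: both segments store local numbers 0, 1.
        int[] viaRemap = mergeWithRemap(new int[] {0, 1}, remapA, new int[] {0, 1}, remapB);

        // Constant numbering (title=0, body=1 everywhere): B stores 1, 0.
        int[] viaConstant = mergeConstant(new int[] {0, 1}, new int[] {1, 0});

        System.out.println(Arrays.toString(viaRemap));
        System.out.println(Arrays.toString(viaConstant));
    }
}
```

Both paths produce the same merged numbering in this toy case. The
difference is only in the work done per entry, which is the kind of saving
the constant field number argument relies on. How large that saving is in a
real index is exactly the open question.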

Exactly how much performance improvement each of these two
optimizations provides will depend on the workload. But in general,
does one play a more significant role than the other? What about for
the benchmark workload Yonik pointed out at
http://rubyforge.org/forum/forum.php?forum_id=9058 ?

Cheers,
Ning
