lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: lucene 4.3 seems to be much slower in indexing than lucene 3.6?
Date Thu, 01 Aug 2013 17:45:20 GMT
On Wed, Jul 31, 2013 at 7:17 PM, Zhang, Lisheng
<Lisheng.Zhang@broadvision.com> wrote:
>
> Hi Mike,
>
> I retested and results are the same:
>
> 1/ I did not use sort (so FieldCache should not enter picture?)

And no grouping or joining either?  (Those also use FieldCache, unless
they run against a doc values field.)

What sort of queries are you running?

> 2/ I created indexed data from scratch separately for 361 and 43
>    based on same text (text files), and I ran test from command
>    line separately against each index folder, so seems a pretty
>    fair test.

OK.

> 3/ For each test I created the searcher from scratch (to measure
>    creation time). I did not include JVM start time in either case.
>    The tests ran on the same box.

OK.

> From indexed data it seems that 43 generated a lot more data in
> folder, below I listed (ls -ltr) result

This is very odd: the 4.3 index is quite a bit larger than the 3.x
index.  Are you certain the two indexed the same content in the same
way?  Which analyzer are you using?  Maybe run CheckIndex against each
index and post the output?
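
To see which kinds of files account for the extra size, something
like the sketch below could total the bytes per file extension in
each index folder.  It uses only the JDK file APIs and makes no Lucene
calls; the class name IndexSizeByExtension and the two command-line
paths are made up for illustration, and it needs Java 8 or later.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: compares two index folders by file extension.
public class IndexSizeByExtension {

    // Sum the sizes of the regular files directly inside indexDir, keyed by extension.
    // Files without a dot (for example segments_N) are keyed by their full name.
    static Map<String, Long> sizeByExtension(Path indexDir) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                String name = file.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String ext = dot >= 0 ? name.substring(dot + 1) : name;
                totals.merge(ext, Files.size(file), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are assumed to be the 3.6 and 4.3 index folders.
        Map<String, Long> older = sizeByExtension(Paths.get(args[0]));
        Map<String, Long> newer = sizeByExtension(Paths.get(args[1]));

        // Print one row per extension seen in either folder.
        TreeSet<String> extensions = new TreeSet<>(older.keySet());
        extensions.addAll(newer.keySet());
        System.out.printf("%-14s %14s %14s%n", "extension", "first (bytes)", "second (bytes)");
        for (String ext : extensions) {
            System.out.printf("%-14s %14d %14d%n", ext,
                    older.getOrDefault(ext, 0L), newer.getOrDefault(ext, 0L));
        }
    }
}
```

Run it with the two index folders as arguments.  The file extensions
differ between 3.x and 4.x, so the rows will not line up one to one,
but the totals should still show where the bulk of the bytes sits.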

> (always pass in LUCENE_43
> version, so Lucene 42 codec should be used, why lucene41?).

This is fine: the Lucene42 codec uses Lucene41PostingsFormat.

Mike McCandless

http://blog.mikemccandless.com
