lucene-java-user mailing list archives

From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject Re: Lucene 4.3.1 CheckIndex limitation 100 trillion tokens?
Date Fri, 09 Aug 2013 06:04:08 GMT
Hi Tom,

I just noticed that you are running Linux with a 2.6 kernel.
Have you already enabled -XX:+UseLargePages as a performance option, and is it actually in use?
Solaris 9 has it on by default, but on Linux HugePages must be enabled first.

http://www.oracle.com/technetwork/java/javase/tech/largememory-jsp-137182.html
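
To see whether HugePages are configured on the Linux box at all, something like
the following rough sketch might help. It only reads the HugePages_* counters
and Hugepagesize from /proc/meminfo; the class and method names are made up for
this mail, and a nonzero HugePages_Total only tells you that pages are reserved,
not that the JVM is really using them.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Lucene or the JDK): reports the
// HugePages settings the Linux kernel exposes in /proc/meminfo.
public class HugePagesCheck {

    // Extracts the HugePages_* counters and Hugepagesize from
    // /proc/meminfo-style lines such as "HugePages_Total:     512".
    static Map<String, Long> parseMeminfo(List<String> lines) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a "key: value" line
            }
            String key = line.substring(0, colon).trim();
            if (!key.startsWith("HugePages_") && !key.equals("Hugepagesize")) {
                continue; // unrelated memory statistic
            }
            // The value may carry a unit suffix ("2048 kB"); keep the number only.
            String[] parts = line.substring(colon + 1).trim().split("\\s+");
            try {
                result.put(key, Long.parseLong(parts[0]));
            } catch (NumberFormatException e) {
                // Skip values that are empty or not numeric.
            }
        }
        return result;
    }

    // True if the kernel has reserved at least one huge page.
    static boolean hugePagesReserved(Map<String, Long> info) {
        Long total = info.get("HugePages_Total");
        return total != null && total > 0;
    }

    public static void main(String[] args) throws IOException {
        Path meminfo = Paths.get("/proc/meminfo");
        if (!Files.isReadable(meminfo)) {
            System.out.println("/proc/meminfo is not readable (not Linux?)");
            return;
        }
        Map<String, Long> info = parseMeminfo(Files.readAllLines(meminfo));
        for (Map.Entry<String, Long> e : info.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        System.out.println(hugePagesReserved(info)
                ? "HugePages are reserved by the kernel."
                : "No HugePages reserved; -XX:+UseLargePages would likely have no effect.");
    }
}
```

If HugePages_Total comes back as 0, the kernel side has to be set up before the
JVM flag can do anything.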

Just an idea.

Regards
Bernd


On 08.08.2013 16:45, Tom Burton-West wrote:
> Hi Robert,
> 
> I've been running CheckIndex for over a week and it is still working
> through seekCeil()
> (See below.)
> 
> I'm going to kill the CheckIndex. Admittedly, this index is an unusual
> one, but at one point we were considering using MLT in our regular index,
> which would result in a large termvectors file, although with only about
> 800,000 docs per index. Should we expect to see something similar, or might
> CheckIndex work a bit faster given the two orders of magnitude decrease in
> the number of docs?
> 
> Tom
> 
> 
> 
> ---------------------------
> 
> Started CheckIndex on Tuesday July 30 and it wrote the following to STDOUT:
> Opening index @ /htsolr/lss-dev/solrs/4.2/3/core/data/index
> 
> Segments file=segments_e numSegments=2 version=4.2.1 format=
> userData={commitTimeMSec=1374712392103}
>   1 of 2: name=_bch docCount=82946896
>     codec=Lucene42
>     compound=false
>     numFiles=12
>     size (MB)=752,005.689
>     diagnostics = {timestamp=1374657630506, os=Linux,
> os.version=2.6.18-348.12.1.el5, mergeFactor=16, source=merge,
> lucene.version=4.2.1 1461071 - mark - 2013-03-26 08:23:34, os.arch=amd64,
> mergeMaxNumSegments=2, java.version=1.6.0_16, java.vendor=Sun Microsystems
> Inc.}
>     no deletions
>     test: open reader.........OK
>     test: fields..............OK [12 fields]
>     test: field norms.........OK [3 fields]
>     test: terms, freq, prox...OK [2442919802 terms; 73922320413 terms/docs
> pairs; 109976572432 tokens]
>     test: stored fields.......OK [960417844 total field count; avg 11.579
> fields per doc]
>     test: term vectors........[tburtonw@alamo 3]$
> 


