lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Lucene 4.3.1 CheckIndex limitation 100 trillion tokens?
Date Thu, 08 Aug 2013 14:51:21 GMT
Hi Tom, I committed a fix for the root cause
(https://issues.apache.org/jira/browse/LUCENE-5156).

Thanks for reporting this!

I don't know if it's feasible for you to build a lucene-core.jar from
branch_4x and run CheckIndex with that jar file to confirm that it really
addresses the issue. If this is possible in any way, it would be
fantastic.
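
In case it helps, here is a rough sketch of running CheckIndex against a
locally built jar. The function name and both paths are placeholders
(assumptions, not values from this thread); the java invocation follows
the usage text in the CheckIndex documentation.

```shell
#!/bin/sh
# Sketch: run CheckIndex using a locally built lucene-core jar.
# run_checkindex is a hypothetical helper; both arguments are placeholder paths.
run_checkindex() {
    jar="$1"     # path to the lucene-core jar built from branch_4x
    index="$2"   # path to the index directory to check

    # Fail early if the jar is not where we expect it.
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi

    # -ea:org.apache.lucene... enables assertions in Lucene classes,
    # as the CheckIndex usage text recommends.
    java -ea:org.apache.lucene... -cp "$jar" \
        org.apache.lucene.index.CheckIndex "$index"
}

# Example call (placeholder paths):
# run_checkindex /path/to/lucene-core.jar /path/to/index
```

Without the -fix option CheckIndex only reads the index, so a run like
this should be safe against a copy or the original.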

There is nothing wrong with your index: it's just a code thing :)

On Thu, Aug 8, 2013 at 10:45 AM, Tom Burton-West <tburtonw@umich.edu> wrote:
> Hi Robert,
>
> I've been running CheckIndex for over a week and it is still working
> through seekCeil()
> (See below.)
>
> I'm going to kill the CheckIndex. Admittedly, this index is an unusual
> one, but at one point we were considering using MLT in our regular index,
> which would result in a large termvectors file, although with only about
> 800,000 docs per index. Should we expect to see something similar, or,
> with the two orders of magnitude decrease in the number of docs, might
> CheckIndex work a bit faster?
>
> Tom
>
>
>
> ---------------------------
>
> Started CheckIndex on Tuesday July 30 and it wrote the following to STDOUT:
> Opening index @ /htsolr/lss-dev/solrs/4.2/3/core/data/index
>
> Segments file=segments_e numSegments=2 version=4.2.1 format=
> userData={commitTimeMSec=1374712392103}
>   1 of 2: name=_bch docCount=82946896
>     codec=Lucene42
>     compound=false
>     numFiles=12
>     size (MB)=752,005.689
>     diagnostics = {timestamp=1374657630506, os=Linux,
> os.version=2.6.18-348.12.1.el5, mergeFactor=16, source=merge,
> lucene.version=4.2.1 1461071 - mark - 2013-03-26 08:23:34, os.arch=amd64,
> mergeMaxNumSegments=2, java.version=1.6.0_16, java.vendor=Sun Microsystems
> Inc.}
>     no deletions
>     test: open reader.........OK
>     test: fields..............OK [12 fields]
>     test: field norms.........OK [3 fields]
>     test: terms, freq, prox...OK [2442919802 terms; 73922320413 terms/docs
> pairs; 109976572432 tokens]
>     test: stored fields.......OK [960417844 total field count; avg 11.579
> fields per doc]
>     test: term vectors........[tburtonw@alamo 3]$

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

