lucene-java-user mailing list archives

From Tom Burton-West <tburt...@umich.edu>
Subject Re: Lucene 4.3.1 CheckIndex limitation 100 trillion tokens?
Date Fri, 09 Aug 2013 15:15:20 GMT
Hi Robert,

Thanks for the fix.  CheckIndex finished within 24 hours, which is not
terrible given the size of this index (about a terabyte).

Tom

Opening index @ /htsolr/lss-dev/solrs/4.2/3/core/data/index

Segments file=segments_e numSegments=2 version=4.2.1 format=
userData={commitTimeMSec=1374712392103}
  1 of 2: name=_bch docCount=82946896
    codec=Lucene42
    compound=false
    numFiles=12
    size (MB)=752,005.689
    diagnostics = {timestamp=1374657630506, os=Linux,
os.version=2.6.18-348.12.1.el5, mergeFactor=16, source=merge,
lucene.version=4.2.1 1461071 - mark - 2013-03-26 08:23:34, os.arch=amd64,
mergeMaxNumSegments=2, java.version=1.6.0_16, java.vendor=Sun Microsystems
Inc.}
    no deletions
    test: open reader.........OK
    test: fields..............OK [12 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...OK [2442919802 terms; 73922320413 terms/docs
pairs; 109976572432 tokens]
    test: stored fields.......OK [960417844 total field count; avg 11.579
fields per doc]
    test: term vectors........OK [81452262 total vector count; avg 1
term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC;
0 SORTED; 0 SORTED_SET]

  2 of 2: name=_bcg docCount=42021835
    codec=Lucene42
    compound=false
    numFiles=12
    size (MB)=371,991.272
    diagnostics = {timestamp=1374612106174, os=Linux,
os.version=2.6.18-348.12.1.el5, mergeFactor=30, source=merge,
lucene.version=4.2.1 1461071 - mark - 2013-03-26 08:23:34, os.arch=amd64,
mergeMaxNumSegments=2, java.version=1.6.0_16, java.vendor=Sun Microsystems
Inc.}
    no deletions
    test: open reader.........OK
    test: fields..............OK [12 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...OK [1435132736 terms; 36134595066 terms/docs
pairs; 53691487260 tokens]
    test: stored fields.......OK [483146935 total field count; avg 11.498
fields per doc]
    test: term vectors........OK [41299979 total vector count; avg 1
term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC;
0 SORTED; 0 SORTED_SET]

No problems were detected with this index.
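
As a rough cross-check of the figures above, here is a minimal sketch that
sums the per-segment counts and tests whether each still fits in a 32-bit
int. It is not part of CheckIndex; the class and method names are made up
for illustration, and the numbers are copied from the output above.

```java
public class SegmentTotals {
    // Per-segment counts copied from the CheckIndex output (_bch, then _bcg).
    static final long[] DOCS   = {82946896L, 42021835L};
    static final long[] TERMS  = {2442919802L, 1435132736L};
    static final long[] TOKENS = {109976572432L, 53691487260L};

    // Sum with a 64-bit accumulator so the total cannot wrap around.
    static long sum(long[] values) {
        long total = 0L;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // True if the value can be stored in a 32-bit signed int without loss.
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("docs   = " + sum(DOCS) + " fitsInInt=" + fitsInInt(sum(DOCS)));
        System.out.println("terms  = " + sum(TERMS) + " fitsInInt=" + fitsInInt(sum(TERMS)));
        System.out.println("tokens = " + sum(TOKENS) + " fitsInInt=" + fitsInInt(sum(TOKENS)));
    }
}
```

The doc counts fit comfortably in an int, but the term count of the first
segment alone and the token counts of both segments do not, so anything
tallying them needs a long.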




On Thu, Aug 8, 2013 at 11:24 AM, Robert Muir <rcmuir@gmail.com> wrote:

> On Thu, Aug 8, 2013 at 11:18 AM, Tom Burton-West <tburtonw@umich.edu> wrote:
> > Sure I should be able to build a lucene core and give it a try.  I probably
> > won't run it until tomorrow night though because right now I'm running some
> > other tests on the machine I would run CheckIndex from and disk I/O (i.e.
> > CheckIndex) would mess with the tests.
> >
> > Do I just need to check out revision 1511014 from branch 4x and build it?
> >
>
> yes, something like:
>
> svn co -r 1511014 https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x
> cd branch_4x/lucene
> ant
>
> this will create a lucene-core-4.5-SNAPSHOT.jar in build/core
>
> Thanks!
>
