lucene-dev mailing list archives

From Marvin Humphrey
Subject Re: bytecount as prefix
Date Tue, 11 Apr 2006 23:49:18 GMT

On Apr 11, 2006, at 12:05 PM, Marvin Humphrey wrote:

>  TestRangeFilter.

A phantom blank Term shows up out of nowhere in the middle of the  
merge process.

When you stick a System.err.println into TermInfosWriter's writeTerm,  
you ordinarily see it adding Terms in proper sort order:

     [junit] TINFO: :
     [junit] TINFO: body:body
     [junit] TINFO: id:000000000000
     [junit] TINFO: rand:-00953139433
     [junit] TINFO: :
     [junit] TINFO: body:body
     [junit] TINFO: id:000000000001
     [junit] TINFO: rand:000015869780

Here are several docs being merged together:

     [junit] TINFO: :
     [junit] TINFO: body:body
     [junit] TINFO: id:000000000009
     [junit] TINFO: rand:-00563669564
     [junit] TINFO: :
     [junit] TINFO: body:body
     [junit] TINFO: id:000000000000
     [junit] TINFO: id:000000000001
     [junit] TINFO: id:000000000002
     [junit] TINFO: id:000000000003
     [junit] TINFO: id:000000000004
     [junit] TINFO: id:000000000005
     [junit] TINFO: id:000000000006
     [junit] TINFO: id:000000000007
     [junit] TINFO: id:000000000008
     [junit] TINFO: id:000000000009
     [junit] TINFO: rand:-00072576061
     [junit] TINFO: rand:-00260794310
     [junit] TINFO: rand:-00563669564
     [junit] TINFO: rand:-00953139433
     [junit] TINFO: rand:-01094000683
     [junit] TINFO: rand:-01481464619
     [junit] TINFO: rand:-02099458317
     [junit] TINFO: rand:000015869780
     [junit] TINFO: rand:001019870061
     [junit] TINFO: rand:001565603387
     [junit] TINFO: :
     [junit] TINFO: body:body
     [junit] TINFO: id:000000000010
     [junit] TINFO: rand:001271292228

At some point, late in the merge process, this happens:

     [junit] TermInfosWriter: rand:-00449774276
     [junit] TermInfosWriter: rand:-00467363681
     [junit] TermInfosWriter: rand:-00479945420
     [junit] TermInfosWriter: rand:-00506239929
     [junit] TermInfosWriter: :                  // Huh????
     [junit] TermInfosWriter: rand:-00512006124
     [junit] TermInfosWriter: rand:-00526876979  // <- look at this  
     [junit] TermInfosWriter: rand:-00531589361
     [junit] TermInfosWriter: rand:-00563669564
     [junit] TermInfosWriter: rand:-00638261924
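
One way to catch the stray term at the moment it is written, rather than spotting it in the log afterwards, is to compare each term with its predecessor and fail fast when the order goes backwards. What follows is only a rough sketch under stated assumptions: SortOrderGuard and its methods are hypothetical names, not part of Lucene, and terms are assumed to sort by field name first and then by term text.

```java
/** Hypothetical debugging aid (not Lucene code): rejects out-of-order terms. */
final class SortOrderGuard {
    // The blank term (empty field, empty text) is the smallest possible term,
    // so starting from it lets a segment legitimately begin with a blank term.
    private String lastField = "";
    private String lastText = "";

    /** Assumed ordering: compare by field name first, then by term text. */
    static int compare(String fieldA, String textA, String fieldB, String textB) {
        int byField = fieldA.compareTo(fieldB);
        return byField != 0 ? byField : textA.compareTo(textB);
    }

    /** Throws if the given term sorts before the previously checked term. */
    void check(String field, String text) {
        if (compare(field, text, lastField, lastText) < 0) {
            throw new IllegalStateException(
                "Term out of order: '" + field + ":" + text
                + "' after '" + lastField + ":" + lastText + "'");
        }
        lastField = field;
        lastText = text;
    }

    public static void main(String[] args) {
        SortOrderGuard guard = new SortOrderGuard();
        // Same sequence as the log excerpt: two rand terms, then the blank term.
        guard.check("rand", "-00479945420");
        guard.check("rand", "-00506239929");
        try {
            guard.check("", "");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Throwing at the first bad term would give a stack trace pointing at whichever caller handed over the blank term, which the println output alone does not show.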

Here are the first few terms coming off a TermEnum later.  As you  
can see, the sort order is messed up.  That's because the .tis stream  
has gotten out of sync somehow.

     [junit] TERMS:
     [junit] rand:26876979  // <- the last few digits of that number from earlier
     [junit] rand:31589361
     [junit] rand:63669564
     [junit] rand:638261924
     [junit] rand:733778983
     [junit] rand:770310547
     [junit] rand:806409190
     [junit] rand:849606785
     [junit] rand:869935672
     [junit] rand:927974448
     [junit] rand:953139433
     [junit] rand:954514004
     [junit] rand:961290394
     [junit] rand:1067018129
     [junit] rand:1081398108
     [junit] rand:1094000683
     [junit] rand:1139978555
     [junit] rand:1231799109
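
The truncated values look consistent with shared-prefix coding, where each term is stored as the number of leading characters it reuses from the previous term plus the remaining suffix. The sketch below is a simplified illustration of that idea, not the actual TermInfosWriter code; FrontCodingDemo and its method names are hypothetical. It encodes the text of the five rand terms written after the blank term and shows that the bare suffixes match the first four truncated values in the TERMS listing. It does not explain where the blank term comes from.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified shared-prefix coding of sorted term texts (not Lucene code). */
final class FrontCodingDemo {

    /** One encoded entry: how many leading chars to reuse, plus the new tail. */
    static final class Entry {
        final int prefixLength;
        final String suffix;

        Entry(int prefixLength, String suffix) {
            this.prefixLength = prefixLength;
            this.suffix = suffix;
        }
    }

    /** Encodes each text relative to the one written just before it. */
    static List<Entry> encode(List<String> texts) {
        List<Entry> out = new ArrayList<>();
        String previous = "";
        for (String text : texts) {
            int shared = 0;
            int limit = Math.min(previous.length(), text.length());
            while (shared < limit && previous.charAt(shared) == text.charAt(shared)) {
                shared++;
            }
            out.add(new Entry(shared, text.substring(shared)));
            previous = text;
        }
        return out;
    }

    /** In-sync reader: rebuilds each text from the previously decoded text. */
    static List<String> decode(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String previous = "";
        for (Entry e : entries) {
            String text = previous.substring(0, e.prefixLength) + e.suffix;
            out.add(text);
            previous = text;
        }
        return out;
    }

    /** What is left when the shared prefix is lost: the suffixes alone. */
    static List<String> suffixesOnly(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        for (Entry e : entries) {
            out.add(e.suffix);
        }
        return out;
    }

    public static void main(String[] args) {
        // Text parts of the rand terms written right after the blank term.
        List<String> texts = Arrays.asList(
            "-00512006124", "-00526876979", "-00531589361",
            "-00563669564", "-00638261924");
        List<Entry> entries = encode(texts);
        System.out.println("in sync:       " + decode(entries));
        System.out.println("suffixes only: " + suffixesOnly(entries));
    }
}
```

The suffixes of the second through fifth entries are 26876979, 31589361, 63669564 and 638261924, the same strings that show up after "rand:" at the top of the listing.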

I'm stumped for now.

Marvin Humphrey
Rectangular Research
