lucene-java-user mailing list archives

From "Marcus Falck" <marcus.fa...@observer.se>
Subject Re: Flushing RAMDir into FSDir
Date Wed, 28 Jun 2006 11:38:33 GMT
I made a clone of the AddIndexes method that leaves out the Optimize() calls.
See the code below. Does anybody see any problems with using the
AddIndexesWithoutOptimize method?


// Original
        public virtual void AddIndexes(Directory[] dirs)
        {
            lock (this)
            {
                Optimize(); // start with zero or 1 seg
                for (int i = 0; i < dirs.Length; i++)
                {
                    SegmentInfos sis = new SegmentInfos(); // read infos from dir
                    sis.Read(dirs[i]);
                    for (int j = 0; j < sis.Count; j++)
                    {
                        segmentInfos.Add(sis.Info(j)); // add each info
                    }
                }
                Optimize(); // final cleanup
            }
        }
// New: same as the original, but without the Optimize() calls
        public virtual void AddIndexesWithoutOptimize(Directory[] dirs)
        {
            lock (this)
            {
                for (int i = 0; i < dirs.Length; i++)
                {
                    SegmentInfos sis = new SegmentInfos(); // read infos from dir
                    sis.Read(dirs[i]);
                    for (int j = 0; j < sis.Count; j++)
                    {
                        segmentInfos.Add(sis.Info(j)); // add each info
                    }
                }
                MaybeMergeSegments();
            }
        }

-----Original Message-----
From: Marcus Falck [mailto:marcus.falck@observer.se]
Sent: 28 June 2006 11:46
To: java-user@lucene.apache.org
Subject: Flushing RAMDir into FSDir

Hi,

I have a Lucene-based host application that receives content for
indexing from fetcher applications.

Since fresh content arrives all the time, I want full control
over the disk write process.

So I ended up using a RAMDirectory and an FSDirectory.

When content arrives at the application, a representation of the
content is written to disk and the content itself is added as a
Lucene document to the RAMDirectory. When the number of documents in
the RAMDirectory exceeds my limit, I flush the RAMDirectory into the
FSDirectory (the limit is the same value I use for minMergeDocs).
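
A minimal sketch of the buffer-and-flush scheme just described, written without any Lucene calls. FlushingBuffer and its flush callback are hypothetical names: the in-memory list stands in for the RAMDirectory, and the callback stands in for whatever moves a batch into the FSDirectory. It assumes a single indexing thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the RAMDirectory/FSDirectory pair: documents are
// buffered in memory and handed to a flush callback once the limit is exceeded.
public class FlushingBuffer<T>
{
    private readonly int limit;
    private readonly Action<IList<T>> flush;
    private List<T> buffer = new List<T>();

    public FlushingBuffer(int limit, Action<IList<T>> flush)
    {
        if (limit <= 0) throw new ArgumentOutOfRangeException("limit");
        if (flush == null) throw new ArgumentNullException("flush");
        this.limit = limit;
        this.flush = flush;
    }

    // Number of documents currently waiting in memory.
    public int Count { get { return buffer.Count; } }

    // Buffer one document; flush when the count exceeds the limit.
    public void Add(T doc)
    {
        buffer.Add(doc);
        if (buffer.Count > limit) Flush();
    }

    // Hand the whole batch to the callback and start a fresh buffer.
    public void Flush()
    {
        if (buffer.Count == 0) return;
        List<T> batch = buffer;
        buffer = new List<T>();
        flush(batch);
    }
}
```

In the real application the callback would be the place where the flush to disk happens; an explicit Flush() at shutdown writes out whatever is still buffered.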

This works fine as long as I use the

fsIndexWriter.AddIndexes(new Directory[] { myRamDir });

approach. However, this approach seems to write the new content, together
with the existing content, into one new large file (as long as maxMergeDocs
hasn't been exceeded).

For example, if I have minMergeDocs set to 5000 and maxMergeDocs set to
15000, and I flush docs 5001-10000 from the RAMDirectory to disk, I end up
with 1 file containing docs 1-10000 rather than 2 files (1-5000 and
5001-10000).
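
The arithmetic of this example can be made concrete with a toy model in which an index is just a list of segment sizes (document counts). This is only an illustration of the two outcomes described above and calls nothing in Lucene; SegmentModel and both method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: an index is a list of segment sizes (document counts).
public static class SegmentModel
{
    // Mimics the observed AddIndexes outcome: existing and incoming
    // documents all end up in a single segment.
    public static List<int> AddWithOptimize(List<int> index, List<int> incoming)
    {
        int total = 0;
        foreach (int n in index) total += n;
        foreach (int n in incoming) total += n;
        List<int> result = new List<int>();
        if (total > 0) result.Add(total);
        return result;
    }

    // The desired outcome: incoming segments are appended unchanged.
    public static List<int> AddWithoutMerge(List<int> index, List<int> incoming)
    {
        List<int> result = new List<int>(index);
        result.AddRange(incoming);
        return result;
    }
}
```

Starting from one 5000-document segment and adding another 5000 documents, AddWithOptimize yields a single segment of 10000 documents, while AddWithoutMerge yields two segments of 5000 each.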

Is there any way to get it to add the new index as a new file instead?
The current behavior has a severe negative effect on my indexing.

Regards,
Marcus Falck

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

