lucene-dev mailing list archives

From "Tony Schwartz" <t...@simpleobjects.com>
Subject Re: if delete all docs in segment - when is segment deleted
Date Wed, 20 Jul 2005 12:04:57 GMT
I added the following code to the close() method of IndexWriter to detect, on close,
any segment in which every document has been deleted, and to remove it.  Does anyone
see a problem with this?

=================================
  public synchronized void close() throws IOException {
    flushRamSegments();
    ramDirectory.close();

    if (directory instanceof FSDirectory && closeDir) {
      // Find segments whose documents have all been deleted and remove them.
      final Vector deletable = new Vector();
      // Walk backwards so that removing an entry does not shift unvisited indexes.
      for (int i = segmentInfos.size() - 1; i >= 0; i--) {
        SegmentReader reader = SegmentReader.get(segmentInfos.info(i));
        boolean empty = reader.numDocs() <= 0;  // numDocs() excludes deleted docs
        reader.close();                         // release file handles before any delete
        if (empty) {
          deletable.add(reader);
          segmentInfos.remove(i);               // keep it out of the committed segments file
        }
      }
      if (!deletable.isEmpty()) {
        synchronized (directory) {              // in- & inter-process sync
          new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), COMMIT_LOCK_TIMEOUT) {
            public Object doBody() throws IOException {
              segmentInfos.write(directory);    // commit before deleting
              deleteSegments(deletable);        // delete now-unused segments
              return null;
            }
          }.run();
        }
      }
    }

    if (writeLock != null) {
      writeLock.release();                      // release write lock
      writeLock = null;
    }
    if (closeDir)
      directory.close();
  }
=================================
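
The ordering that matters in the patch (take the empty segment out of the segment
list, commit the list, and only then delete the files) can be sketched outside
Lucene.  This is a minimal, hypothetical model, not Lucene code: the Segment class
and the pruneEmpty() helper are made up for illustration, and "committing" is
represented simply by the list that remains.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SegmentPruneSketch {

    /** Hypothetical stand-in for a segment: a name, a total doc count, a deleted count. */
    static final class Segment {
        final String name;
        final int maxDoc;
        final int deleted;

        Segment(String name, int maxDoc, int deleted) {
            this.name = name;
            this.maxDoc = maxDoc;
            this.deleted = deleted;
        }

        /** Live documents only, mirroring the "numDocs excludes deleted docs" comment. */
        int numDocs() {
            return maxDoc - deleted;
        }
    }

    /**
     * Removes every segment with no live documents from {@code live} and returns
     * the removed segments, whose files would be deleted after the commit.
     */
    static List<Segment> pruneEmpty(List<Segment> live) {
        List<Segment> removed = new ArrayList<>();
        for (Iterator<Segment> it = live.iterator(); it.hasNext(); ) {
            Segment s = it.next();
            if (s.numDocs() <= 0) {
                it.remove();      // first drop it from the list that gets committed
                removed.add(s);   // then remember it for file deletion
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Segment> live = new ArrayList<>();
        live.add(new Segment("_a", 1000000, 1000000)); // every doc deleted
        live.add(new Segment("_b", 500, 20));          // still has live docs
        live.add(new Segment("_c", 10, 10));           // every doc deleted

        List<Segment> removed = pruneEmpty(live);
        System.out.println("committed: " + live.size()
            + ", files to delete: " + removed.size());
    }
}
```

Running main() prints "committed: 1, files to delete: 2".  The point of the sketch
is only that the committed list never names a segment whose files are about to go
away.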

Tony Schwartz
tony@simpleobjects.com
"What we need is more cowbell."




> If every doc in a segment is deleted, when does the segment go away?
> Without me having to dig too deep, I was hoping someone could help me prepare for this
> eventuality.  I have an index that grows indefinitely.  Old docs are deleted each day just
> before new docs for that day are added.  If I set MaxMergeDocs to some number, say 1
> million, and the segment has 1 million docs in it, and every doc in that segment is
> deleted, will the segment ever be deleted?  If not, how difficult would it be to add
> some type of trigger to detect this "all deleted in segment" condition so Lucene could
> remove the huge segment to free disk space?  I'm concerned the segment will never be
> deleted.
>
> Tony Schwartz
> tony@simpleobjects.com
> There are 10 types of people in this world.  Ones that understand binary and ones that
> don't.
>
>
>



