lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: Tlog File not removed after hard commit
Date Mon, 25 Mar 2013 21:34:41 GMT
On 3/24/2013 10:02 AM, Niran Fajemisin wrote:
> We import about 1.5 million documents on a nightly basis using DIH. During this time,
> we need to ensure that all documents make it into index otherwise rollback on any errors;
> which DIH takes care of for us. We also disable autoCommit in DIH but instruct it to commit
> at the very end of the import. This is all done through configuration of the DIH config XML
> file and the command issued to the request handler.
>
> We have noticed that the tlog file appears to linger around even after DIH has issued
> the hard commit. My expectation would be that after the hard commit has occurred, the tlog
> file will be removed. I'm obviously misunderstanding how this all works.

You've already gotten the reason for the giant tlog hanging around.

The way to actually fix this problem is to turn on autoCommit with one 
of the values set relatively low.  The key to enabling autoCommit 
without changing anything about how your import process works is this: 
make sure that openSearcher is set to false in the autoCommit:

<updateHandler class="solr.DirectUpdateHandler2">
   <autoCommit>
     <maxDocs>25000</maxDocs>
     <maxTime>300000</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>
   <updateLog />
</updateHandler>

I prefer to set maxDocs low rather than maxTime, but that's up to you.  Each hard 
commit done by autoCommit will create a new tlog, and each tlog will be 
fairly small.  Only a few of them will be kept around, so the disk space 
requirement will be small, and restarting Solr will be fast because 
there won't be a lot of data to replay.
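
To put rough numbers on that, here is a small Python sketch. It is not Solr code, only a toy model: it assumes every hard commit closes the current tlog and starts a new one, and that only the newest few closed logs are retained. The keep_logs value of 10 is an illustrative assumption, not a setting taken from this thread.

```python
def simulate_tlogs(total_docs, max_docs, keep_logs=10):
    """Toy model of tlog rotation driven by autoCommit maxDocs.

    Returns (commits, retained), where `retained` lists the document
    counts of the closed tlogs still kept at the end of the import.
    keep_logs is a hypothetical retention limit, used for illustration.
    """
    # Every max_docs documents, autoCommit fires and closes one tlog.
    full_logs, remainder = divmod(total_docs, max_docs)
    sizes = [max_docs] * full_logs
    if remainder:
        # The explicit commit at the end of the import closes the last,
        # partially filled tlog.
        sizes.append(remainder)
    # Only the newest keep_logs closed tlogs are retained.
    return len(sizes), sizes[-keep_logs:]


commits, retained = simulate_tlogs(1_500_000, 25_000)
print(commits, len(retained), max(retained))  # 60 10 25000
```

Under these assumptions, a 1.5 million document import with maxDocs at 25000 goes through 60 hard commits, and no single tlog ever covers more than 25000 documents, compared with one tlog covering all 1.5 million when autoCommit is off.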

With openSearcher set to false, there will be NO changes in document 
visibility.  Searches will continue using the old searcher, so the old 
documents will still be there and the new documents will NOT be 
searchable until DIH does its explicit commit at the end.
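
For reference, the two kinds of commit request discussed here can be sketched as URLs. This is a minimal Python sketch under stated assumptions: the base URL, the core name "collection1" and the "dataimport" handler path are placeholders for whatever your installation uses, and the helper function names are made up for illustration. The request parameters themselves (command=full-import and commit=true for DIH, commit=true and openSearcher=false for the update handler) are standard Solr parameters.

```python
from urllib.parse import urlencode


def dih_full_import_url(base_url, core, handler="dataimport"):
    """URL for a DIH full import that commits once at the very end.

    base_url, core and handler are assumptions about the local setup.
    """
    params = {"command": "full-import", "commit": "true"}
    return f"{base_url}/{core}/{handler}?{urlencode(params)}"


def hard_commit_url(base_url, core, open_searcher=False):
    """URL for an explicit hard commit through the update handler.

    With open_searcher=False the commit flushes to disk without
    opening a new searcher, so document visibility does not change.
    """
    params = {"commit": "true", "openSearcher": str(open_searcher).lower()}
    return f"{base_url}/{core}/update?{urlencode(params)}"


print(dih_full_import_url("http://localhost:8983/solr", "collection1"))
print(hard_commit_url("http://localhost:8983/solr", "collection1"))
```

The second URL is the manual equivalent of what autoCommit does with openSearcher set to false; the first is the import whose final commit makes the new documents searchable.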

The one thing that I'm not sure about is what happens if Solr or the 
machine crashes in the middle of the import.  Complete rollback might 
not be possible.  Someone with better knowledge may have to comment there.

Thanks,
Shawn

