lucene-solr-dev mailing list archives

From "Lance Norskog (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1383) Replication causes master to fail to delete old index files
Date Sat, 29 Aug 2009 21:52:32 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749190#action_12749190 ]

Lance Norskog commented on SOLR-1383:
-------------------------------------

Here is an attempt at a completely described test scenario. This test scenario requires a
Linux system or other system that supports the /proc/PROCESSID/fd feature.

----------------------------------------------------------------------------------------------------------------------------------------------------



Do a full checkout and build of the Solr 1.4 trunk.
Make two copies of the example directory: call them 'master' and 'slave'.
In slave/etc/jetty.xml, change port 8983 to port 8080. The port number appears in two places in the file.
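
The port change can also be scripted. This is a minimal sketch, not part of the original procedure: the helper name `change_jetty_port` is hypothetical, and it assumes the old port number appears in the file only where it is used as a port.

```shell
# Replace every occurrence of one port number with another in a config file.
# Usage: change_jetty_port FILE OLD_PORT NEW_PORT
change_jetty_port() {
    file=$1
    old=$2
    new=$3
    # Write to a temporary file first so a failed sed leaves the original intact.
    sed "s/$old/$new/g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example (not run here):
#   change_jetty_port slave/etc/jetty.xml 8983 8080
#   grep -c 8080 slave/etc/jetty.xml   # counts the lines that now mention 8080
```

Check the result by hand afterwards; a blanket substitution would also rewrite any unrelated occurrence of the number.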

In master/solr/conf/solrconfig.xml, uncomment this block:

{code:xml}
<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="master">
        <str name="replicateAfter">commit</str>
         <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
</requestHandler>
{code}

In slave/solr/conf/solrconfig.xml, uncomment this block:
{code:xml}
<requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
        <str name="masterUrl">http://localhost:8983/solr/replication</str>
        <str name="pollInterval">00:00:60</str>
     </lst>
</requestHandler>
{code}
Now, in both master/ and slave/, run 'java -jar start.jar'. The master and slave Solr instances
should now be running. You can test this by loading the URLs [http://localhost:8983/solr/admin]
and [http://localhost:8080/solr/admin].

In a text window, go to the example/exampledocs directory and run this script:

{code}
for x in *.xml
do
	echo $x
	sh post.sh $x
	sleep 10
	curl "http://localhost:8080/solr/replication?command=fetchindex"
	sleep 10
done
{code}

This prints the name of each example file, indexes it on the master, and issues a fetchindex
replication command to the slave. At the end of this script, the master and slave
solr/data/index files will be identical.

Now, kill the master and slave Solr instances, remove the solr/data/index directories, and restart
them. Save the process IDs of the master and slave Java processes.
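
One way to capture the process IDs is to launch each instance in the background from a shell and record `$!`. This is a sketch under that assumption; the helper name `start_in_dir` and the `server.pid` and `server.log` file names are hypothetical, not something Solr provides.

```shell
# Start a command inside a directory in the background and record its PID.
# Usage: start_in_dir DIR COMMAND [ARGS...]
start_in_dir() {
    dir=$1
    shift
    (
        cd "$dir" || exit 1
        # exec replaces the subshell, so $! below is the PID of the command itself.
        exec "$@" > server.log 2>&1
    ) &
    echo $! > "$dir/server.pid"
}

# Example (not run here):
#   start_in_dir master java -jar start.jar
#   start_in_dir slave  java -jar start.jar
#   ls -l /proc/$(cat master/server.pid)/fd
```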

Run the test script without sleep breaks:
{code}
for x in *.xml
do
	echo $x
	sh post.sh $x
	curl "http://localhost:8080/solr/replication?command=fetchindex"
done
{code}

At this point you should have a small set of files in the slave solr/data/index/ directory.
Their names will probably start with _a, _b, and _c. The master solr/data/index/ directory
will have these files and also older files starting with _0, _1, _2 and so on. These are
older-generation files in the Lucene index and should be deleted at some point.
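
To see which files exist only on the master, the two directory listings can be compared. This is a minimal sketch; the function name `files_only_in` is hypothetical, and it compares file names only, not contents.

```shell
# Print file names present in the first directory but not in the second.
# Usage: files_only_in DIR_A DIR_B
files_only_in() {
    a=$(mktemp)
    b=$(mktemp)
    ls "$1" | sort > "$a"
    ls "$2" | sort > "$b"
    # comm -23 keeps the lines that are unique to the first sorted list.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Example (not run here):
#   files_only_in master/solr/data/index slave/solr/data/index
```

Older segments_N files that exist only on the master will show up in this output as well.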

On the master do 'ls -l /proc/PID/fd' where PID is the process ID of the master Java process.
This shows all of the open file descriptors of the process. The old files (starting _0, _1,
_2 etc) are not held open by the master process. The master Java process only holds open the
same index files that are in the slave solr/data/index/.
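
The same check can be automated by matching the directory listing against the /proc entries. This is a sketch assuming a Linux /proc filesystem and permission to read the target process's fd directory; the function name `unreferenced_index_files` is hypothetical.

```shell
# List files in a directory that the given process does NOT hold open.
# Usage: unreferenced_index_files PID INDEX_DIR
unreferenced_index_files() {
    pid=$1
    # Resolve to a physical absolute path, since /proc reports resolved paths.
    dir=$(cd "$2" && pwd -P)
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        held=no
        for fd in /proc/"$pid"/fd/*; do
            # Each fd entry is a symlink to the file the descriptor refers to.
            if [ "$(readlink "$fd" 2>/dev/null)" = "$f" ]; then
                held=yes
                break
            fi
        done
        if [ "$held" = no ]; then
            basename "$f"
        fi
    done
}

# Example (not run here), where PID is the master Java process ID:
#   unreferenced_index_files PID master/solr/data/index
```

Files such as segments.gen are normally not held open either, so the output needs to be read alongside the directory listing rather than treated as a list of leaked files.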


> Replication causes master to fail to delete old index files
> -----------------------------------------------------------
>
>                 Key: SOLR-1383
>                 URL: https://issues.apache.org/jira/browse/SOLR-1383
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java)
>         Environment: Linux CentOS - latest Solr 1.4 trunk - Java 1.6
>            Reporter: Lance Norskog
>             Fix For: 1.4
>
>
> I have developed a way to make replication leave old index files in the master's data/index
> directory. It is timing-dependent. A sequence of commands runs correctly or fails, depending
> on the timing between the commands.
> Here is the test scenario:
> Start a master and slave version of the Solr distributed example. I used 8080 for the
> slave. (See example/etc/jetty.xml)
> Be sure to start with empty solr/data/index files on both master and slave.
> Open the replication administration jsp on the slave ( http://localhost:8080/solr/admin/replication/index.jsp )
> Disable polling.
> In a text window, go to the example/exampledocs directory and run this script
> {code}
> for x in *.xml
> do
> 	echo $x
> 	sh post.sh $x
> 	sleep 15
> 	curl "http://localhost:8080/solr/replication?command=fetchindex"
> done
> {code}
> This prints each example file, indexes it, and does a replication command. At the end
> of this exercise, the master and slave solr/data/index files will be identical.
> Now, kill master & slave, remove the solr/data/index directories, and start over.
> This time, remove the sleep command from the script. In my environment, old Lucene index
> files were left in the master's data/index. Here is what is left in the master data/index.

>  The segments_? files are random across runs, but the index files left over are consistent.
> Note (courtesy of the Linux 'ls -l /proc/PID/fd' command) that the old files are not
> kept open by the master solr; they are merely left behind.
> In the master server:
> {code}
> % ls solr/data/index
> _0.fdt  _1.prx  _2.tvx  _4.nrm  _5.tii  _7.frq  _8.tvd  _a.tvx  _c.nrm
> _0.fdx  _1.tii  _3.fdt  _4.prx  _5.tis  _7.nrm  _8.tvf  _b.fdt  _c.prx
> _0.fnm  _1.tis  _3.fdx  _4.tii  _6.fdt  _7.prx  _8.tvx  _b.fdx  _c.tii
> _0.frq  _2.fdt  _3.fnm  _4.tis  _6.fdx  _7.tii  _a.fdt  _b.fnm  _c.tis
> _0.nrm  _2.fdx  _3.frq  _4.tvd  _6.fnm  _7.tis  _a.fdx  _b.frq  segments.gen
> _0.prx  _2.fnm  _3.nrm  _4.tvf  _6.frq  _8.fdt  _a.fnm  _b.nrm  segments_8
> _0.tii  _2.frq  _3.prx  _4.tvx  _6.nrm  _8.fdx  _a.frq  _b.prx  segments_9
> _0.tis  _2.nrm  _3.tii  _5.fdt  _6.prx  _8.fnm  _a.nrm  _b.tii  segments_a
> _1.fdt  _2.prx  _3.tis  _5.fdx  _6.tii  _8.frq  _a.prx  _b.tis  segments_b
> _1.fdx  _2.tii  _4.fdt  _5.fnm  _6.tis  _8.nrm  _a.tii  _c.fdt  segments_c
> _1.fnm  _2.tis  _4.fdx  _5.frq  _7.fdt  _8.prx  _a.tis  _c.fdx  segments_d
> _1.frq  _2.tvd  _4.fnm  _5.nrm  _7.fdx  _8.tii  _a.tvd  _c.fnm
> _1.nrm  _2.tvf  _4.frq  _5.prx  _7.fnm  _8.tis  _a.tvf  _c.frq
> {code}
> {code}
> % ls -l /proc/PID/fd
> lr-x------ 1 root root 64 Aug 25 22:52 137 -> /index/master/solr/data/index/_a.tis
> lr-x------ 1 root root 64 Aug 25 22:52 138 -> /index/master/solr/data/index/_a.frq
> lr-x------ 1 root root 64 Aug 25 22:52 139 -> /index/master/solr/data/index/_a.prx
> lr-x------ 1 root root 64 Aug 25 22:52 140 -> /index/master/solr/data/index/_a.fdt
> lr-x------ 1 root root 64 Aug 25 22:52 141 -> /index/master/solr/data/index/_a.fdx
> lr-x------ 1 root root 64 Aug 25 22:52 142 -> /index/master/solr/data/index/_a.tvx
> lr-x------ 1 root root 64 Aug 25 22:52 143 -> /index/master/solr/data/index/_a.tvd
> lr-x------ 1 root root 64 Aug 25 22:52 144 -> /index/master/solr/data/index/_a.tvf
> lr-x------ 1 root root 64 Aug 25 22:52 145 -> /index/master/solr/data/index/_a.nrm
> lr-x------ 1 root root 64 Aug 25 22:52 72 -> /index/master/solr/data/index/_b.tis
> lr-x------ 1 root root 64 Aug 25 22:52 73 -> /index/master/solr/data/index/_b.frq
> lr-x------ 1 root root 64 Aug 25 22:52 74 -> /index/master/solr/data/index/_b.prx
> lr-x------ 1 root root 64 Aug 25 22:52 76 -> /index/master/solr/data/index/_b.fdt
> lr-x------ 1 root root 64 Aug 25 22:52 78 -> /index/master/solr/data/index/_b.fdx
> lr-x------ 1 root root 64 Aug 25 22:52 79 -> /index/master/solr/data/index/_b.nrm
> lr-x------ 1 root root 64 Aug 25 22:52 80 -> /index/master/solr/data/index/_c.tis
> lr-x------ 1 root root 64 Aug 25 22:52 81 -> /index/master/solr/data/index/_c.frq
> lr-x------ 1 root root 64 Aug 25 22:52 82 -> /index/master/solr/data/index/_c.prx
> lr-x------ 1 root root 64 Aug 25 22:52 83 -> /index/master/solr/data/index/_c.fdt
> lr-x------ 1 root root 64 Aug 25 22:52 84 -> /index/master/solr/data/index/_c.fdx
> lr-x------ 1 root root 64 Aug 25 22:52 85 -> /index/master/solr/data/index/_c.nrm
> {code}
> In the slave server:
> {code}
> % ls solr/data/index
> _a.fdt  _a.tvd  _b.prx  _c.prx
> _a.fdx  _a.tvf  _b.tii  _c.tii
> _a.fnm  _a.tvx  _b.tis  _c.tis
> _a.frq  _b.fdt  _c.fdt  lucene-d81c111653e4c4883a6fbd7e2effd596-n-write.lock
> _a.nrm  _b.fdx  _c.fdx  segments.gen
> _a.prx  _b.fnm  _c.fnm  segments_d
> _a.tii  _b.frq  _c.frq
> _a.tis  _b.nrm  _c.nrm
> {code}
> {code}
> % ls -l /proc/PID/fd
> lr-x------ 1 root root 64 Aug 25 22:57 139 -> /index/slave/solr/data/index/_a.tis
> lr-x------ 1 root root 64 Aug 25 22:57 140 -> /index/slave/solr/data/index/_a.frq
> lr-x------ 1 root root 64 Aug 25 22:57 141 -> /index/slave/solr/data/index/_a.prx
> lr-x------ 1 root root 64 Aug 25 22:57 142 -> /index/slave/solr/data/index/_a.fdt
> lr-x------ 1 root root 64 Aug 25 22:57 143 -> /index/slave/solr/data/index/_a.fdx
> lr-x------ 1 root root 64 Aug 25 22:57 144 -> /index/slave/solr/data/index/_a.tvx
> lr-x------ 1 root root 64 Aug 25 22:57 145 -> /index/slave/solr/data/index/_a.tvd
> lr-x------ 1 root root 64 Aug 25 22:57 146 -> /index/slave/solr/data/index/_a.tvf
> lr-x------ 1 root root 64 Aug 25 22:57 147 -> /index/slave/solr/data/index/_a.nrm
> lr-x------ 1 root root 64 Aug 25 22:57 4 -> /index/slave/solr/data/index/_b.tis
> lr-x------ 1 root root 64 Aug 25 22:57 75 -> /index/slave/solr/data/index/_b.frq
> lr-x------ 1 root root 64 Aug 25 22:57 76 -> /index/slave/solr/data/index/_b.prx
> lr-x------ 1 root root 64 Aug 25 22:57 77 -> /index/slave/solr/data/index/_b.fdt
> lr-x------ 1 root root 64 Aug 25 22:57 78 -> /index/slave/solr/data/index/_b.fdx
> lr-x------ 1 root root 64 Aug 25 22:57 79 -> /index/slave/solr/data/index/_b.nrm
> lr-x------ 1 root root 64 Aug 25 22:57 80 -> /index/slave/solr/data/index/_c.tis
> lr-x------ 1 root root 64 Aug 25 22:57 81 -> /index/slave/solr/data/index/_c.frq
> lr-x------ 1 root root 64 Aug 25 22:57 82 -> /index/slave/solr/data/index/_c.prx
> lr-x------ 1 root root 64 Aug 25 22:57 83 -> /index/slave/solr/data/index/_c.fdt
> lr-x------ 1 root root 64 Aug 25 22:57 84 -> /index/slave/solr/data/index/_c.fdx
> lr-x------ 1 root root 64 Aug 25 22:57 85 -> /index/slave/solr/data/index/_c.nrm
> lrwx------ 1 root root 64 Aug 25 22:57 86 -> /index/slave/solr/data/index/lucene-d81c111653e4c4883a6fbd7e2effd596-n-write.lock
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

