lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-4661) Index Version & Gen Number out of sync on Admin UI
Date Tue, 02 Apr 2013 19:03:16 GMT

     [ https://issues.apache.org/jira/browse/SOLR-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-4661:
---------------------------

    Attachment: hoss_test.zip

bq. Step to reproduce this issue is simple. Just apply commit on master without posting any
documents

Ah... ok, wait a minute - this looks promising.  in reviewing the logs you sent to solr-user@lucene
i didn't notice that this "empty commit" was happening, but in attempting to reproduce i think
i see what you're talking about.

Steps i followed...


{panel}
1) took the Solr 4.2 example/solr dir and cloned it as "master-home"

2) edited the /replication handler to match what you posted in this issue (but adjusted the
replication time to 30 seconds for a faster test)

3) cloned "master-home" as "slave-home"

4) ran two instances of Solr 4.2 using the following commands...

{noformat}
java -Denable.master=true -Dmaster.port=8999 -Djetty.port=8999 -Dsolr.solr.home=/home/hossman/tmp/slave_version_higher/master-home -jar start.jar &> /home/hossman/tmp/slave_version_higher/master.log
java -Denable.slave=true -Dmaster.port=8999 -Djetty.port=9999 -Dsolr.solr.home=/home/hossman/tmp/slave_version_higher/slave-home -jar start.jar &> /home/hossman/tmp/slave_version_higher/slave.log
{noformat}

5) ran two scripts to monitor replication details using the following commands...

{noformat}
while true; do date --utc && curl -sS "http://localhost:9999/solr/collection1/replication?command=details&indent=true&wt=json" && echo && echo && sleep 2; done &> slave_rep_details.txt
while true; do date --utc && curl -sS "http://localhost:8999/solr/collection1/replication?command=details&indent=true&wt=json" && echo && echo && sleep 2; done &> master_rep_details.txt
{noformat}

6) triggered an indexing of all the example docs on master, and waited for replication.

{noformat}
java -Durl=http://localhost:8999/solr/collection1/update -jar post.jar *.xml
{noformat}

7) triggered an explicit commit on master...

{noformat}
java -Durl=http://localhost:8999/solr/collection1/update -jar post.jar -
{noformat}

8) shutdown both servers and the scripts (Ctrl-C)
{panel}

I've attached the full logs and home dirs at the completion of this test, but as a summary
of the results...

{panel}
a) slave & master index files are identical except for segments.gen

b) the master's replication details indicate that the current commit being used is "indexVersion#1364927050819,
generation#2" but its list of commits does not include this, it contains a single commit
of "indexVersion#1364927114002, generation#3"

c) the slave's replication details indicate that the current commit being used is "indexVersion#1364927050819,
generation#2" and that this is the only commit it has locally.  The slave's information about
the master is consistent with what the master itself reports.
{panel}
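
For anyone who wants to re-check observation (a) against the attached home dirs, here is a minimal sketch that compares two index directories file by file. It is only an illustration: the function names are made up for this example, and it assumes both directories are flat (no subdirectories that matter), which is what a Lucene index dir normally looks like.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def diff_index_dirs(master_dir, slave_dir):
    """Return sorted names of files that exist on only one side or differ in content.

    Hypothetical helper for this example; not part of Solr or Lucene.
    """
    master_files = set(os.listdir(master_dir))
    slave_files = set(os.listdir(slave_dir))

    # Files present on only one side always count as a difference.
    differing = master_files ^ slave_files

    # Files present on both sides count only if their bytes differ.
    for name in master_files & slave_files:
        master_path = os.path.join(master_dir, name)
        slave_path = os.path.join(slave_dir, name)
        if os.path.isfile(master_path) and os.path.isfile(slave_path):
            if file_digest(master_path) != file_digest(slave_path):
                differing.add(name)

    return sorted(differing)
```

Calling `diff_index_dirs()` with the master's and the slave's index directory (in the stock example layout that should be `collection1/data/index` under each home dir, but treat that path as an assumption) returns the list of file names that differ.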

I'm not certain, but I believe this is just an optimization where the searcher is not re-opened
when the currently opened "commit" is identical to the new commit -- this optimization is
working on the master, but apparently not on the slave (maybe the slave can't tell that the
commits are identical?)

FWIW: after running this test, i restarted the master and its replication details were consistent
with the list of commit points -- it was using generation #3.
You can also observe the exact same behavior from master's replication details (current generation
lower than the generation of any commit point) if you do a hard commit with openSearcher=false.
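
A quick way to spot this state from the outside is to check whether the "current" indexVersion/generation pair in the details response appears in its list of commits. The sketch below is an assumption-laden illustration: it parses the XML layout quoted in the issue description (which the response itself flags as experimental), the function names are invented for this example, and the sample is a trimmed copy of that response with the {color:red} markup removed.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the master's response quoted in the issue description.
SAMPLE = """<response>
<lst name="details">
<arr name="commits">
<lst>
<long name="indexVersion">1364835609803</long>
<long name="generation">34</long>
</lst>
</arr>
<long name="indexVersion">1364778010902</long>
<long name="generation">31</long>
</lst>
</response>"""


def _version_and_generation(node):
    """Read the direct-child indexVersion and generation of a <lst> element."""
    version = int(node.find("./long[@name='indexVersion']").text)
    generation = int(node.find("./long[@name='generation']").text)
    return version, generation


def commit_summary(details_xml):
    """Return (current_commit, listed_commits) as (indexVersion, generation) tuples."""
    root = ET.fromstring(details_xml)
    details = root.find("./lst[@name='details']")
    current = _version_and_generation(details)
    commits = [
        _version_and_generation(commit)
        for commit in details.findall("./arr[@name='commits']/lst")
    ]
    return current, commits


def current_commit_is_listed(details_xml):
    """True when the commit the searcher reports is one of the listed commit points."""
    current, commits = commit_summary(details_xml)
    return current in commits


print(current_commit_is_listed(SAMPLE))  # prints False
```

A False result here does not by itself mean replication is broken; it only flags the mismatch described above, where the open searcher still points at an older commit than the newest commit point.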
 

----

I think most of the behavior here makes sense -- the slave is replicating the commits from
the master, even if the master isn't using them yet because it hasn't opened a new searcher.
 The key questions i wonder about:

1) why was segments.gen different when i ran my experiment? is that normal?
2) Assuming i'm correct about there being an optimization to not open a new searcher if the
commits are identical, can we make this same optimization work on slaves in the case of replication?




                
> Index Version & Gen Number out of sync on Admin UI
> --------------------------------------------------
>
>                 Key: SOLR-4661
>                 URL: https://issues.apache.org/jira/browse/SOLR-4661
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java), web gui
>    Affects Versions: 4.2
>         Environment: Solr 4.2 on Linux with JBoss 7.1.1, JDK 1.7
>            Reporter: Aditya
>              Labels: gui, replication, web
>         Attachments: hoss_test.zip, IndexVersionSyncIssue.jpg
>
>
> Index and Gen number on Slave is higher than master.
> If you apply commit on master with no pending docs then the commit time stamp and gen are incremented. When a Slave polls master for replication it sees the index version difference and starts replicating but all files are skipped.
> On Admin UI (on Slaves) the version number displayed for master is old whereas for slave it is the latest, which is higher than master.
> Below is the response from master (/replication?command=details) where i see two different Version and Gen numbers. This creates confusion of having version out of sync, though it's not.

> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">1</int>
> </lst>
> <lst name="details">
> <str name="indexSize">1.52 GB</str>
> <str name="indexPath">/storage/solrdata/index/</str>
> <arr name="commits">
> <lst>
> <long name="indexVersion">{color:red}1364835609803{color}</long>
> <long name="generation">{color:red}34{color}</long>
> <arr name="filelist">...</arr>
> </lst>
> </arr>
> <str name="isMaster">true</str>
> <str name="isSlave">false</str>
> <long name="indexVersion">{color:red}1364778010902{color}</long>
> <long name="generation">{color:red}31{color}</long>
> <lst name="master">
> <str name="confFiles">schema.xml</str>
> <arr name="replicateAfter">
> <str>commit</str>
> <str>startup</str>
> </arr>
> <str name="replicationEnabled">true</str>
> <long name="replicatableGeneration">34</long>
> </lst>
> </lst>
> <str name="WARNING">
> This response format is experimental. It is likely to change in the future.
> </str>
> </response>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

