lucene-dev mailing list archives

From "Christian Moen (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (SOLR-3282) Perform Kuromoji/Japanese stability test before 3.6 freeze
Date Wed, 28 Mar 2012 02:49:23 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239714#comment-13239714 ]

Christian Moen edited comment on SOLR-3282 at 3/28/12 2:48 AM:
---------------------------------------------------------------

h3. Test 4 - Combined search and indexing test

In this test, we index all of Japanese Wikipedia while running searches at the same time.

The search rate is a constant 10 QPS with highlighting.  The queries in this test are the
same set used in the tests above, and each query in the set is unique.
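
For readers who want to reproduce a constant-rate query load, here is a minimal sketch in Python. It is not the harness used for this test. The function names, the base URL and the pacing strategy are assumptions made for illustration. The only Solr-specific detail is the standard {{hl=true}} request parameter, which turns highlighting on.

```python
import time
from urllib.parse import urlencode


def build_query_url(base_url, query, highlight=True):
    """Build a Solr select URL; hl=true switches highlighting on."""
    params = {"q": query}
    if highlight:
        params["hl"] = "true"
    return base_url + "?" + urlencode(params)


def run_at_constant_rate(queries, qps, send, clock=time.monotonic, sleep=time.sleep):
    """Call send(query) for each query, paced at a fixed rate.

    Query i is scheduled at start + i / qps, so one slow response does not
    push all later queries back (no cumulative drift).
    """
    interval = 1.0 / qps
    start = clock()
    sent = 0
    for i, query in enumerate(queries):
        delay = start + i * interval - clock()
        if delay > 0:
            sleep(delay)
        send(query)  # hypothetical callback, e.g. an HTTP GET of the built URL
        sent += 1
    return sent
```

The clock and sleep functions are injectable so the pacing logic can be checked without waiting in real time. In a real run, {{send}} would fetch {{build_query_url(...)}} for each query.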

Solr is started using

{noformat}
java -verbose:gc -Xmx256m  -Dfile.encoding=UTF-8 -jar start.jar
{noformat}

so I've given it a little more heap because of the memory pressure issue seen in _Test 3_.

The indexing posts the XML described in _Test 1_, with each file containing 1,000 documents.
Unlike in _Test 1_, we now do a commit after each post.  No optimize is done.
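
As a rough illustration of "commit after each post", the sketch below builds one HTTP POST per XML file and asks Solr to commit as part of the same request through the {{commit=true}} parameter of the update handler. The update URL and the helper name are assumptions, and this is not the script used in the test.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(update_url, xml_bytes, commit=True):
    """Build (but do not send) an HTTP POST for one XML file of documents."""
    url = update_url
    if commit:
        # commit=true asks Solr to commit once this post has been processed
        url += "?" + urlencode({"commit": "true"})
    return Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
```

Sending the request is then a matter of passing it to {{urllib.request.urlopen}}, once per file.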

The test ran for 8 hours and 33 minutes before I stopped it, and 312,900 queries were run
in that time.  Japanese Wikipedia was indexed 23 times.
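
The figures above can be turned into rough rates with a little arithmetic. This is only a back-of-the-envelope sketch. It assumes all 23 indexing passes were complete and takes the approximate figure of 1.4M documents per pass from the issue description, so the derived numbers are estimates rather than measurements.

```python
duration_s = 8 * 3600 + 33 * 60   # 8 h 33 min = 30,780 seconds
queries = 312_900
index_passes = 23                 # assumed to be complete passes
docs_per_pass = 1_400_000         # approximate, from the issue description
docs_per_file = 1_000             # one post (and one commit) per file

effective_qps = queries / duration_s
minutes_per_pass = duration_s / index_passes / 60
commits = index_passes * docs_per_pass // docs_per_file
seconds_per_commit = duration_s / commits

print(f"effective query rate: {effective_qps:.2f} QPS")       # about 10.17
print(f"time per indexing pass: {minutes_per_pass:.1f} min")  # about 22.3
print(f"estimated commits: {commits}")                        # 32200
print(f"average gap between commits: {seconds_per_commit:.2f} s")  # about 0.96
```

Under these assumptions the measured query rate is close to the nominal 10 QPS, and Solr was committing roughly once per second throughout the run.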

Full GC occurred 84 times, and the heap grew to the maximum size provided to the VM.  The
longest Full GC times are given below.

|| Longest Full GC (seconds) ||
|1.0789668|
|1.0518156|
|1.0288781|
|0.9973905|
|0.9799409|
|0.9582144|
|0.9555027|
|0.9517524|
|0.9456611|
|0.9387380|
|0.9313493|
|0.9117388|
|0.8771426|
|...|


The longest regular (non-Full) GC times are below.

|| Longest non-Full GC (seconds) ||
|0.1375324|
|0.1206866|
|0.1009028|
|0.0952712|
|0.0928364|
|...|
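
Tables like the two above can be produced from a {{-verbose:gc}} log with a short script. The sketch below assumes the classic HotSpot line shape, for example {{[Full GC 250112K->180344K(253440K), 1.0789668 secs]}}. The heap sizes in that example are made up, and the attached log may differ in detail, so the regular expression is an assumption rather than a description of the actual file.

```python
import re

# Assumed line shape for HotSpot -verbose:gc output:
#   [GC 65536K->12000K(253440K), 0.1375324 secs]
#   [Full GC 250112K->180344K(253440K), 1.0789668 secs]
GC_LINE = re.compile(r"\[(Full GC|GC)\b[^\]]*?([0-9]+\.[0-9]+) secs\]")


def longest_pauses(lines, top_n=5):
    """Return (full_gc_pauses, other_gc_pauses), each sorted longest first."""
    full, minor = [], []
    for line in lines:
        match = GC_LINE.search(line)
        if not match:
            continue  # skip anything that is not a recognised GC line
        seconds = float(match.group(2))
        if match.group(1) == "Full GC":
            full.append(seconds)
        else:
            minor.append(seconds)
    full.sort(reverse=True)
    minor.sort(reverse=True)
    return full[:top_n], minor[:top_n]
```

Since a file object yields lines, the function can be called directly on an open log such as {{long-query-indexing-gc.log}}; {{len()}} of the unsliced lists would give the GC counts.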

The VisualVM screenshot suggests that the VM is stable.  256MB seems to be enough to index
all of Japanese Wikipedia while serving 10 QPS, although a somewhat larger maximum heap
would give a little more headroom.

|| Attachment || Description ||
| long-query-indexing-gc.log | GC log |
| long-search-indexing-visualvm.png | VisualVM screenshot |



                
> Perform Kuromoji/Japanese stability test before 3.6 freeze
> ----------------------------------------------------------
>
>                 Key: SOLR-3282
>                 URL: https://issues.apache.org/jira/browse/SOLR-3282
>             Project: Solr
>          Issue Type: Task
>          Components: Schema and Analysis
>    Affects Versions: 3.6, 4.0
>            Reporter: Christian Moen
>            Assignee: Christian Moen
>         Attachments: 250k-queries-no-highlight-gc.log, 250k-queries-no-highlight-visualvm.png, 62k-queries-highlight-gc.log, 62k-queries-highlight-visualvm.png, jawiki-index-gc.log, jawiki-index-gcviewer.png, jawiki-index-visualvm.png, long-query-indexing-gc.log, long-search-indexing-visualvm.png
>
>
> Kuromoji might be used by many and also in mission critical systems.  I'd like to run a stability test before we freeze 3.6.
> My thinking is to test the out-of-the-box configuration using fieldtype {{text_ja}} as follows:
> # Index all of Japanese Wikipedia documents (approx. 1.4M documents) in a never ending loop
> # Simultaneously run many tens of thousands typical Japanese queries against the index at 3-5 queries per second with highlighting turned on
> While Solr is indexing and searching, I'd like to verify that:
> * Indexing and queries are working as expected
> * Memory and heap usage looks stable over time
> * Garbage collection is overall low over time -- no Full-GC issues
> I'll post findings and results to this JIRA.


