lucene-dev mailing list archives

From "Varun Thacker (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-9866) Reduce memory pressure for expand component
Date Thu, 15 Dec 2016 02:43:58 GMT

     [ https://issues.apache.org/jira/browse/SOLR-9866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated SOLR-9866:
--------------------------------
    Attachment: SOLR-9866.patch
                patch-gc.png
                patch-3k.png
                patch-1.png
                original-gc.png
                original-3k.png
                original-1.png

The single-query results looked promising, but at 3k queries they look roughly the same. Maybe I sampled
incorrectly for the single-query test.


So this is still at a beta stage of debugging the root cause. The client was load testing at
3k queries per second and, on a similar dataset, was seeing ~210GB/minute of freed memory
in GC viewer, compared to ~50 without collapse and expand.
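
As a rough back-of-the-envelope check, the reported GC rates can be converted into freed memory per query. This is only a sketch: it assumes the "~50" figure is also in GB/minute, that the load was a steady 3k queries per second, and that GB means 1000 MB. The class and method names are made up for illustration.

```java
class GcRatePerQuery {
    // Converts a GC "freed memory" rate in GB/minute into MB freed per query,
    // assuming decimal units (1 GB = 1000 MB) and a steady query rate.
    static double mbPerQuery(double gbPerMinute, double queriesPerSecond) {
        double queriesPerMinute = queriesPerSecond * 60.0;
        return gbPerMinute * 1000.0 / queriesPerMinute;
    }

    public static void main(String[] args) {
        // Figures taken from the comment above; the unit of the second one is an assumption.
        System.out.println("with collapse+expand: " + mbPerQuery(210, 3000) + " MB/query");
        System.out.println("without:              " + mbPerQuery(50, 3000) + " MB/query");
    }
}
```

Under those assumptions this works out to roughly 1.17 MB freed per query with collapse and expand, against roughly 0.28 MB without.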



> Reduce memory pressure for expand component
> -------------------------------------------
>
>                 Key: SOLR-9866
>                 URL: https://issues.apache.org/jira/browse/SOLR-9866
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Varun Thacker
>         Attachments: SOLR-9866.patch, original-1.png, original-3k.png, original-gc.png, patch-1.png, patch-3k.png, patch-gc.png
>
>
> A client was having memory pressure issues when running queries with collapse and expand.
> I created a setup on my machine with dummy data to reproduce this. This ticket concentrates just on the expand part, as that's the top culprit according to some sampling I did with YourKit.
> Started Solr using {{./bin/solr start -p 8984 -m 4g}} and created a collection called "ct" (collapse testing).
> The indexing code below indexes 10M records. Every tenth document duplicates the collapse value of the document before it.
> {code}
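> import java.util.ArrayList;
> import java.util.List;
> import org.apache.solr.client.solrj.impl.HttpSolrClient;
> import org.apache.solr.common.SolrInputDocument;
>
> public void index() throws Exception {
>     // Solr was started with -p 8984 above, so the client points at that port.
>     try (HttpSolrClient client = new HttpSolrClient.Builder().withBaseSolrUrl("http://localhost:8984/solr").build()) {
>       // Start from an empty collection.
>       client.deleteByQuery("ct", "*:*");
>       client.commit("ct");
>       // Index 10M documents, sending them in batches of 1000.
>       // Every tenth document reuses the collapse value of the previous document.
>       List<SolrInputDocument> docs = new ArrayList<>(1000);
>       for (int i = 0; i < 1000 * 1000 * 10; i++) {
>         SolrInputDocument doc = new SolrInputDocument();
>         doc.addField("id", i);
>         if (i % 10 == 0 && i != 0) {
>           doc.addField("collapseField1_s", i - 1); //with docValues
>           doc.addField("collapseField1_s", i - 1); //without docValues
>         } else {
>           doc.addField("collapseField1_s", i); //with docValues
>           doc.addField("collapseField1_s", i); //without docValues
>         }
>         docs.add(doc);
>         if (docs.size() == 1000) {
>           client.add("ct", docs);
>           docs.clear();
>         }
>       }
>       // Send any partial final batch before committing.
>       if (!docs.isEmpty()) {
>         client.add("ct", docs);
>       }
>       client.commit("ct");
>     }
>   }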
> {code}
> I wrote a script to fire 3k queries of the form {{&fq=\{!collapse field=collapseField1\}&expand=true&expand.rows=1000}}
> I enabled "Object Allocation Recording" in YourKit and am attaching two sets of screenshots:
>  - Stock Solr 6.3: for 1 query (original-1) and for the 3k queries (original-3k), plus GC logs during the 3k query run
>  - Patched Solr: for 1 query (patch-1) and for the 3k queries (patch-3k), plus GC logs during the 3k query run
> The patch does nothing more than tweak the initial allocation sizes. I haven't fully verified that it's correct, but {{TestExpandComponent}} passed.
>  
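
The query-firing script mentioned in the description is not attached to the ticket. A minimal sketch of what such a script might look like in plain Java follows. The `/select` handler, the `q=*:*` main query, the sequential loop, and all class and method names are assumptions; only the `fq`, `expand` and `expand.rows` parameters, the collection name "ct", and port 8984 (from the start command) come from the ticket.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

class ExpandQueryLoad {
    // Builds the URL for one collapse+expand request.
    // q=*:* and the /select handler are assumptions, not taken from the ticket.
    static String buildUrl(String baseUrl, String collection, String field, int expandRows)
            throws Exception {
        String fq = "{!collapse field=" + field + "}";
        return baseUrl + "/" + collection + "/select"
                + "?q=" + URLEncoder.encode("*:*", "UTF-8")
                + "&fq=" + URLEncoder.encode(fq, "UTF-8")
                + "&expand=true&expand.rows=" + expandRows;
    }

    // Sends one GET request, reads the whole response body, and returns the HTTP status.
    // getInputStream() throws an IOException if Solr answers with an error status.
    static int fire(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // discard the body; only the server-side allocation matters here
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        String url = buildUrl("http://localhost:8984/solr", "ct", "collapseField1", 1000);
        // Fires the 3k queries one after another; a real load test would likely run them concurrently.
        for (int i = 0; i < 3000; i++) {
            fire(url);
        }
    }
}
```

Running `main` needs a live Solr instance with the "ct" collection; `buildUrl` can be checked on its own without one.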


