lucene-dev mailing list archives

From "Vivek Narang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-10317) Solr Nightly Benchmarks
Date Mon, 22 May 2017 05:36:04 GMT

    [ https://issues.apache.org/jira/browse/SOLR-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16019044#comment-16019044 ]

Vivek Narang edited comment on SOLR-10317 at 5/22/17 5:35 AM:
--------------------------------------------------------------

Hi [~ichattopadhyaya], I hope you get well soon. Regarding my doubt (mentioned in the
fourth-to-last comment), I may have found the cause of the variation in performance. Details
below.

Concern: I have been noticing a significant difference in indexing throughput (documents/sec)
when indexing 10,000 documents. Even when the test for the same commit was executed at different
times, throughput differed significantly. Please see the following screenshot:
[http://162.243.101.83/Capture_Throughput_drop_reason.PNG]. I have marked the drops in throughput
on the graph with red circles; note, for example, the difference between 322 and 263 docs/sec.
The system reported this difference for the same commit id, which did not make sense and
should not happen. After running a number of tests, I may have pinpointed the cause of this
anomaly in performance reporting. As guessed earlier, this might be a *memory issue*. The
server I am using has limited resources (2 GB RAM), so I added more than 5 GB of swap space.
The specific drops marked on the graph with red circles might occur when the test run has to
build the Solr server from fresh code before running the benchmark. This additional activity
(building Solr from source) may exhaust the available RAM, forcing the OS to fall back on the
swap space, and that extra swap activity might be what degrades performance.
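To make the run-to-run comparison less ad hoc, a small helper could compute throughput from
the indexing time and flag runs of the same commit that diverge beyond a tolerance. This is a
minimal sketch; the function names and the 10% tolerance are my own assumptions, not part of
the benchmark code:

```python
# Minimal sketch: compute indexing throughput and flag suspicious
# run-to-run variation for the same commit. Names and the 10%
# tolerance are illustrative assumptions, not the benchmark's API.

def throughput(num_docs: int, seconds: float) -> float:
    """Documents indexed per second."""
    return num_docs / seconds

def flag_unstable_runs(throughputs, tolerance=0.10):
    """Return True if any run falls below the best run by more than
    `tolerance` (relative), as with the 322 vs 263 docs/sec drop
    marked on the graph."""
    best = max(throughputs)
    return any((best - t) / best > tolerance for t in throughputs)

# 10,000 documents indexed in ~31 s vs ~38 s gives roughly the
# observed 322 vs 263 docs/sec figures:
runs = [throughput(10000, 31.0), throughput(10000, 38.0)]
print(flag_unstable_runs(runs))  # the ~18% drop exceeds the 10% tolerance
```

A check like this, run right after each benchmark, would separate genuine regressions from
environment-induced drops such as the swap activity described above.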


was (Author: vivek.narang@uga.edu):
Hi [~ichattopadhyaya], I hope you get well soon. Regarding my doubt (mentioned in the
fourth-to-last comment), I have found the cause of the variation in performance. Details
below.

Concern: I have been noticing a significant difference in indexing throughput (documents/sec)
when indexing 10,000 documents. Even when the test for the same commit was executed at different
times, throughput differed significantly. Please see the following screenshot:
[http://162.243.101.83/Capture_Throughput_drop_reason.PNG]. I have marked the drops in throughput
on the graph with red circles; note, for example, the difference between 322 and 263 docs/sec.
The system reported this difference for the same commit id, which did not make sense and
should not happen. After running a number of tests, I was able to pinpoint the cause of this
anomaly in performance reporting. As guessed earlier, this was a *memory issue*. The server
I am using has limited resources (2 GB RAM), so I added more than 5 GB of swap space. The
specific drops marked on the graph with red circles occurred when the test run had to build
the Solr server from fresh code before running the benchmark. This additional activity
(building Solr from source) exhausted the available RAM, forcing the OS to fall back on the
swap space, and that extra swap activity caused the degradation in performance.

Conclusion: I have figured out the cause of the anomaly in metric estimation and expect that
with a suitable server (with the required hardware resources) this reporting anomaly will
go away. Regards.

> Solr Nightly Benchmarks
> -----------------------
>
>                 Key: SOLR-10317
>                 URL: https://issues.apache.org/jira/browse/SOLR-10317
>             Project: Solr
>          Issue Type: Task
>            Reporter: Ishan Chattopadhyaya
>              Labels: gsoc2017, mentor
>         Attachments: changes-lucene-20160907.json, changes-solr-20160907.json, managed-schema,
Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks.docx, Narang-Vivek-SOLR-10317-Solr-Nightly-Benchmarks-FINAL-PROPOSAL.pdf,
solrconfig.xml
>
>
> Solr needs nightly benchmarks reporting. Similar Lucene benchmarks can be found here,
https://home.apache.org/~mikemccand/lucenebench/.
> Preferably, we need:
> # A suite of benchmarks that build Solr from a commit point, start Solr nodes, both in
SolrCloud and standalone mode, and record timing information of various operations like indexing,
querying, faceting, grouping, replication etc.
> # It should be possible to run them either as an independent suite or as a Jenkins job,
and we should be able to report timings as graphs (Jenkins has some charting plugins).
> # The code should eventually be integrated in the Solr codebase, so that it never goes
out of date.
> There is some prior work / discussion:
> # https://github.com/shalinmangar/solr-perf-tools (Shalin)
> # https://github.com/chatman/solr-upgrade-tests/blob/master/BENCHMARKS.md (Ishan/Vivek)
> # SOLR-2646 & SOLR-9863 (Mark Miller)
> # https://home.apache.org/~mikemccand/lucenebench/ (Mike McCandless)
> # https://github.com/lucidworks/solr-scale-tk (Tim Potter)
> There is support for building, starting, indexing/querying, and stopping Solr in some
of the frameworks above. However, the benchmarks they run are very limited. Any of them could
serve as a starting point, or a new framework could be used instead. The motivation is to
cover every functionality of Solr with a corresponding benchmark that runs every night.
> Proposing this as a GSoC 2017 project. I'm willing to mentor, and I'm sure [~shalinmangar]
and [~markrmiller@gmail.com] would help here.
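
The timing-and-graphing idea in point 2 could be sketched as a benchmark step that emits one
JSON record per operation, which a nightly Jenkins job could collect and a charting plugin
could plot. The record layout and field names here are hypothetical, not taken from any of
the frameworks listed:

```python
import json
import time

# Hypothetical sketch: time one benchmark operation and emit a JSON
# record that a nightly job could append to a log for graphing.
# Field names are illustrative, not taken from any listed framework.

def timing_record(operation: str, num_docs: int, seconds: float) -> dict:
    return {
        "operation": operation,          # e.g. "indexing", "querying"
        "docs": num_docs,
        "seconds": round(seconds, 3),
        "docs_per_sec": round(num_docs / seconds, 1),
    }

def run_timed(operation, num_docs, fn):
    """Run fn(), measure wall-clock time, and return a timing record."""
    start = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - start
    return timing_record(operation, num_docs, elapsed)

# Example record for the 10,000-document indexing run discussed above:
record = timing_record("indexing", 10000, 31.0)
print(json.dumps(record))
```

One JSON line per operation per night keeps the data trivially appendable and easy to feed
into whatever charting mechanism the Jenkins job ends up using.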



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


