lucene-dev mailing list archives

From "Shawn Heisey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile
Date Tue, 23 Oct 2012 07:27:13 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482171#comment-13482171 ]

Shawn Heisey commented on SOLR-1972:
------------------------------------

After poking around a lot looking for a way to bump the reservoir size, I finally came across
the paper on reservoir sampling by Vitter.  After even more poking around, I think I get it
now.  Their small reservoir apparently really does give statistically relevant results over
millions or billions of total samples.  If it didn't give them numbers they could use, they
would have already made it larger.
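
For anyone else trying to get their head around it, here is a minimal sketch of the basic
reservoir idea (the simple "Algorithm R" variant of reservoir sampling), not the code from
any patch on this issue.  The class name, the seed parameter and the 1024 size used below
are my own assumptions for illustration.  The point is that memory stays fixed at the
reservoir size no matter how many values go through it, and every value seen so far has the
same chance of being in the sample.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: fixed-size uniform sample of a stream (Algorithm R). */
final class ReservoirSketch {
    private final long[] reservoir;
    private final Random random;
    private long seen = 0; // total number of values offered so far

    ReservoirSketch(int size, long seed) {
        this.reservoir = new long[size];
        this.random = new Random(seed);
    }

    /** Offer one value, e.g. a request time in milliseconds. */
    void update(long value) {
        seen++;
        if (seen <= reservoir.length) {
            // Fill phase: the first `size` values are always kept.
            reservoir[(int) (seen - 1)] = value;
        } else {
            // Pick a slot in [0, seen); the value is kept only if the slot
            // falls inside the reservoir, i.e. with probability size/seen.
            long slot = (long) (random.nextDouble() * seen);
            if (slot < reservoir.length) {
                reservoir[(int) slot] = value;
            }
        }
    }

    /** Copy of the values currently held (fewer than `size` early on). */
    long[] snapshot() {
        int n = (int) Math.min(seen, reservoir.length);
        return Arrays.copyOf(reservoir, n);
    }

    long count() {
        return seen;
    }
}
```

Feeding a million values through a 1024-slot instance still leaves only 1024 values to sort
when somebody asks for statistics.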

Do you think it's worthwhile to give people the ability to customize the percentile list --
turn some of the standard percentiles off, and/or add custom ones?  As soon as we conclude
that including the full predefined set won't present a performance problem because it only
gets calculated when the admin GUI is accessed, there'll be someone who has created hundreds
of request handlers and polls the statistics for all of them once a minute.  I can also see
someone wanting to see the 12th and 87th percentiles for some reason that neither of us can
fathom, but that makes perfect sense to them.
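
To make the customization idea concrete, here is a rough sketch of computing an arbitrary,
caller-supplied list of percentiles from one sample.  Everything in it is hypothetical (the
class and method names, and the choice of the nearest-rank definition of a percentile); it
is not how any of the attached patches do it.  The sample is sorted once, so the cost of
each extra percentile is one array lookup.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: percentiles for a configurable list, nearest-rank method. */
final class PercentileSketch {

    /** Nearest-rank percentile of an already sorted array; p must be in (0, 100]. */
    static long percentile(long[] sorted, double p) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("empty sample");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("percentile out of range: " + p);
        }
        // Rank is 1-based: the smallest rank covering at least p percent of values.
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Computes every requested percentile, in the order requested. */
    static Map<Double, Long> percentiles(long[] sample, double... wanted) {
        long[] sorted = sample.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);            // sort once, reuse for each percentile
        Map<Double, Long> result = new LinkedHashMap<>();
        for (double p : wanted) {
            result.put(p, percentile(sorted, p));
        }
        return result;
    }
}
```

A call like percentiles(times, 50, 95, 99) would cover the standard set, and
percentiles(times, 12, 87) the oddball case; the list could come from configuration.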

                
> Need additional query stats in admin interface - median, 95th and 99th percentile
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-1972
>                 URL: https://issues.apache.org/jira/browse/SOLR-1972
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 1.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>         Attachments: elyograg-1972-3.2.patch, elyograg-1972-3.2.patch, elyograg-1972-trunk.patch,
>                      elyograg-1972-trunk.patch, SOLR-1972-branch3x-url_pattern.patch, SOLR-1972-branch4x.patch,
>                      SOLR-1972-branch4x.patch, SOLR-1972_metrics.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch,
>                      SOLR-1972.patch, SOLR-1972-url_pattern.patch
>
>
> I would like to see more detailed query statistics from the admin GUI.  This is what
> you can get now:
> requests : 809
> errors : 0
> timeouts : 0
> totalTime : 70053
> avgTimePerRequest : 86.59209
> avgRequestsPerSecond : 0.8148785 
> I'd like to see more data on the time per request - median, 95th percentile, 99th percentile,
> and any other statistical function that makes sense to include.  In my environment, the first
> bunch of queries after startup tend to take several seconds each.  I find that the average
> value tends to be useless until it has several thousand queries under its belt and the caches
> are thoroughly warmed.  The statistical functions I have mentioned would quickly eliminate
> the influence of those initial slow queries.
> The system will have to store individual data about each query.  I don't know if this
> is something Solr does already.  It would be nice to have a configurable count of how many
> of the most recent data points are kept, to control the amount of memory the feature uses.
> The default value could be something like 1024 or 4096.
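
As an illustration of the "configurable count of the most recent data points" idea in the
description above, here is a minimal sketch of a fixed-capacity ring buffer of request
times.  The class and method names are hypothetical and this is not taken from any of the
attached patches; it only shows that memory use is bounded by the configured capacity,
with the oldest entry overwritten once the buffer is full.

```java
/** Hypothetical sketch: keeps only the most recent `capacity` request times. */
final class RecentTimesSketch {
    private final long[] buffer;
    private int next = 0; // index where the next value will be written
    private int size = 0; // number of valid entries, at most buffer.length

    RecentTimesSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.buffer = new long[capacity];
    }

    /** Record one request time; overwrites the oldest entry when full. */
    synchronized void record(long elapsedMs) {
        buffer[next] = elapsedMs;
        next = (next + 1) % buffer.length;
        if (size < buffer.length) {
            size++;
        }
    }

    /** Copy of the retained values, oldest first. */
    synchronized long[] snapshot() {
        long[] out = new long[size];
        // Until the buffer wraps, the oldest entry is at index 0;
        // afterwards it is at `next`, the slot about to be overwritten.
        int start = (size < buffer.length) ? 0 : next;
        for (int i = 0; i < size; i++) {
            out[i] = buffer[(start + i) % buffer.length];
        }
        return out;
    }
}
```

With a capacity of 1024 or 4096 as suggested, the slow queries right after startup drop
out of the window once that many newer requests have been recorded.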
