Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@lucene.apache.org
Date: Mon, 28 Jul 2014 04:29:39 +0000 (UTC)
From: "Ramkumar Aiyengar (JIRA)" <jira@apache.org>
To: dev@lucene.apache.org
Message-ID: <JIRA.12729499.1406222143530.53818.1406521779879@arcas>
In-Reply-To: <JIRA.12729499.1406222143530@arcas>
References: <JIRA.12729499.1406222143530@arcas>
Subject: [jira] [Comment Edited] (SOLR-6275) Improve accuracy of QTime
 reporting
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/SOLR-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075884#comment-14075884 ] 

Ramkumar Aiyengar edited comment on SOLR-6275 at 7/28/14 4:28 AM:
------------------------------------------------------------------

bq. Hmmm, I'm not sure I agree. QTime is useful for knowing how long the lower level stuff took. Thus it's useful as it stands.

Can you elaborate what you mean by "lower level stuff"? If you are intending QTime to reflect time taken by Lucene, i.e. roughly the time taken by the top-level collector to collect the results needed, QTime covers way more than that already, including time taken for query parsing and finishing up just in the non-distributed case. In the distributed case, it also covers time taken waiting on the shard handler factory pool, network latency, any servlet container pooling time at the processing shards, time taken waiting for the federating node to take all responses, and time taken to merge all responses.

QTime as it stands is currently defined from the time the {{SolrQueryRequest}} object is created till the response is rendered, which is hard to associate any semantic meaning to, and it roughly is all steps except the logic required for resolving the core to send the request to. All I am doing is adding that as well, to logically mean (again roughly, but less roughly) "time taken by the servlet to service the request".

bq. I mean we're talking human-time here. When I'm asking how long a query took, my frail human system is not capable of noticing a difference of a millisecond or two. nanoTime is not necessarily all that accurate either. Sure, it gives a lot or precision, but that's different from accuracy.

The problem is not because of millisecond vs. nanosecond accuracy. {{currentTimeMillis}} represents the wall clock of the system and is subject to issues like clock skew. For example, if NTP resets the time or if the sys admin changes the time for some reason or if some other action changes the wall time, the difference between two such measurements can be totally incorrect (including being negative). Which is why {{nanoTime}} is preferred for timing measurements as it's guaranteed to be monotonic (where the OS supports it, i.e. everywhere except some older versions of Windows). See SOLR-5734 for some context where we changed all references of this thing, QTime was left out.

bq. Frankly, this doesn't seem worth the change to me. And especially if we return the full nanoTime string which would render any current tools for reporting performance metrics invalid by a factor 100,000, all for extremely questionable utility.

We are still reporting as milliseconds, no change in resolution here. The {{RTimer}} utility class used for this purpose already does the conversion.


was (Author: andyetitmoves):
bq. Hmmm, I'm not sure I agree. QTime is useful for knowing how long the lower level stuff took.
Thus it's useful as it stands.

Can you elaborate what you mean by "lower level stuff"? If you are intending QTime to reflect time taken by Lucene, i.e. roughly the time taken by the top-level collector to collect the results needed, QTime covers way more than that already, including time taken for query parsing and finishing up just in the non-distributed case. In the distributed case, it also covers time taken waiting on the shard handler factory pool, network latency, any servlet container pooling time at the processing shards, time taken waiting for the federating node to take all responses, and time taken to merge all responses.

QTime as it stands is currently defined from the time the {{SolrQueryRequest}} object is created till the response is rendered, which is hard to associate any semantic meaning to, and it roughly is all steps except the logic required for resolving the core to send the request to. All I am doing is adding that as well, to logically mean (again roughly, but less roughly) "time taken by the servlet to service the request".

bq. I mean we're talking human-time here. When I'm asking how long a query took, my frail human system is not capable of noticing a difference of a millisecond or two. nanoTime is not necessarily all that accurate either. Sure, it gives a lot or precision, but that's different from accuracy.

The problem is not because of millisecond vs. nanosecond accuracy. {{currentTimeMillis}} represents the wall clock of the system and is subject to issues like clock skew. For example, if NTP resets the time or if the sys admin changes the time for some reason or if some other action changes the wall time, the difference between two such measurements can be totally incorrect (including being negative). Which is why {{nanoTime}} is preferred for timing measurements as it's guaranteed to be monotonic (where the OS supports it, i.e. everywhere except some older versions of Windows). See SOLR-5734 for some context where we changed all references of this thing, QTime was left out.

bq. Frankly, this doesn't seem worth the change to me. And especially if we return the full nanoTime string which would render any current tools for reporting performance metrics invalid by a factor 100,000, all for extremely questionable utility.

We are still reporting as milliseconds, no change in resolution here. The {{RTimer}} utility class used for this purpose already does the conversion.


> Improve accuracy of QTime reporting
> -----------------------------------
>
>                 Key: SOLR-6275
>                 URL: https://issues.apache.org/jira/browse/SOLR-6275
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Ramkumar Aiyengar
>            Priority: Minor
>
> Currently, {{QTime}} uses {{currentTimeMillis}} instead of {{nanoTime}} and hence is not suitable for time measurements. Further, it is really started after all the dispatch logic in {{SolrDispatchFilter}} (same with the top level timing reported by {{debug=timing}}) which may or may not be expensive, and hence may not fully represent the time taken by the search. This is to remedy both cases.


--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org