lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-6349) LocalParams for enabling/disabling individual stats
Date Tue, 24 Feb 2015 00:51:12 GMT

     [ https://issues.apache.org/jira/browse/SOLR-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Hoss Man updated SOLR-6349:
---------------------------
    Attachment: SOLR-6349.patch

some more udpates to the patch...

* StatsComponentTest
** undid an odd calcDistinct param change in testFieldStatisticsDocValuesAndMultiValuedIntegerFacetStats
that shouldn't affect the test goal
*** want to ensure the behavior in this test isn't broken by changes
** fixed testFieldStatisticsDocValuesAndMultiValuedDouble
*** was doing stats.field twice in same request diff ways, but only checking one
*** changed to do 2 explicit requests and assert results are the same
*** added in canary assert for future numeric stats
** testIndividualStatLocalParams
*** added a canary assert to protect us against new stats in the future w/o updating the test
**** canary helped catch that we weren't testing calcdistinct in these permutations
*** added some sanity checks of localparams with 'false' values inspired by bug i found in
testCalcDistinctStats
**** see question & nocommit (below)
*** added comment explaining point of isShard queries as best i understanding, see question
& nocommit (below)
*** fixed asserts to play nice with calcdistinct excentricities
** iterateParaCombination
*** kind of hard to understand what this is doing and how it works because of recursive nature
*** definitely need to replace magic number "8" since that is brittle against future stats
and already doesn't jive with num of legal Stat params (missing calcdistinct)
*** in general, i want to refactor this to replace it with commons-math's Combinations class
- i'll look into that tomorow
** testCalcDistinctStats
*** part of the importance here is in the equivilence relationships - so i refactored each
of the equivilent asesrt conditions to be a single assert inside a loop over the params.
**** this helps protect us against someone later thinking it's okay to change one assert w/o
changing all of the equivilences
*** also simplified asserts to be less brittle: assert if expected stat keys are in response
or not -- not number of stat keys returned (which might change in future if more default stats
added -- in the case of these asserts, that isn't really relevant, what we care about here
is the behavior of interaction of the various ways to request calcdistinct)
*** added some more permutations to each of the equivilences sets
**** this lead to discovring a semantics question so far undiscussed as far as what if only
one stat is specified in a local param, but it's value is 'false' (see below)
*** fixed some mistakes in formatting params (eg: "f.a_i.stats.calcdistinct=false)"="true")
*** these changes let me eliminate the unused expectedStats and expectedType maps from this
method (there were virtually unused cut/paste cruft prior to these changes anyway)
* TestDistributedSearch
** added some asserts that the (distrib) handling of calcdistinct works a variety of ways
* DistributedFacetPivotSmallTest
** small update to assert that when stats hang off pivots the local params correctly control
which stats are returned.

----

New Questions...

* see nocommit in StatsComponentTest.testIndividualStatLocalParams - why the double loop here?{noformat}
   // whitebox test: explicitly ask for isShard=true with an individual stat
   //
   // :nocommit: why loop over every stat to get it's deps and then loop over them? ...
   // ... isn't it enough to loop over the set of all known stats?
   for (Stat statAsk:expectedStats.keySet()) {
     EnumSet <Stat> statAskEvaluate = statAsk.getDistribDeps();

     for (Stat stat : statAskEvaluate){
{noformat}

* how should these two queries behave...{noformat}
a) stats = true & stats.field = {!key=k min='false'}a_i

b) stats = true & stats.field = {!key=k min='false'}a_i & stats.calcdistinct = true
{noformat}...my gut says that for (a) the stats result set should be completley empty; and
for (b) only countDistinct and distinctValues should be returned; -- because in both cases
because the _use_ of localparams regardless of values should indicate that the implicit set
of default stats should be ignored, and only explicitly requested stats should be returned
-- but the only explicity mentioned stat via localparams is deliberately disabled with 'false'.
 at the moment however, both of these cases returns all of the implicit default stats (see
nocommits in tests) -- we'll need to fix this



> LocalParams for enabling/disabling individual stats
> ---------------------------------------------------
>
>                 Key: SOLR-6349
>                 URL: https://issues.apache.org/jira/browse/SOLR-6349
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6349-tflobbe.patch, SOLR-6349-tflobbe.patch, SOLR-6349-tflobbe.patch,
SOLR-6349-xu.patch, SOLR-6349-xu.patch, SOLR-6349-xu.patch, SOLR-6349-xu.patch, SOLR-6349.patch,
SOLR-6349.patch, SOLR-6349.patch, SOLR-6349___bad_idea_broken.patch
>
>
> Stats component currently computes all stats (except for one) every time because they
are relatively cheap, and in some cases dependent on eachother for distrib computation --
but if we start layering stats on other things it becomes unnecessarily expensive to compute
all the stats when they just want the "sum" (and it will definitely become excessively verbose
in the responses).  
> The plan here is to use local params to make this configurable.  All of the existing
stat options could be modeled as a simple boolean param, but future params (like percentiles)
might take in a more complex param value...
> Example:
> {noformat}
> stats.field={!min=true max=true percentiles='99,99.999'}price
> stats.field={!mean=true}weight
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message