lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3124) explain output is confusing when using trie fields (or any field type where the indexed terms are not human readable)
Date Fri, 09 Mar 2012 15:54:57 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226159#comment-13226159 ]

Robert Muir commented on SOLR-3124:
-----------------------------------

In trunk, most of the explanation logic now lives in the Similarity (Sim) itself. Very little
is done by the queries themselves anymore: just the minimal basics like toString'ing terms.

It seems the real problem here is how to toString() a Term, right?
We currently have the confusing situation that Terms are generally toString()'ed with .utf8ToString():

{code}
  public final String toString() { return field + ":" + bytes.utf8ToString(); }
{code}

but this gives completely unhelpful output when the terms are actually binary, as in this case.
This goes for more than just explanations, really.
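
To see why, here is a minimal standalone sketch (plain Java, no Lucene on the classpath) that renders the six bytes behind the term in the quoted debug output two ways. The class and method names (TermRendering, asUtf8, asHex) are made up for illustration; the hex format is just one possible fallback, not something Lucene is claimed to do here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: contrasts two renderings of the same term bytes.
final class TermRendering {

  // Roughly what a utf8ToString()-style call does: decode the raw bytes as UTF-8.
  static String asUtf8(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Renders each byte as unsigned hex, so binary terms at least stay legible.
  static String asHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < bytes.length; i++) {
      if (i > 0) sb.append(' ');
      sb.append(Integer.toHexString(bytes[i] & 0xff));
    }
    return sb.append(']').toString();
  }

  public static void main(String[] args) {
    // The six bytes behind "`\b\u0000\u0000\u0000*" in the quoted debug output.
    byte[] term = {0x60, 0x08, 0x00, 0x00, 0x00, 0x2a};
    System.out.println("foo_ti:" + asUtf8(term)); // backtick, control characters, '*'
    System.out.println("foo_ti:" + asHex(term));  // foo_ti:[60 8 0 0 0 2a]
  }
}
```

Neither rendering shows the 42 that was actually indexed; the hex form only makes it obvious that the term is binary.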

Maybe, since all the interning etc. is removed and Term is rather simple in 4.0, we should make
it non-final so that subclasses could override toString()... or maybe it should really be a
different method name in general (doing fancy stuff in toString() is scary?).

Anyway, I'm not sure it's the best way forward, but I just wanted to put the idea out there.
I think it sucks if explanations aren't useful... but having subclasses of Term is scary in
its own right too, especially if it's just for debugging but breaks search code.

For example, this exact case is interesting because TermQuery.toString() does not even use
Term's toString():

{code}
if (!term.field().equals(field)) {
  buffer.append(term.field());
  buffer.append(":");
}
buffer.append(term.text());
buffer.append(ToStringUtils.boost(getBoost()));
{code}

So the problem isn't very simple, especially since I'm sure there are places using Term.text()
for purposes other than debugging...
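
As a rough illustration of the "different method name" idea, the standalone sketch below leaves toString() alone and puts the readable rendering in a separate, debug-only method that takes a decoder. Everything here is hypothetical (TermDescriber, describe, decodePrefixCodedInt are not Lucene API), and the byte layout is an assumption inferred from the bytes in the quoted debug output: a leading 0x60 + shift byte followed by 7-bit groups of the value with its sign bit flipped.

```java
import java.util.function.Function;

// Hypothetical sketch, not Lucene API: readable rendering lives in its own debug-only method.
final class TermDescriber {

  // Assumed layout (inferred from the quoted debug output): byte 0 is 0x60 + shift,
  // then 7-bit groups, most significant first, of the int with its sign bit flipped.
  static int decodePrefixCodedInt(byte[] bytes) {
    int shift = bytes[0] - 0x60;
    int sortableBits = 0;
    for (int i = 1; i < bytes.length; i++) {
      sortableBits = (sortableBits << 7) | (bytes[i] & 0x7f);
    }
    return (sortableBits << shift) ^ 0x80000000;
  }

  // Debug-only rendering: the caller (e.g. the field type) supplies the decoder,
  // so code that relies on the raw term text is left untouched.
  static String describe(String field, byte[] bytes, Function<byte[], String> decoder) {
    return field + ":" + decoder.apply(bytes);
  }

  public static void main(String[] args) {
    byte[] term = {0x60, 0x08, 0x00, 0x00, 0x00, 0x2a};
    // prints "foo_ti:42"
    System.out.println(describe("foo_ti", term, b -> String.valueOf(decodePrefixCodedInt(b))));
  }
}
```

This sidesteps subclassing Term, but it does not by itself fix callers such as the TermQuery.toString() code above, which would still need to opt in to the new method.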


                
> explain output is confusing when using trie fields (or any field type where the indexed terms are not human readable)
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-3124
>                 URL: https://issues.apache.org/jira/browse/SOLR-3124
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 3.5
>            Reporter: Bill Bell
>
> using the trunk example schema containing...
> {noformat}
> <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
> <dynamicField name="*_ti" type="tint"    indexed="true"  stored="true"/>
> {noformat}
> ...and indexing the doc...
> {noformat}
> $ java -Ddata=args -jar post.jar '<add><doc><field name="id">HOSS</field><field name="foo_ti">42</field></doc></add>'
> {noformat}
> ...results in a query for [foo_ti:42|http://localhost:8983/solr/select?q=foo_ti:42&start=0&rows=10&wt=json&debug.explain.structured=true&debugQuery=true&indent=true] producing the following debug output...
> {noformat}
>   "debug":{
>     "rawquerystring":"foo_ti:42",
>     "querystring":"foo_ti:42",
>     "parsedquery":"foo_ti:42",
>     "parsedquery_toString":"foo_ti:`\b\u0000\u0000\u0000*",
>     "explain":{
>       "HOSS":{
>         "match":true,
>         "value":3.6741486,
>         "description":"weight(foo_ti:`\b\u0000\u0000\u0000* in 0) [DefaultSimilarity], result of:",
>         "details":[{
>             "match":true,
>             "value":3.6741486,
>             "description":"fieldWeight in 0, product of:",
>             "details":[{
>                 "match":true,
>                 "value":1.0,
>                 "description":"tf(freq=1.0), with freq of:",
>                 "details":[{
>                     "match":true,
>                     "value":1.0,
>                     "description":"termFreq=1.0"}]},
>               {
>                 "match":true,
>                 "value":3.6741486,
>                 "description":"idf(docFreq=1, maxDocs=29)"},
>               {
>                 "match":true,
>                 "value":1.0,
>                 "description":"fieldNorm(doc=0)"}]}]}},
> ...
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

