lucene-dev mailing list archives

From "Ted Dunning (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1908) Similarity javadocs for scoring function to relate more tightly to scoring models in effect
Date Sun, 13 Sep 2009 19:25:57 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754745#action_12754745 ]

Ted Dunning commented on LUCENE-1908:
-------------------------------------


bq. Thanks for reviewing this Ted.  ...  Might have helped to have better English than mine
for this

No problem reviewing this.  Thanks to you for doing the hard part of actually writing it.
 Being a critic is easy in comparison.

And your English is doing fine.  If you get the ideas down, a hundred people can chime in with
grammatical and spelling fixes.  And frankly, your English is so much better than any of my
other languages that I would never be brave enough to complain.

bq. I think all 3 formulas are required, just the gluing text should improve. ... I think
I know how to write it better in this sense.

Good point.  This is the essence of my grumpiness.

bq. Note that Searcher.maxDoc() is used instead of org.apache.lucene.index.IndexReader.numDocs()
because it is related to Searcher.docFreq(Term), i.e., when one is inaccurate, so is the other,
and in the same direction.

Actually, in this case, I think that the motivation is that maxDoc is very commonly exactly
correct and typically vastly more efficient.  As you say, when it is wrong, docFreq can also
be wrong the same way.  My suggestion would be this:
{quote}
Note that Searcher.maxDoc() is used instead of org.apache.lucene.index.IndexReader.numDocs()
because it is more efficient to compute and is pretty much correct except for when many documents
have been deleted.  In any case Searcher.docFreq(Term) is likely to have a similar problem.
{quote}
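
To make the "wrong in the same direction" point concrete, here is a minimal sketch in plain Java with no Lucene dependency.  It assumes the idf shape used by DefaultSimilarity, log(numDocs/(docFreq+1)) + 1, and made-up counts: 1000 documents of which 100 are deleted, and a term whose docFreq of 100 still counts 10 deleted documents.  The class name IdfSketch and all of the numbers are hypothetical, chosen only for illustration.

```java
// Illustrative sketch only: not Lucene code, and the counts below are invented.
final class IdfSketch {

    // Assumed idf shape, mirroring DefaultSimilarity: log(docCount / (docFreq + 1)) + 1
    static double idf(int docFreq, int docCount) {
        return Math.log(docCount / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        int maxDoc = 1000;      // counts deleted documents too
        int numDocs = 900;      // live documents only (100 deleted)
        int docFreq = 100;      // as stored in the index, still counts deleted documents
        int liveDocFreq = 90;   // hypothetical "true" value, assuming deletions hit the term proportionally

        double consistentStale = idf(docFreq, maxDoc);   // both counts inflated by deletions
        double exact = idf(liveDocFreq, numDocs);        // both counts exact
        double mixed = idf(docFreq, numDocs);            // exact numDocs with inflated docFreq

        System.out.println("maxDoc  + docFreq:     " + consistentStale);
        System.out.println("numDocs + liveDocFreq: " + exact);
        System.out.println("numDocs + docFreq:     " + mixed);
    }
}
```

Under these assumptions the maxDoc/docFreq pair lands much closer to the exact value than the numDocs/docFreq mix does, because both counts are inflated by deletions in the same way and the ratio is largely preserved.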

Regarding the proportional/related issue, I think that your language is fine.  At most, I
would suggest "varies with" instead of "related" because it is slightly stronger, but you
make the relationship abundantly clear in your text. 



> Similarity javadocs for scoring function to relate more tightly to scoring models in effect
> -------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1908
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1908
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1908.patch, LUCENE-1908.patch, LUCENE-1908.patch
>
>
> See discussion in the related issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

