I'm very grateful for the assistance. It'd be great if the
documentation stated the value of DEFAULT_MAX_LENGTH. I know the
majority of applications care more about precision than recall, but I
also know of a lot of people using Lucene for high-recall applications.
Working in high-recall domains doesn't necessarily make us Lucene
experts.

Most of the maximums and defaults in Lucene can be changed through
accessors, which naturally highlights and documents them to the user.
PostingsHighlighter has no such accessors, and the javadocs treat
DEFAULT_MAX_LENGTH only briefly. I don't know whether I just flat out
missed it or assumed that DEFAULT_MAX_LENGTH would be big enough but,
FWIW, the docs where getNumMatches() was 0 on all Passages didn't
strike me as being particularly large.
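
For anyone else who hits this, here is a toy sketch of the failure
mode in plain Java. It is not Lucene code: MaxLengthDemo, countMatches
and TOY_MAX_LENGTH are made-up names, and the cap is deliberately tiny.
It only assumes that a highlighter examines the first maxLength
characters of a document and silently ignores the rest.

```java
class MaxLengthDemo {
    // Hypothetical cap for illustration only; not Lucene's DEFAULT_MAX_LENGTH.
    static final int TOY_MAX_LENGTH = 20;

    // Counts occurrences of term that fall entirely within the first
    // maxLength characters of text. Anything past the cap is never seen.
    static int countMatches(String text, String term, int maxLength) {
        if (term.isEmpty()) {
            return 0; // avoid looping forever on an empty term
        }
        String window = text.substring(0, Math.min(text.length(), maxLength));
        int count = 0;
        int from = 0;
        while ((from = window.indexOf(term, from)) >= 0) {
            count++;
            from += term.length();
        }
        return count;
    }

    public static void main(String[] args) {
        String doc = "filler filler filler filler needle";
        // The only match starts at offset 28, beyond the toy cap of 20.
        System.out.println(countMatches(doc, "needle", TOY_MAX_LENGTH));    // prints 0
        // With the cap effectively removed, the match is found.
        System.out.println(countMatches(doc, "needle", Integer.MAX_VALUE)); // prints 1
    }
}
```

The first call reports no matches without any error or warning, which
is the symptom I saw: nothing highlighted and no hint as to why.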
Jon
On Tue, Oct 15, 2013 at 10:11 AM, Robert Muir <rcmuir@gmail.com> wrote:
> On Tue, Oct 15, 2013 at 9:59 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>> Well, unfortunately, this is a trap that users do hit.
>>
>> By requiring the user to think about the limit on creating
>> PostingsHighlighter, he/she would think about it and realize they are
>> in fact setting a limit.
>>
>> Silent limits are dangerous because you don't offhand know what's
>> wrong / why you see nothing getting highlighted.
>>
>>
>
> I already made my argument: for 99% of use cases the defaults are
> fine. In most cases highlighting is trying to summarize the document
> and something that deep just doesn't contribute much (see the default
> scoring model!). There is an optional ctor for the others doing expert
> things to specify the length.
>
> I don't think we should make APIs unusable because you think XYZ is a trap.
>
> Why not make DEFAULT_MAX_THREAD_STATES a required parameter to IndexWriter?
>
> Hell, let's make it so users have to supply all parameters to
> everything, so everything is like
> IndexWriter(int,int,int,int,int,int,int,int,int,int,int,int) and so
> on. Then you will be satisfied there are no traps, but it will be
> totally unusable.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
--
Jon Stewart, Principal
(646) 719-0317 | jon@lightboxtechnologies.com | Arlington, VA