lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2999) spellcheck-index is rebuilt on commit if optimized
Date Thu, 05 Jan 2012 02:20:39 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180130#comment-13180130 ]

Robert Muir commented on SOLR-2999:
-----------------------------------

One thing I would mention is that the IndexBasedSpellChecker index seems to always be
"rebuilt", as in completely blown away and recreated.

Yet it does not store any frequency information (things like docFreq are always gathered
from the main index); it only indexes term text in a special way.

Because of this, I don't actually understand why Solr always rebuilds it from scratch when
it supports updates. Is there a reason for this?

It seems that 'rebuilding' an IndexBasedSpellChecker really doesn't help you that much: it
only cleans out some potentially dead terms that would be filtered out at correction time
anyway (docFreq = 0), and such dead terms are even less likely if you are using e.g.
HighFrequencyDictionary.

Shouldn't the default be to just 'update'?
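
To make the argument concrete, here is a minimal sketch in plain Java. It is a toy model, not Solr or Lucene code: the class `SpellIndexSketch`, its methods, and the prefix match standing in for real similarity matching are all assumptions made for illustration. It shows that an index that is only ever updated keeps a dead term, yet returns the same suggestions as a rebuilt one, because candidates with docFreq = 0 in the main index are dropped at correction time.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a spellcheck index: it stores term text only,
// never frequencies. Frequencies always come from the "main index" map.
class SpellIndexSketch {
    final Set<String> spellTerms = new TreeSet<>();

    // Incremental update: add any new terms, never remove old ones.
    void update(Collection<String> mainIndexTerms) {
        spellTerms.addAll(mainIndexTerms);
    }

    // Full rebuild: throw everything away, then re-add the current terms.
    void rebuild(Collection<String> mainIndexTerms) {
        spellTerms.clear();
        spellTerms.addAll(mainIndexTerms);
    }

    // Correction time: a prefix match stands in for real similarity scoring.
    // Candidates whose docFreq in the main index is 0 are filtered out.
    List<String> suggest(String prefix, Map<String, Integer> mainDocFreq) {
        List<String> out = new ArrayList<>();
        for (String term : spellTerms) {
            if (term.startsWith(prefix) && mainDocFreq.getOrDefault(term, 0) > 0) {
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> before = List.of("solr", "solar", "sold");
        // Later, every document containing "sold" has been deleted.
        Map<String, Integer> docFreqNow = Map.of("solr", 3, "solar", 1);

        SpellIndexSketch updated = new SpellIndexSketch();
        updated.update(before);
        updated.update(docFreqNow.keySet()); // "sold" stays behind as a dead term

        SpellIndexSketch rebuilt = new SpellIndexSketch();
        rebuilt.update(before);
        rebuilt.rebuild(docFreqNow.keySet()); // "sold" is cleaned out

        System.out.println(updated.suggest("sol", docFreqNow));
        System.out.println(rebuilt.suggest("sol", docFreqNow));
    }
}
```

In this model the only difference between the two strategies is the size of the stored term set, not the suggestions returned.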
                
>  spellcheck-index is rebuilt on commit if optimized
> ---------------------------------------------------
>
>                 Key: SOLR-2999
>                 URL: https://issues.apache.org/jira/browse/SOLR-2999
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>    Affects Versions: 3.1, 3.2, 3.3, 3.4, 3.5, 4.0
>            Reporter: Oliver Schihin
>            Priority: Minor
>             Fix For: 3.6, 4.0
>
>
> If an empty commit (i.e. without having posted new documents) is issued on an optimized
> index, the spellcheck-index is rebuilt even though solrconfig defines buildOnOptimize=true,
> not buildOnCommit=true.
> The problem was discovered on Solr 4.0 but seems to happen on 3.x, too. Discussion and
> further information can be found on the list (http://lucene.472066.n3.nabble.com/spellcheck-index-is-rebuilt-on-commit-tp3626492p3626492.html).
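
The report does not say where in Solr the rebuild decision is made. The following Java sketch is therefore only a hypothetical model, not Solr's code: the class and method names are invented for illustration. It shows one way the described symptom could arise, namely a check that asks "is the index currently optimized?" rather than "was this commit an optimize?".

```java
// Hypothetical model of the rebuild decision; none of these names exist in Solr.
class RebuildTriggerSketch {

    // Symptom-reproducing check: fires whenever the index is in an optimized
    // state after a commit, even if the commit itself changed nothing.
    static boolean rebuildIfIndexLooksOptimized(boolean buildOnCommit,
                                                boolean buildOnOptimize,
                                                boolean indexIsOptimized) {
        return buildOnCommit || (buildOnOptimize && indexIsOptimized);
    }

    // Stricter check: fires only when this commit actually was an optimize.
    static boolean rebuildIfOptimizeJustRan(boolean buildOnCommit,
                                            boolean buildOnOptimize,
                                            boolean commitWasOptimize) {
        return buildOnCommit || (buildOnOptimize && commitWasOptimize);
    }

    public static void main(String[] args) {
        // Reported scenario: buildOnOptimize=true, buildOnCommit=false,
        // empty commit issued on an already optimized index.
        System.out.println(rebuildIfIndexLooksOptimized(false, true, true));
        System.out.println(rebuildIfOptimizeJustRan(false, true, false));
    }
}
```

Under this model, the first check rebuilds on the empty commit and the second does not, which matches the difference between the reported and the expected behavior.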

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


