Date: Wed, 23 Jul 2008 06:43:25 -0400
Subject: Re: spellchecker problems (bugs)
From: Jonathan Lee
To: "solr-user@lucene.apache.org"
In-Reply-To: <4885F7BD.3010801@modperlcookbook.org>

I ran into a similar issue and found that I am able to get around it by:

1. Similar to what https://issues.apache.org/jira/browse/SOLR-622 will do,
   issue a spellcheck.reload=true command on the firstSearcher event to read
   any existing index off disk. Here are the relevant parts of my
   solrconfig.xml:

   myHandler *:* 0 true true ... spellcheck default name_spell ... text_spell

2. I believe there is a bug in IndexBased- and FileBasedSpellChecker.java
   where the analyzer variable is only set on the build command. Therefore,
   when the index is reloaded, but not built after starting solr, issuing a
   query with the spellcheck.q parameter will cause a NullPointerException to
   be thrown (SpellCheckComponent.java:158). Moving the analyzer logic to the
   constructor seems to fix the problem.

I did not see a jira ticket for this (nor am I sure it's a real bug :), so I
have attached a patch with these changes. Please let me know if I have
overlooked something here and if I should attach this to an actual ticket.

-Jonathan

> From: Geoffrey Young
> Date: Tue, 22 Jul 2008 11:07:41 -0400
> Subject: Re: spellchecker problems (bugs)
>
> Shalin Shekhar Mangar wrote:
>> The problems you described in the spellchecker are noted in
>> https://issues.apache.org/jira/browse/SOLR-622 -- I shall create an issue to
>> synchronize spellcheck.build so that the index is not corrupted.
>
> I'd like to discuss this a little...
>
> I'm not sure that I want to rebuild the spelling index each time the
> underlying data index changes - the process takes very long and my
> updates are frequent changes to non-spelling related data.
>
> what I'd really like is for a change to my index to not cause an
> exception. IIRC the "old" way of using a spellchecker didn't work like
> this at all - I could completely rm data/index and leave data/spell in
> place, add new data, not issue cmd=build and the spelling parts still
> worked just fine (albeit with old data).
>
> not to say that SOLR-622 isn't a good idea (it is) but I don't really
> think the entire solution is keeping the spellcheck index in sync. do
> they need to be kept in sync for things not to implode on me?
>
> --Geoff
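
For readers without the patch at hand, the analyzer issue described in point 2 of the reply can be illustrated with a small stand-alone sketch. This is not the Solr source and not the attached patch: every class and method name below (BuildOnlyChecker, ConstructorInitChecker, Analyzers, tokenize) is hypothetical, and the "analyzer" is a plain function rather than a Lucene Analyzer. It only shows why a field assigned in build() is still null after a reload-only startup, and how assigning it at construction time avoids that.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

/** Hypothetical helper: a trivial lowercase/whitespace "analyzer". */
final class Analyzers {
    private Analyzers() {}

    static Function<String, List<String>> lowercaseWhitespace() {
        return text -> Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }
}

/** Mirrors the reported behaviour: the analyzer is assigned only in build(). */
class BuildOnlyChecker {
    private Function<String, List<String>> analyzer; // null until build() runs

    void build() {
        analyzer = Analyzers.lowercaseWhitespace();
    }

    void reload() {
        // Stands in for re-opening an existing spelling index from disk.
        // Nothing here assigns the analyzer.
    }

    List<String> tokenize(String query) {
        // Throws NullPointerException if build() was never called.
        return analyzer.apply(query);
    }
}

/** Mirrors the suggested change: the analyzer is assigned at construction. */
class ConstructorInitChecker {
    private final Function<String, List<String>> analyzer;

    ConstructorInitChecker() {
        analyzer = Analyzers.lowercaseWhitespace();
    }

    void reload() {
        // Same stand-in as above; the analyzer is already set.
    }

    List<String> tokenize(String query) {
        return analyzer.apply(query);
    }
}

class AnalyzerInitSketch {
    public static void main(String[] args) {
        BuildOnlyChecker buildOnly = new BuildOnlyChecker();
        buildOnly.reload(); // reloaded after startup, never built
        try {
            buildOnly.tokenize("Helo World");
            System.out.println("build-only: no exception");
        } catch (NullPointerException e) {
            System.out.println("build-only: NullPointerException after reload without build");
        }

        ConstructorInitChecker ctorInit = new ConstructorInitChecker();
        ctorInit.reload();
        System.out.println("constructor-init: " + ctorInit.tokenize("Helo World"));
    }
}
```

Whether the real classes should set the analyzer in a constructor or in their init step depends on how Solr instantiates them; the sketch only captures the ordering problem.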