Date: Mon, 29 Jun 2015 14:21:02 -0400
Subject: solr suggester build issues
From: Rajesh Hazari
To: solr-user@lucene.apache.org

Solr: 4.9.x, with simple SolrCloud on Jetty
JDK: 1.7
Number of replicas: 4, one replica for each shard
Number of shards: 1

Hi All,

I have been facing the issues below with the Solr suggester introduced in 4.7.x. Does anyone have a good working solution?

The documentation (https://cwiki.apache.org/confluence/display/solr/Suggester) suggests not using the buildOnCommit=true property on an index with frequent soft commits. So we disabled it (buildOnCommit=false) and started using buildOnOptimize=true. That did not give us suggestions for the latest documents (with frequent soft commits), because there was hardly one optimize per day (we have the default optimize setting in solrconfig). So we disabled buildOnOptimize as well (buildOnOptimize=false).

As suggested in the documentation, as of now we have set up cron jobs that build the suggester every hour. These jobs are doing their job, i.e. we have the latest suggestions available every hour. Below are the issues we have with this implementation.

Issue #1: The suggest build URL, i.e. http://$solrnode:8983/solr/collection1/suggest?suggest.build=true, when issued to one replica of the SolrCloud, does not build the suggesters on all of the replicas in the SolrCloud.
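Put differently, every replica needs its own build request. A minimal sketch of what that amounts to is below; it is not something we run as-is. The host names, collection1, port 8983 and the suggest.build=true URL are the ones used in this message; the variable names (NODES, COLLECTION, PORT, RUN_BUILD) and function names (build_url, build_all) are made up for illustration.

```shell
#!/bin/sh
# Sketch: send the suggester build request to each Solr node in turn.
# NODES, COLLECTION, PORT and RUN_BUILD are illustrative names, not Solr settings.
NODES="${NODES:-solr1.aws.instance solr2.aws.instance}"
COLLECTION="${COLLECTION:-collection1}"
PORT="${PORT:-8983}"

# Print the suggest build URL for a single node (first argument).
build_url() {
    printf 'http://%s:%s/solr/%s/suggest?suggest.build=true\n' "$1" "$PORT" "$COLLECTION"
}

# Request a build on every node in NODES. A failure on one node is
# reported on stderr and the loop continues with the remaining nodes.
build_all() {
    rc=0
    for node in $NODES; do
        # -s: quiet, -f: non-zero exit on HTTP errors, -o: discard the body
        if ! curl -sf -o /dev/null "$(build_url "$node")"; then
            echo "suggest build failed on $node" >&2
            rc=1
        fi
    done
    return $rc
}

# Only contact the nodes when explicitly asked to, so the file can be
# sourced without side effects.
if [ "${RUN_BUILD:-0}" = "1" ]; then
    build_all
fi
```

Run from a single machine that can reach all nodes, an hourly crontab entry could call it with RUN_BUILD=1 set, with NODES listing all four replicas. This only checks the HTTP status of each request; it does not verify that the resulting suggesters are identical across nodes.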
Resolution: For this we have a separate cron job on each Solr instance that makes the build call to build the suggester. Below is a rough picture of this implementation (which is not the best implementation and has many flaws):

http://$solrnode:8983/solr/collection1/suggest?suggest.build=true
    |
    |-- suggestcron.job.sh (on solr1.aws.instance)

http://$solrnode:8983/solr/collection1/suggest?suggest.build=true
    |
    |-- suggestcron.job.sh (on solr2.aws.instance)

.......... similar for the other Solr nodes

We will be coming up with a single script to do this for all collections later.

We were a bit happy that we had an updated suggester on all of the instances, which turned out not to be the case!

Issue #2: The suggesters built on the Solr nodes were not consistent, as the Solr core in each replica differs in max docs and num docs (which is quite normal with frequent soft commits, when updates mostly rewrite the same documents with different data, I guess; correct me if I'm wrong).

When we query

curl -i "http://$solrnode:8983/solr/liveaodfuture/suggest?q=Nirvana&wt=json&indent=true"

one of the Solr nodes returns

{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "suggest":{
    "AnalyzingSuggester":{
      "Nirvana":{
        "numFound":1,
        "suggestions":[{
            "term":"nirvana",
            "weight":6,
            "payload":""}]}},
    "DictionarySuggester":{
      "Nirvana":{
        "numFound":0,
        "suggestions":[]}}}}

/admin/luke/collection/ call status:

"index":{
  "numDocs":90564,
  "maxDoc":94583,
  "deletedDocs":4019,
  .......}

while the other 3 Solr nodes return

{
  "responseHeader":{
    "status":0,
    "QTime":1},
  "suggest":{
    "AnalyzingSuggester":{
      "Nirvana":{
        "numFound":2,
        "suggestions":[{
            "term":"nirvana",
            "weight":163,
            "payload":""},
          {
            "term":"nirvana cover",
            "weight":11,
            "payload":""}]}},
    "DictionarySuggester":{
      "Nirvana":{
        "numFound":0,
        "suggestions":[]}}}}

/admin/luke/collection/ call status on the other 3 Solr nodes, which have a different maxDoc than the node above:

"index":{
  "numDocs":90564,
  "maxDoc":156760,
  ........}

When I check the build time of the suggest directory of the collection, each Solr node shows the same time:

ls -lah /mnt/solrdrive/solr/cores/*/data/suggest_analyzing/*
-rw-r--r-- 1 root root 3.0M May 20 16:00 /mnt/solrdrive/solr/cores/collection1_shard1_replica3/data/suggest_analyzing/wfsta.bin

Questions:

Does the suggester build URL, i.e. http://$solrnode:8983/solr/collection1/suggest?suggest.build=true, also consider max docs or deleted docs?

Is a suggester built via solr/collection1/suggest?suggest.build=true different from one built through the buildOnCommit=true property?

Does anyone have a better solution for keeping the suggester current with the contents of the index when soft commits are frequent?

Does Solr have any scheduler component, like a cron scheduler, to schedule the suggester build and a daily optimize?

Thanks,
Rajesh.