Sender: kamil.zyta@pwr.edu.pl
Date: Wed, 01 Apr 2015 17:04:34 +0200
From: Kamil Żyta
To: user@manifoldcf.apache.org
Subject: Re: MCF 2 and Solr Cloud 5
Message-id: <20150401150434.GL7782@spite.wcss.wroc.pl>

I created a new collection in Solr and configured MCF for this collection. I get
'Connection working', but I cannot see any /update requests from MCF in Solr, only:

INFO - 2015-04-01 15:03:16.442; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO - 2015-04-01 15:03:16.444; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
INFO - 2015-04-01 15:03:16.445; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO - 2015-04-01 15:03:16.445; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard1_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 3
INFO - 2015-04-01 15:03:16.448; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO - 2015-04-01 15:03:16.449; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
INFO - 2015-04-01 15:03:16.449; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO - 2015-04-01 15:03:16.450; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 2
INFO - 2015-04-01 15:03:16.456; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update/extract params={commit=true&wt=javabin&version=2} {commit=} 0 21

K

On Wed, Apr 01, 2015 at 10:53:39AM -0400, Karl Wright wrote:
> "When I put 'esci' as collection name I get a error.
> When I put 'collection1' I get 'Connection working' and no errors in logs but
> still no docs in solr."
>
> Hi Kamil,
> Do you get the exception when you use "collection1" as the collection
> name? If not, then here's what I recommend:
>
> (1) Look at the Solr logs. There should be an INFO message for each
> document posted. There is a URL in the message, and a document length, and
> a result. It would be great if you could include a couple of these for us
> to look at.
>
> (2) If there are any exceptions etc. in the Solr logs, please send those
> along as well.
>
> Offhand, this sounds like documents get posted properly but then ignored by
> Solr. There are a lot of potential reasons why that could be the case.
> But if the documents are getting ignored, or if Tika is not successfully
> extracting data, then we should be able to figure out why based on the Solr
> logs.
>
> Thanks,
> Karl
>
>
> On Wed, Apr 1, 2015 at 10:39 AM, Kamil Żyta wrote:
>
> > Ok, see my first mail. When I put 'esci' as collection name I get a error.
> > When I put 'collection1' I get 'Connection working' and no errors in logs but
> > still no docs in solr.
> >
> > K
> >
> > On Wed, Apr 01, 2015 at 10:27:50AM -0400, Karl Wright wrote:
> > > Hi Kamil,
> > >
> > > This is happening on the commit. It looks to me like it's because you are
> > > specifying a collection that doesn't actually exist:
> > >
> > > >>>>>>
> > > DocCollection col = getDocCollection(clusterState, collection);
> > >
> > > DocRouter router = col.getRouter();
> > > <<<<<<
> > >
> > > It's complaining because "col" is coming back null.
> > >
> > > Karl
> > >
> > >
> > > On Wed, Apr 1, 2015 at 10:19 AM, Kamil Żyta wrote:
> > >
> > > > ERROR 2015-04-01 16:09:24,032 (Job notification thread) - Unhandled
> > > > SolrServerException: java.lang.NullPointerException
> > > > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled
> > > > SolrServerException: java.lang.NullPointerException
> > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrServerException(HttpPoster.java:364)
> > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.commitPost(HttpPoster.java:308)
> > > >         at org.apache.manifoldcf.agents.output.solr.SolrConnector.noteJobComplete(SolrConnector.java:610)
> > > >         at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:121)
> > > > Caused by: org.apache.solr.client.solrj.SolrServerException:
> > > > java.lang.NullPointerException
> > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:873)
> > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:738)
> > > >         at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
> > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster$CommitThread.run(HttpPoster.java:1372)
> > > > Caused by: java.lang.NullPointerException
> > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:520)
> > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:892)
> > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:795)
> > > >         ... 3 more
> > > >
> > > > K
> > > >
> > > > On Wed, Apr 01, 2015 at 10:15:13AM -0400, Karl Wright wrote:
> > > > > Hi Kamil,
> > > > >
> > > > > So you are still seeing a NullPointerException from
> > > > > org.apache.solr.client.solrj.impl.CloudSolrClient? Can you provide the
> > > > > entire stack trace?
> > > > >
> > > > > Karl
> > > > >
> > > > >
> > > > > On Wed, Apr 1, 2015 at 10:10 AM, Kamil Żyta wrote:
> > > > >
> > > > > > Hi Karl,
> > > > > > same thing with trunk. Any advice?
> > > > > >
> > > > > > K
> > > > > >
> > > > > > On Wed, Apr 01, 2015 at 09:37:47AM -0400, Karl Wright wrote:
> > > > > > > Hi Kamil,
> > > > > > >
> > > > > > > Solrj 5.0 changed massively from Solrj 4.x. The work to use Solrj 5.0 has
> > > > > > > been done on trunk. You will need to check out and build trunk in order to
> > > > > > > use Solr 5.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Karl
> > > > > > >
> > > > > > > On Wed, Apr 1, 2015 at 9:23 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > > I set up solr 5 (Cloud) and mcf2, created core in solr with 2 shards and 2
> > > > > > > > replicas:
> > > > > > > > https://i.imgur.com/M05QTu7.png and created Output Connections in mcf.
> > > > > > > > When I put 'esci' in 'Collection name' I got error:
> > > > > > > > Threw exception: 'Unhandled SolrServerException: No live SolrServers
> > > > > > > > available to handle this request:[http://10.26.26.29:8983/solr/esci,
> > > > > > > > http://10.26.26.28:8983/solr/esci]'
> > > > > > > > When I leave 'Collection name' empty I have 'Connection working'.
> > > > > > > > Now when I start job, everything look good, worker fetch docs, etc
> > > > > > > > but I cannot see any docs in solr.
> > > > > > > > Nothing in logs except one line in worker console:
> > > > > > > > [Thread-6476596] ERROR org.apache.solr.client.solrj.impl.CloudSolrClient -
> > > > > > > > Request to collection failed due to (0) java.lang.NullPointerException,
> > > > > > > > retry? 0
> > > > > > > > thanks for the advice.
> > > > > > > >
> > > > > > > > K
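
For reference, the failure Karl describes earlier in this thread (a collection lookup that comes back null and is then dereferenced) can be modeled in a few lines of Java. This is only a sketch under stated assumptions: `ClusterStateStub`, `routerNameUnguarded`, and `routerNameGuarded` are hypothetical names invented here, not part of SolrJ or ManifoldCF, and the collection and router names are used purely for illustration. It shows why an unknown collection name surfaces as a bare NullPointerException, and how a guard could turn it into a readable error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CollectionLookupSketch {

    // Hypothetical stand-in for a cluster state: maps collection name -> router name.
    public static final class ClusterStateStub {
        private final Map<String, String> routers = new HashMap<>();

        public void addCollection(String name, String routerName) {
            routers.put(name, routerName);
        }

        // Returns null for an unknown collection, mirroring "col is coming back null".
        public String routerOrNull(String name) {
            return routers.get(name);
        }

        public Set<String> names() {
            return new TreeSet<>(routers.keySet());
        }
    }

    // Unguarded: dereferences the lookup result directly, so an unknown
    // collection name ends in a NullPointerException with no useful message.
    public static String routerNameUnguarded(ClusterStateStub state, String collection) {
        String router = state.routerOrNull(collection);
        return router.trim();
    }

    // Guarded: checks the lookup result and reports which collections are known.
    public static String routerNameGuarded(ClusterStateStub state, String collection) {
        String router = state.routerOrNull(collection);
        if (router == null) {
            throw new IllegalArgumentException("Collection '" + collection
                + "' not found; known collections: " + state.names());
        }
        return router.trim();
    }

    public static void main(String[] args) {
        ClusterStateStub state = new ClusterStateStub();
        state.addCollection("dysk", "compositeId"); // illustrative values only

        try {
            routerNameUnguarded(state, "collection1");
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup: NullPointerException");
        }

        try {
            routerNameGuarded(state, "collection1");
        } catch (IllegalArgumentException e) {
            System.out.println("guarded lookup: " + e.getMessage());
        }
    }
}
```

In practice the equivalent check is simply to confirm that the name entered in the MCF 'Collection name' field matches a collection that actually exists in the Solr Cloud cluster before starting the job.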