manifoldcf-user mailing list archives

From Kamil Żyta <kamil.z...@pwr.edu.pl>
Subject Re: MCF 2 and Solr Cloud 5
Date Wed, 01 Apr 2015 15:50:09 GMT
I see only start/access/stop activities. Access denied is normal in my setup.
So how can I debug the problem?

K

On Wed, Apr 01, 2015 at 08:32:42AM -0700, Karl Wright wrote:
> Hi Kamil,
> Can you look at the simple history report, to verify whether manifoldcf
> is even attempting to post documents? It is possible that the solr
> connector doesn't count a change in collection name as requiring a
> reindex.
> 
> Karl
> 
> Sent from my Windows Phone
> From: Kamil Żyta
> Sent: 4/1/2015 11:08 AM
> To: user@manifoldcf.apache.org
> Subject: Re: MCF 2 and Solr Cloud 5
> I created a new collection in Solr and configured MCF for it. The
> connection reports 'Connection working', but I cannot see any document
> /update requests from MCF in the Solr log, only commits:
> 
> INFO  - 2015-04-01 15:03:16.442;
> org.apache.solr.update.DirectUpdateHandler2; start
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> INFO  - 2015-04-01 15:03:16.444;
> org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
> Skipping IW.commit.
> INFO  - 2015-04-01 15:03:16.445; org.apache.solr.core.SolrCore;
> SolrIndexSearcher has not changed - not re-opening:
> org.apache.solr.search.SolrIndexSearcher
> INFO  - 2015-04-01 15:03:16.445;
> org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> INFO  - 2015-04-01 15:03:16.445;
> org.apache.solr.update.processor.LogUpdateProcessor;
> [dysk_shard1_replica1] webapp=/solr path=/update
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
> {commit=} 0 3
> INFO  - 2015-04-01 15:03:16.448;
> org.apache.solr.update.DirectUpdateHandler2; start
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> INFO  - 2015-04-01 15:03:16.449;
> org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
> Skipping IW.commit.
> INFO  - 2015-04-01 15:03:16.449; org.apache.solr.core.SolrCore;
> SolrIndexSearcher has not changed - not re-opening:
> org.apache.solr.search.SolrIndexSearcher
> INFO  - 2015-04-01 15:03:16.450;
> org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
> INFO  - 2015-04-01 15:03:16.450;
> org.apache.solr.update.processor.LogUpdateProcessor;
> [dysk_shard2_replica1] webapp=/solr path=/update
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}
> {commit=} 0 2
> INFO  - 2015-04-01 15:03:16.456;
> org.apache.solr.update.processor.LogUpdateProcessor;
> [dysk_shard2_replica1] webapp=/solr path=/update/extract
> params={commit=true&wt=javabin&version=2} {commit=} 0 21
> 
> K
> 
> On Wed, Apr 01, 2015 at 10:53:39AM -0400, Karl Wright wrote:
> > "When I put 'esci' as collection name I get an error.
> > When I put 'collection1' I get 'Connection working' and no errors in logs
> > but still no docs in solr."
> >
> > Hi Kamil,
> > Do you get the exception when you use "collection1" as the collection
> > name?  If not, then here's what I recommend:
> >
> > (1) Look at the Solr logs.  There should be an INFO message for each
> > document posted.  There is a URL in the message, and a document length, and
> > a result.  It would be great if you could include a couple of these for us
> > to look at.
> >
> > (2) If there are any exceptions etc. in the Solr logs, please send those
> > along as well.
> >
> > Offhand, this sounds like documents get posted properly but then ignored by
> > Solr.  There are a lot of potential reasons why that could be the case.
> > But if the documents are getting ignored, or if Tika is not successfully
> > extracting data, then we should be able to figure out why based on the Solr
> > logs.
> >
> > Thanks,
> > Karl
> >
> >
> >
> > On Wed, Apr 1, 2015 at 10:39 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
> >
> > > > Ok, see my first mail. When I put 'esci' as collection name I get an error.
> > > > When I put 'collection1' I get 'Connection working' and no errors in logs
> > > > but still no docs in solr.
> > >
> > > K
> > >
> > > On Wed, Apr 01, 2015 at 10:27:50AM -0400, Karl Wright wrote:
> > > > Hi Kamil,
> > > >
> > > > > This is happening on the commit.  It looks to me like it's because you are
> > > > > specifying a collection that doesn't actually exist:
> > > >
> > > > >>>>>>
> > > >     DocCollection col = getDocCollection(clusterState, collection);
> > > >
> > > >     DocRouter router = col.getRouter();
> > > > <<<<<<
> > > >
> > > > It's complaining because "col" is coming back null.
> > > >
> > > > Karl
> > > >
> > > >
> > > > > On Wed, Apr 1, 2015 at 10:19 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
> > > >
> > > > > > ERROR 2015-04-01 16:09:24,032 (Job notification thread) - Unhandled
> > > > > > SolrServerException: java.lang.NullPointerException
> > > > > > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled
> > > > > > SolrServerException: java.lang.NullPointerException
> > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrServerException(HttpPoster.java:364)
> > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.commitPost(HttpPoster.java:308)
> > > > > >         at org.apache.manifoldcf.agents.output.solr.SolrConnector.noteJobComplete(SolrConnector.java:610)
> > > > > >         at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:121)
> > > > > > Caused by: org.apache.solr.client.solrj.SolrServerException:
> > > > > > java.lang.NullPointerException
> > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:873)
> > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:738)
> > > > > >         at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
> > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster$CommitThread.run(HttpPoster.java:1372)
> > > > > > Caused by: java.lang.NullPointerException
> > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:520)
> > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:892)
> > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:795)
> > > > > >         ... 3 more
> > > > >
> > > > > K
> > > > >
> > > > > On Wed, Apr 01, 2015 at 10:15:13AM -0400, Karl Wright wrote:
> > > > > > Hi Kamil,
> > > > > >
> > > > > > So you are still seeing a NullPointerException from
> > > > > > > org.apache.solr.client.solrj.impl.CloudSolrClient?  Can you provide the
> > > > > > > entire stack trace?
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > >
> > > > > > > On Wed, Apr 1, 2015 at 10:10 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
> > > > > >
> > > > > > > Hi Karl,
> > > > > > > same thing with trunk. Any advice?
> > > > > > >
> > > > > > > K
> > > > > > >
> > > > > > > On Wed, Apr 01, 2015 at 09:37:47AM -0400, Karl Wright wrote:
> > > > > > > > Hi Kamil,
> > > > > > > >
> > > > > > > > > Solrj 5.0 changed massively from Solrj 4.x.  The work to use Solrj 5.0 has
> > > > > > > > > been done on trunk.  You will need to check out and build trunk in order to
> > > > > > > > > use Solr 5.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Karl
> > > > > > > >
> > > > > > > > > On Wed, Apr 1, 2015 at 9:23 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > > > I set up solr 5 (Cloud) and mcf2, created core in solr with 2 shards and 2
> > > > > > > > > > replicas:
> > > > > > > > > > https://i.imgur.com/M05QTu7.png and created Output Connections in mcf.
> > > > > > > > > > When I put 'esci' in 'Collection name' I got error:
> > > > > > > > > > Threw exception: 'Unhandled SolrServerException: No live SolrServers
> > > > > > > > > > available to handle this request:[http://10.26.26.29:8983/solr/esci,
> > > > > > > > > > http://10.26.26.28:8983/solr/esci]'
> > > > > > > > > > When I leave 'Collection name' empty I have 'Connection working'.
> > > > > > > > > > Now when I start job, everything look good, worker fetch docs, etc
> > > > > > > > > > but I cannot see any docs in solr. Nothing in logs except one line in
> > > > > > > > > > worker console:
> > > > > > > > > > [Thread-6476596] ERROR org.apache.solr.client.solrj.impl.CloudSolrClient -
> > > > > > > > > > Request to collection  failed due to (0) java.lang.NullPointerException,
> > > > > > > > > > retry? 0
> > > > > > > > > > thanks for the advice.
> > > > > > > > >
> > > > > > > > > K
> > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
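
Karl's diagnosis in the thread above is that the NullPointerException appears when the configured collection name does not exist in the cluster state. One way to check this, sketched here rather than taken from the thread, is to ask Solr which collections it knows about through the Collections API LIST action and compare that with the 'Collection name' entered in the MCF output connection. The class and method names below are hypothetical, the sample response is made up for illustration, and the name "dysk" is only inferred from the core names (dysk_shard1_replica1) in the Solr log. The response parsing is deliberately naive; a real check would use a JSON library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ManifoldCF or SolrJ.
public class CollectionCheck {

    // Builds the Collections API LIST URL for a Solr base URL such as
    // http://10.26.26.29:8983/solr (a trailing slash is tolerated).
    static String listUrl(String solrBaseUrl) {
        String base = solrBaseUrl.endsWith("/")
            ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
            : solrBaseUrl;
        return base + "/admin/collections?action=LIST&wt=json";
    }

    // Naively pulls the names out of the "collections" array of a LIST
    // response. Returns an empty list if the array is not found.
    static List<String> parseCollections(String json) {
        List<String> names = new ArrayList<>();
        Matcher array = Pattern
            .compile("\"collections\"\\s*:\\s*\\[(.*?)\\]", Pattern.DOTALL)
            .matcher(json);
        if (array.find()) {
            Matcher name = Pattern.compile("\"([^\"]+)\"").matcher(array.group(1));
            while (name.find()) {
                names.add(name.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed sample response; fetch the real one from listUrl(...) with
        // a browser or curl and paste it here.
        String sample =
            "{\"responseHeader\":{\"status\":0,\"QTime\":1},\"collections\":[\"dysk\"]}";
        List<String> names = parseCollections(sample);

        System.out.println("Query: " + listUrl("http://10.26.26.29:8983/solr"));
        System.out.println("esci exists: " + names.contains("esci"));
        System.out.println("dysk exists: " + names.contains("dysk"));
    }
}
```

If the name configured in MCF is not in the returned list, the missing-collection explanation given for the NullPointerException would fit; if it is in the list, the simple history report and the Solr logs remain the places to look.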
