manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: MCF 2 and Solr Cloud 5
Date Wed, 01 Apr 2015 17:01:45 GMT
The button works fine.  So the problem must be on the repository side.

Karl
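
A hedged aside on reading the Solr log excerpt quoted further down: every
LogUpdateProcessor entry in it ends in `{commit=}`, with no add operations,
which is consistent with no documents being posted at all. The sketch below
counts those operation summaries in a chunk of log text. It is illustrative
only: `count_update_operations` is a hypothetical helper, and the
`{add=[...]}` form is an assumption about how a successful post would be
logged, not something shown in this thread.

```python
import re

# Matches the operation summary at the end of a LogUpdateProcessor line,
# e.g. "{commit=} 0 21". The "add" and "delete" alternatives are assumptions;
# only the "commit" form appears in the logs posted to this thread.
SUMMARY = re.compile(r"\{(add|delete|commit)=[^}]*\}\s+\d+\s+\d+")


def count_update_operations(log_text):
    """Count add/delete/commit summaries found in a chunk of Solr log text."""
    counts = {"add": 0, "delete": 0, "commit": 0}
    for match in SUMMARY.finditer(log_text):
        counts[match.group(1)] += 1
    return counts


# One entry copied from the thread (timestamp and logger name omitted).
sample = (
    "[dysk_shard2_replica1] webapp=/solr path=/update/extract "
    "params={commit=true&wt=javabin&version=2} {commit=} 0 21"
)

print(count_update_operations(sample))
```

If the "add" count stays at zero across a whole crawl, the place to look is
the ManifoldCF side (simple history report, repository connection) rather
than Solr.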


On Wed, Apr 1, 2015 at 12:56 PM, Karl Wright <daddywri@gmail.com> wrote:

> If your simple history shows no documents being processed or indexed, then
> that's the problem, or at least one of them.
>
> I will try to confirm that the reindex button still works as it should.
>
> Karl
>
>
> On Wed, Apr 1, 2015 at 12:43 PM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
>
>> On Wed, Apr 01, 2015 at 12:07:47PM -0400, Karl Wright wrote:
>> > Hi Kamil,
>> >
>> > If no attempts are being made to actually index documents, then no
>> > documents will be indexed.
>> >
>> > (1) What repository connection is this?  Can you try something simple
>> > first, like indexing from the file system?
>>
>> I use cifs; in 'Status and Job Management' Documents/Processed is 2598,
>> so I think it can reach the files, but I can try with the 'File systems' connector.
>>
>> > (2) I have confirmed that changing the collection does NOT trigger
>> > reindexing of documents.  That is a bug, but you can work around it by
>> > clicking the "Reindex all documents" button on the output connection's
>> > view page after every change to the collection name.  Did you click that
>> > button?
>>
>> yes, I clicked that button many times.
>>
>> K
>>
>> >
>> >
>> > On Wed, Apr 1, 2015 at 11:50 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
>> >
>> > > I see only start/access/stop activities. Access denied is normal in my
>> > > setup.
>> > > So how can I debug the problem?
>> > >
>> > > K
>> > >
>> > > On Wed, Apr 01, 2015 at 08:32:42AM -0700, Karl Wright wrote:
>> > > > Hi Kamil,
>> > > > Can you look at the simple history report, to verify whether
>> > > > manifoldcf is even attempting to post documents? It is possible that
>> > > > the solr connector doesn't count a change in collection name as
>> > > > requiring a reindex.
>> > > >
>> > > > Karl
>> > > >
>> > > > Sent from my Windows Phone
>> > > > From: Kamil Żyta
>> > > > Sent: 4/1/2015 11:08 AM
>> > > > To: user@manifoldcf.apache.org
>> > > > Subject: Re: MCF 2 and Solr Cloud 5
>> > > > I created a new collection in solr and configured mcf for this
>> > > > collection: 'Connection working', but I cannot see any /update
>> > > > request from mcf in solr, only:
>> > > >
>> > > > INFO  - 2015-04-01 15:03:16.442; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>> > > > INFO  - 2015-04-01 15:03:16.444; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
>> > > > INFO  - 2015-04-01 15:03:16.445; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
>> > > > INFO  - 2015-04-01 15:03:16.445; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
>> > > > INFO  - 2015-04-01 15:03:16.445; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard1_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 3
>> > > > INFO  - 2015-04-01 15:03:16.448; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>> > > > INFO  - 2015-04-01 15:03:16.449; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
>> > > > INFO  - 2015-04-01 15:03:16.449; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
>> > > > INFO  - 2015-04-01 15:03:16.450; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
>> > > > INFO  - 2015-04-01 15:03:16.450; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://10.26.26.29:8983/solr/dysk_shard2_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false} {commit=} 0 2
>> > > > INFO  - 2015-04-01 15:03:16.456; org.apache.solr.update.processor.LogUpdateProcessor; [dysk_shard2_replica1] webapp=/solr path=/update/extract params={commit=true&wt=javabin&version=2} {commit=} 0 21
>> > > >
>> > > > K
>> > > >
>> > > > On Wed, Apr 01, 2015 at 10:53:39AM -0400, Karl Wright wrote:
>> > > > > "When I put 'esci' as collection name I get an error.
>> > > > > When I put 'collection1' I get 'Connection working' and no errors in logs
>> > > > > but still no docs in solr."
>> > > > >
>> > > > > Hi Kamil,
>> > > > > Do you get the exception when you use "collection1" as the collection
>> > > > > name?  If not, then here's what I recommend:
>> > > > >
>> > > > > (1) Look at the Solr logs.  There should be an INFO message for each
>> > > > > document posted.  There is a URL in the message, and a document length,
>> > > > > and a result.  It would be great if you could include a couple of these
>> > > > > for us to look at.
>> > > > >
>> > > > > (2) If there are any exceptions etc. in the Solr logs, please send those
>> > > > > along as well.
>> > > > >
>> > > > > Offhand, this sounds like documents get posted properly but then ignored
>> > > > > by Solr.  There are a lot of potential reasons why that could be the
>> > > > > case.  But if the documents are getting ignored, or if Tika is not
>> > > > > successfully extracting data, then we should be able to figure out why
>> > > > > based on the Solr logs.
>> > > > >
>> > > > > Thanks,
>> > > > > Karl
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Apr 1, 2015 at 10:39 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
>> > > > >
>> > > > > > Ok, see my first mail. When I put 'esci' as collection name I get an error.
>> > > > > > When I put 'collection1' I get 'Connection working' and no errors in logs
>> > > > > > but still no docs in solr.
>> > > > > >
>> > > > > > K
>> > > > > >
>> > > > > > On Wed, Apr 01, 2015 at 10:27:50AM -0400, Karl Wright wrote:
>> > > > > > > Hi Kamil,
>> > > > > > >
>> > > > > > > This is happening on the commit.  It looks to me like it's because
>> > > > > > > you are specifying a collection that doesn't actually exist:
>> > > > > > >
>> > > > > > > >>>>>>
>> > > > > > >     DocCollection col = getDocCollection(clusterState, collection);
>> > > > > > >
>> > > > > > >     DocRouter router = col.getRouter();
>> > > > > > > <<<<<<
>> > > > > > >
>> > > > > > > It's complaining because "col" is coming back null.
>> > > > > > >
>> > > > > > > Karl
>> > > > > > >
>> > > > > > >
>> > > > > > > On Wed, Apr 1, 2015 at 10:19 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
>> > > > > > >
>> > > > > > > > ERROR 2015-04-01 16:09:24,032 (Job notification thread) - Unhandled SolrServerException: java.lang.NullPointerException
>> > > > > > > > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unhandled SolrServerException: java.lang.NullPointerException
>> > > > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.handleSolrServerException(HttpPoster.java:364)
>> > > > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster.commitPost(HttpPoster.java:308)
>> > > > > > > >         at org.apache.manifoldcf.agents.output.solr.SolrConnector.noteJobComplete(SolrConnector.java:610)
>> > > > > > > >         at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:121)
>> > > > > > > > Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException
>> > > > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:873)
>> > > > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:738)
>> > > > > > > >         at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
>> > > > > > > >         at org.apache.manifoldcf.agents.output.solr.HttpPoster$CommitThread.run(HttpPoster.java:1372)
>> > > > > > > > Caused by: java.lang.NullPointerException
>> > > > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:520)
>> > > > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:892)
>> > > > > > > >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:795)
>> > > > > > > >         ... 3 more
>> > > > > > > >
>> > > > > > > > K
>> > > > > > > >
>> > > > > > > > On Wed, Apr 01, 2015 at 10:15:13AM -0400, Karl Wright wrote:
>> > > > > > > > > Hi Kamil,
>> > > > > > > > >
>> > > > > > > > > So you are still seeing a NullPointerException from
>> > > > > > > > > org.apache.solr.client.solrj.impl.CloudSolrClient?  Can you provide
>> > > > > > > > > the entire stack trace?
>> > > > > > > > >
>> > > > > > > > > Karl
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > On Wed, Apr 1, 2015 at 10:10 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Karl,
>> > > > > > > > > > same thing with trunk. Any advice?
>> > > > > > > > > >
>> > > > > > > > > > K
>> > > > > > > > > >
>> > > > > > > > > > On Wed, Apr 01, 2015 at 09:37:47AM -0400, Karl Wright wrote:
>> > > > > > > > > > > Hi Kamil,
>> > > > > > > > > > >
>> > > > > > > > > > > Solrj 5.0 changed massively from Solrj 4.x.  The work to use Solrj 5.0
>> > > > > > > > > > > has been done on trunk.  You will need to check out and build trunk in
>> > > > > > > > > > > order to use Solr 5.
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks,
>> > > > > > > > > > > Karl
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, Apr 1, 2015 at 9:23 AM, Kamil Żyta <kamil.zyta@pwr.edu.pl> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi,
>> > > > > > > > > > > > I set up solr 5 (Cloud) and mcf2, created a core in solr with 2 shards
>> > > > > > > > > > > > and 2 replicas: https://i.imgur.com/M05QTu7.png and created Output
>> > > > > > > > > > > > Connections in mcf.
>> > > > > > > > > > > > When I put 'esci' in 'Collection name' I got an error:
>> > > > > > > > > > > > Threw exception: 'Unhandled SolrServerException: No live SolrServers
>> > > > > > > > > > > > available to handle this request:[http://10.26.26.29:8983/solr/esci,
>> > > > > > > > > > > > http://10.26.26.28:8983/solr/esci]'
>> > > > > > > > > > > > When I leave 'Collection name' empty I have 'Connection working'.
>> > > > > > > > > > > > Now when I start the job, everything looks good, the worker fetches docs,
>> > > > > > > > > > > > etc., but I cannot see any docs in solr. Nothing in the logs except one
>> > > > > > > > > > > > line in the worker console:
>> > > > > > > > > > > > [Thread-6476596] ERROR org.apache.solr.client.solrj.impl.CloudSolrClient - Request to collection  failed due to (0) java.lang.NullPointerException, retry? 0
>> > > > > > > > > > > > thanks for the advice.
>> > > > > > > > > > > >
>> > > > > > > > > > > > K
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > >
>> > > > > >
>> > >
>>
>
>
