manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: ManifoldCF + Solr. No content field showing up in Solr
Date Wed, 16 Dec 2015 16:02:02 GMT
Hi Stephen,

That endpoint is meant to work with Solr Cell, which includes Tika.  My
guess is that you don't have Solr Cell configured or properly installed,
which is why the content field isn't getting populated.  The Solr logs
should give you some feedback if that's the case.

Karl
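
One way to check whether Solr Cell is answering at that endpoint, independently of ManifoldCF, is to post a small document to it directly with `extractOnly=true`. With that parameter Solr Cell returns the Tika output and indexes nothing. The sketch below is only an illustration: the helper names `build_extract_url` and `probe_extract_handler`, the host, and the collection name are assumptions, not anything taken from this thread.

```python
import json
import urllib.parse
import urllib.request


def build_extract_url(solr_base, collection, handler="/update/extract"):
    """Build a Solr Cell URL that extracts text without indexing anything."""
    # extractOnly=true asks Solr Cell to return the Tika output instead of
    # adding a document; wt=json requests a JSON response.
    query = urllib.parse.urlencode({"extractOnly": "true", "wt": "json"})
    return f"{solr_base.rstrip('/')}/{collection}{handler}?{query}"


def probe_extract_handler(url, payload, content_type="text/html"):
    """POST a small document to the handler and return the parsed JSON reply."""
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": content_type}
    )
    # An HTTPError raised here (for example 404 or 500) suggests the handler
    # is not registered or its classes failed to load; check the Solr log.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `probe_extract_handler(build_extract_url("http://localhost:8983/solr", "mycollection"), b"<html><body>Hello</body></html>")` with your own host and collection should return the extracted text if Solr Cell is installed. If extraction works here but the field is still missing after a crawl, the handler's `fmap.content` setting in `solrconfig.xml` is worth checking, since it decides which field receives the extracted body.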


On Wed, Dec 16, 2015 at 10:59 AM, Corey, Stephen <COREYS@ecu.edu> wrote:

> The default for that connector, which is '/update/extract'.
>
>
> Stephen Corey
> Technology Consultant
> East Carolina University
> 252-737-2541
> coreys@ecu.edu
>
> ________________________________________
> From: Karl Wright [daddywri@gmail.com]
> Sent: Wednesday, December 16, 2015 8:40 AM
> To: user@manifoldcf.apache.org
> Subject: Re: ManifoldCF + Solr. No content field showing up in Solr
>
> Hi Stephen,
>
> Which update handler do you have your solr output connector configured to
> use?
>
> Karl
>
>
> On Wed, Dec 16, 2015 at 8:37 AM, Corey, Stephen <COREYS@ecu.edu> wrote:
> I'm using MCF 2.1, and Solr 5.3.1, running in cloud mode. I'm using the
> web connector in MCF to crawl a website, and output using the Solr
> connector. Both applications are running on the same (RHEL) machine. The
> crawling seems to run fine, and I get all the documents showing up in Solr,
> except that the "content" field never gets added to Solr. I'm using the
> schemaless mode in Solr, so it'll add any fields that MCF sends to it. I'm
> not sure why the content field is missing. I've
> added the field manually to Solr, and it still never gets populated. I've
> also tried adding a Tika transformation connector, and specified "extract
> everything" with the boilerplate setting, and still no luck.
>
> I think I'm missing something very simple, but what is it?
>
> Thanks, all
>
