manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: ManifoldCF status of index
Date Mon, 14 Jul 2014 09:32:54 GMT
Hi Smitha,

Please send inquiries of this kind to user@manifoldcf.apache.org.

There are numerous resources describing the ManifoldCF schema, including
chapters 10, 11, and 12 of ManifoldCF In Action:
https://manifoldcfinaction.googlecode.com/svn/trunk/pdfs/

Quick answers:

(1) Yes, ManifoldCF keeps track of document status in the database.
(2) Yes, ManifoldCF knows whether the document was successfully indexed or
not.
(3) If indexing fails for some reason, ManifoldCF is aware of that and will
retry the document for a while (according to the nature of the failure and
the way the connector deals with that particular failure), and if the
problem persists, will either give up on the document, or abort the job.
The output connector implementation determines the exact path here.
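
To make point (3) concrete, here is a rough sketch of that kind of decision logic in Python. It is not ManifoldCF code: the names `Outcome`, `Failure`, `decide`, and the `abort_on_give_up` flag are hypothetical, and the assumption that a non-transient failure is never retried is a simplification made only for this illustration. The real behavior depends on the output connector.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    INDEXED = "indexed"          # the index accepted the document
    RETRY_LATER = "retry_later"  # transient problem, try again later
    SKIP_DOCUMENT = "skip"       # give up on this one document
    ABORT_JOB = "abort_job"      # stop the whole job


@dataclass
class Failure:
    transient: bool          # e.g. a network timeout (assumed classification)
    abort_on_give_up: bool   # hypothetical connector-specific choice


def decide(failure: Optional[Failure], attempts: int, max_attempts: int) -> Outcome:
    """Pick the next step after one indexing attempt.

    `failure` is None when the attempt succeeded. `attempts` counts the
    attempts made so far, including the one that just finished.
    """
    if failure is None:
        return Outcome.INDEXED
    # Transient failures are retried until the attempt budget is used up.
    if failure.transient and attempts < max_attempts:
        return Outcome.RETRY_LATER
    # The problem persisted (or was never retryable in this sketch):
    # either drop the document or abort the job.
    if failure.abort_on_give_up:
        return Outcome.ABORT_JOB
    return Outcome.SKIP_DOCUMENT
```

For example, `decide(Failure(transient=True, abort_on_give_up=False), attempts=1, max_attempts=3)` asks for a retry, while the same failure on the third attempt gives up on the document only.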

If, however, you SPECIFICALLY break an index by, say, deleting it, or
changing the Solr schema, you have to tell MCF that you changed things out
from under it.  See the end-user documentation for a description of the
buttons available on every output connection view page for dealing with
modifications of this kind.

Thanks,
Karl



On Mon, Jul 14, 2014 at 3:10 AM, Smitha S <Smitha_S09@infosys.com> wrote:

>  Hi Karl,
>
>
>
> I am using ManifoldCF for crawling and Solr for indexing a set of documents.
>
>
>
> Could you please clarify some of my questions regarding MCF?
>
> · Once MCF sends the documents to Solr for indexing, does it track the
> indexing status anywhere in the DB?
>
> · Will MCF know whether a document was successfully indexed or not?
>
> · If indexing fails for some reason (maybe network issues), will MCF
> pick up the document again?
>
>
>
> Thanks & Regards,
>
> Smitha S
>
