manifoldcf-user mailing list archives

From Marc Emery <marc.em...@valtech.com>
Subject RE: Custom Transfo Connector - Strange behaviour
Date Mon, 03 Oct 2016 15:12:08 GMT
Hi Karl,

You’re right, removing the associated records forced the complete pipeline to run again.
I will investigate this tomorrow and keep you informed.

Thanks a lot
marc



From: Karl Wright [mailto:daddywri@gmail.com]
Sent: Monday, 3 October 2016 16:57
To: user@manifoldcf.apache.org
Subject: Re: Custom Transfo Connector - Strange behaviour

Hi Marc,

Sounds like you are running into the incremental nature of the platform.

The framework keeps track of a "version string" for each document from each connector involved
in the pipeline.  If the version string differs, then the framework knows that it must continue
pushing the document down the pipeline.  If not, then the framework may conclude that it is
unnecessary to continue.

I would look at how other similar transformation connectors handle the version string that
they return.  I suspect that your code may be missing a subtlety there.  You can also confirm
this picture by going to the output connection's view page and clicking the appropriate "forget"
button and running the job again. If you see ingestions, you will know that you have connector
problems that prevent MCF from doing its incremental logic properly.
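
As a rough illustration of the incremental logic described above, here is a minimal sketch. It is not ManifoldCF's actual API: the class IncrementalSketch, its methods, and the version-string values are all hypothetical, and the real framework tracks considerably more state. It only shows why an unchanged version string causes the downstream stages to be skipped, and why "forget" forces a full pass.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of incremental processing; NOT ManifoldCF's real classes.
class IncrementalSketch {
    // Last version string recorded per document for one output connection.
    private final Map<String, String> ingested = new HashMap<>();

    // Combine the version contribution of every pipeline stage into one string.
    // (Assumed format, chosen only for this illustration.)
    static String combine(String... stageVersions) {
        return String.join("+", stageVersions);
    }

    // Returns true when the document must be pushed down the pipeline again.
    boolean processIfChanged(String docId, String versionString) {
        String previous = ingested.get(docId);
        if (versionString.equals(previous)) {
            return false; // unchanged: downstream stages are skipped
        }
        ingested.put(docId, versionString);
        return true;
    }

    // Analogous to the "forget" button: drop everything recorded for the output.
    void forgetAll() {
        ingested.clear();
    }

    public static void main(String[] args) {
        IncrementalSketch output = new IncrementalSketch();
        // Made-up per-stage version contributions.
        String v = combine("web:checksum-abc", "biblio:config-1", "tika:config-1");
        System.out.println(output.processIfChanged("doc-1", v)); // true  (first run)
        System.out.println(output.processIfChanged("doc-1", v)); // false (same version)
        output.forgetAll();
        System.out.println(output.processIfChanged("doc-1", v)); // true  (after forget)
    }
}
```

In this model, a stage whose contribution never changes, even when its configuration or the document does, would make the second run look like a no-op, which matches the symptom of fetches with no downstream activity.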

Please let me know what you find.

Thanks,
Karl


On Mon, Oct 3, 2016 at 10:40 AM, Marc Emery <marc.emery@valtech.com> wrote:
Hi,

First of all, thanks for this amazing framework!

I’m running a command-driven multi-process ManifoldCF 2.4, with a custom transformation
connector deployed in /connector-lib.
Once it is registered, I add the connector as the first stage after a web connector.
Everything runs fine the first time,

Start time               Activity                  Identifier           Result code  Bytes  Time
10-03-2016 14:35:01.171  document ingest (Solr)    https://library....  OK           0      11
10-03-2016 14:35:01.162  extract [transfo tika]    https://library...   OK           0      3
10-03-2016 14:35:01.151  enhance [transfo biblio]  https://library...   ACCEPTED     0      35
10-03-2016 14:35:01.150  process                   https://library....  OK           12815  38
10-03-2016 14:35:00.009  fetch                     https://library...   200          12815  1136



but on subsequent runs, processing of each URL stops after a successful fetch, without
reaching the downstream connectors.

Start time               Activity  Identifier          Result code  Bytes  Time
10-03-2016 16:06:01.085  fetch     https://library...  200          13992  1250
10-03-2016 16:05:56.084  fetch     https://library...  200          15505  1090
10-03-2016 16:05:51.084  fetch     https://library...  200          12876  922




I can’t see any errors in the logs.

How can I debug this? Thanks for your help.


Regards
marc
