manifoldcf-user mailing list archives

From Jitu <>
Subject Re: question regarding manifoldcf
Date Tue, 29 Jul 2014 19:22:10 GMT
I have checked out trunk from the location below and made the build, but I can
still see it crawling the same file again and again.

svn checkout mcf-trunk

My configuration :
Nuxeo input connector

Max connections:    10
Connection type:    CMIS
Authority group:    None (global authority)

*Output connector:* Solr connector with max connections 10. As far as I
know, the output connector has no information about whether it is the same file or not.

*Job configuration:*
Priority:     5
Start method:     Start at beginning of schedule window
Schedule type:     Rescan documents dynamically
Minimum recrawl interval:     10 minutes
Maximum recrawl interval:     Infinity
Expiration interval:     Infinity
Reseed interval:     60 minutes
No scheduled run times
No forced metadata
Maximum hop count for link type 'child':     Unlimited
Hop count mode:     Delete unreachable documents

I have only one file in my Nuxeo repository, and every 10 minutes I see the
same file sent to the output connector again. That is, the call goes
to the addOrReplaceDocument method inside the output connector even though
the file in the Nuxeo repository has not changed.
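
The behaviour expected here can be sketched as follows. This is only a minimal illustration of version-based change detection, not ManifoldCF's actual implementation: the class name VersionCheckSketch, the process method, and the use of a timestamp string as the "version" are all hypothetical and chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from ManifoldCF.
class VersionCheckSketch {
    // Version string recorded the last time each document was sent on.
    private final Map<String, String> lastIndexedVersion = new HashMap<>();
    // Documents actually handed to the (simulated) output side.
    final List<String> sentToOutput = new ArrayList<>();

    // Processes one document during a crawl pass.
    // Returns true only if the document was handed to the output side.
    boolean process(String documentId, String currentVersion) {
        String previous = lastIndexedVersion.get(documentId);
        if (currentVersion.equals(previous)) {
            // Same version as the last pass: skip the output call.
            return false;
        }
        // New or changed document: send it on and remember its version.
        sentToOutput.add(documentId);
        lastIndexedVersion.put(documentId, currentVersion);
        return true;
    }

    public static void main(String[] args) {
        VersionCheckSketch crawler = new VersionCheckSketch();
        // First pass: the document has never been seen, so it is sent.
        System.out.println(crawler.process("doc-1", "2014-07-29T10:00:00Z")); // true
        // Second pass, 10 minutes later, file untouched: skipped.
        System.out.println(crawler.process("doc-1", "2014-07-29T10:00:00Z")); // false
        // Third pass after an edit in the repository: sent again.
        System.out.println(crawler.process("doc-1", "2014-07-29T10:25:00Z")); // true
    }
}
```

With a check of this kind, a recrawl every 10 minutes would still visit the file, but the call into the output connector would happen only when the recorded version differs.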


On Tue, Jul 29, 2014 at 11:27 PM, Jitu <> wrote:

> Thanks Karl and Prasad. It's great to hear back so quickly. Thanks for the
> info; it really helped me.
> Thanks for the support
> Regards,
> Jitu
> On Tue, Jul 29, 2014 at 10:41 PM, Karl Wright <> wrote:
>> Hi Jitu,
>> The bug is that the CMIS and Alfresco connectors reindexed documents even
>> though they had not changed.  This is now corrected.
>> Karl
>> On Tue, Jul 29, 2014 at 12:28 PM, Jitu <> wrote:
>>> Hi Prasad,
>>> Thanks for the reply. The bug says "The CMIS and Alfresco
>>> connectors currently do not look at scanOnly but should". Does that mean
>>> the CMIS and Alfresco connectors crawl all the files and hand them over
>>> to the output connector whether or not they are modified? Ideally it
>>> should crawl a file only if it has been modified. Am I correct?
>>> regards,
>>> jitu
>>> On Tue, Jul 29, 2014 at 9:19 PM, Paththamestrige Perera <> wrote:
>>>> Hello Jitu, I had the same issue, and it was fixed with CONNECTORS-994
>>>> <> for the MCF
>>>> If you check out the mcf-trunk, it will work as expected.
>>>> On Tue, Jul 29, 2014 at 11:31 AM, Jitu <> wrote:
>>>>> Hi,
>>>>> I am a freelancer. For my current project I am using the ManifoldCF
>>>>> framework, where I need to pull documents from a CMIS repository and
>>>>> output them to the Solr connector.
>>>>> But I noticed that when I set the job type to continuous, it crawls all
>>>>> the files every time, whether or not they are modified. My
>>>>> requirement is to crawl files again only if they have been modified.
>>>>> How can I do that with ManifoldCF?
>>>>> Regards,
>>>>> abjitu
