I checked out trunk from the location below and built it, but I can still see it crawling the same file again and again.

svn checkout http://svn.apache.org/repos/asf/manifoldcf/trunk mcf-trunk

My configuration :
Nuxeo input connector


Max connections:    10
Connection type:    CMIS        
Authority group:    None (global authority)
Parameters:    
username=Administrator
password=********
binding=atom
protocol=http
server=localhost
port=8080
path=/nuxeo/atom/cmis
repositoryId=

Output connector: Solr connector with max connections 10. As far as I know, the output connector has no information about whether it is the same file or a different one.

Job configuration:
Priority:     5    
Start method:     Start at beginning of schedule window
Schedule type:     Rescan documents dynamically
Minimum recrawl interval:     10 minutes    
Maximum recrawl interval:     Infinity
Expiration interval:     Infinity    
Reseed interval:     60 minutes
No scheduled run times
No forced metadata
Maximum hop count for link type 'child':     Unlimited
Hop count mode:     Delete unreachable documents


I have only one file in my Nuxeo repository, and I see that every 10 minutes the same file is sent to the output connector again. That is, the addOrReplaceDocument method of the output connector is called even though the file in the Nuxeo repository has not changed.

Regards,
Jitu



On Tue, Jul 29, 2014 at 11:27 PM, Jitu <abjitu@gmail.com> wrote:
Thanks, Karl and Prasad. It's great to hear back so quickly. Thanks for the info; it really helped me.

Thanks for the support

Regards,
Jitu


On Tue, Jul 29, 2014 at 10:41 PM, Karl Wright <daddywri@gmail.com> wrote:
Hi Jitu,

The bug is that the CMIS and Alfresco connectors reindexed documents even though they had not changed.  This is now corrected.
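
To illustrate the behaviour described above, here is a minimal sketch of a connector loop that honours a per-document scanOnly flag. It is not the actual CMIS connector code or the CONNECTORS-994 patch: the class ScanOnlySketch and the method documentsToIngest are hypothetical names made up for this example, and it assumes that a flag value of true marks a document that should only be scanned for references and not handed to the output connector again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained illustration; not ManifoldCF source code.
final class ScanOnlySketch {

    // Returns the identifiers that would be handed to the output connector.
    // Assumption: scanOnly[i] == true means document i is unchanged and
    // should not be re-ingested.
    static List<String> documentsToIngest(String[] documentIds, boolean[] scanOnly) {
        if (documentIds.length != scanOnly.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        List<String> toIngest = new ArrayList<>();
        for (int i = 0; i < documentIds.length; i++) {
            if (scanOnly[i]) {
                // Unchanged document: skip ingestion entirely.
                continue;
            }
            toIngest.add(documentIds[i]);
        }
        return toIngest;
    }

    public static void main(String[] args) {
        String[] ids = {"doc-1", "doc-2", "doc-3"};
        boolean[] scanOnly = {true, false, true};
        // Only doc-2 is marked as changed, so this prints [doc-2].
        System.out.println(documentsToIngest(ids, scanOnly));
    }
}
```

A connector that ignores the flag would hand all three documents to the output connector on every pass, which matches the repeated addOrReplaceDocument calls reported in this thread.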

Karl



On Tue, Jul 29, 2014 at 12:28 PM, Jitu <abjitu@gmail.com> wrote:
Hi Prasad,
Thanks for the reply. The bug says "The CMIS and Alfresco connectors currently do not look at scanOnly but should". Does that mean the CMIS and Alfresco connectors crawl all the files and hand them over to the output connector whether or not they have been modified? Ideally a file should be crawled only if it has been modified. Am I correct?

Regards,
Jitu





On Tue, Jul 29, 2014 at 9:19 PM, Paththamestrige Perera <prasad.srimal.perera@gmail.com> wrote:
Hello Jitu, I had the same issue; it was fixed by CONNECTORS-994 for MCF 1.7.
If you check out mcf-trunk, it will work as expected.



On Tue, Jul 29, 2014 at 11:31 AM, Jitu <abjitu@gmail.com> wrote:
Hi,

I am a freelancer. For my current project I am using the ManifoldCF framework, where I need to pull documents from a CMIS repository and send them to the Solr output connector.

But I noticed that when I set the job type to continuous, it crawls all the files every time, whether or not they have been modified. My requirement is to recrawl files only if they have been modified.

How can I do this with ManifoldCF?

Regards,
abjitu