manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)"
Subject [jira] [Created] (CONNECTORS-1110) Component model doesn't seem to work properly - deletes components it shouldn't
Date Fri, 21 Nov 2014 13:06:34 GMT
Karl Wright created CONNECTORS-1110:

             Summary: Component model doesn't seem to work properly - deletes components it shouldn't
                 Key: CONNECTORS-1110
             Project: ManifoldCF
          Issue Type: Bug
          Components: Framework crawler agent
    Affects Versions: ManifoldCF 1.7.2
            Reporter: Karl Wright
            Assignee: Karl Wright
             Fix For: ManifoldCF 1.8, ManifoldCF 2.0

From a user:

"I wrote a simple TestConnector:
The processDocuments method looks like:

    public void processDocuments(String[] documentIdentifiers,
            String[] versions, IProcessActivity activities,
            DocumentSpecification spec, boolean[] scanOnly)
            throws ManifoldCFException, ServiceInterruption {

        int i = 0;
        for (String identifier : documentIdentifiers) {

            byte[] content1 = "test content 1".getBytes();
            byte[] content2 = "test content 2".getBytes();
            byte[] content3 = "test content 3".getBytes();

            RepositoryDocument rd1 = new RepositoryDocument();
            rd1.setBinary(new ByteArrayInputStream(content1), content1.length);

            RepositoryDocument rd2 = new RepositoryDocument();
            rd2.setBinary(new ByteArrayInputStream(content2), content2.length);

            RepositoryDocument rd3 = new RepositoryDocument();
            rd3.setBinary(new ByteArrayInputStream(content3), content3.length);

            System.out.println("process " + identifier);

            try {
                activities.ingestDocumentWithException(identifier, "comp1", versions[i], identifier+"/comp1", rd1);
                activities.ingestDocumentWithException(identifier, "comp2", versions[i], identifier+"/comp2", rd2);
                activities.ingestDocumentWithException(identifier, "comp3", versions[i], identifier+"/comp3", rd3);
            } catch (IOException e) {



For seeding, the getDocumentIdentifiers() method returns a stream with a single document identifier.
Full code is available at [1].

But subsequent calls to ingestDocumentWithException result in the deletion of a previously added component:

job end             1416573910333(copmtest1)        0   1
document ingest     testidentifier1/comp3      OK   12  8
document deletion   testidentifier1/comp1      OK   0   2
document ingest     testidentifier1/comp2      OK   12  10
document deletion   testidentifier1/comp1      OK   0   3
document ingest     testidentifier1/comp1      OK   12  13
job start           1416573910333(copmtest1)        0   1

Only testidentifier1/comp2 and testidentifier1/comp3 exist in the output connection after
the job is finished."
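
To make the reported outcome easier to follow, here is a minimal sketch that replays the five document events from the history above and computes which component URIs would remain in the output connection. It assumes the history is listed newest first (the "job end" line is on top and "job start" is at the bottom), so the events are reversed before being applied. The class HistoryReplay and its members are hypothetical names used only for illustration; nothing here is part of ManifoldCF.

```java
import java.util.Arrays;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not ManifoldCF code: replays the reported history.
class HistoryReplay {

    // The five document events from the report, in the order listed there
    // (assumed to be newest first).
    static final String[][] REPORTED_EVENTS = {
        {"document ingest",   "testidentifier1/comp3"},
        {"document deletion", "testidentifier1/comp1"},
        {"document ingest",   "testidentifier1/comp2"},
        {"document deletion", "testidentifier1/comp1"},
        {"document ingest",   "testidentifier1/comp1"},
    };

    // Applies the events oldest first and returns the URIs still indexed.
    static Set<String> replay(String[][] newestFirst) {
        List<String[]> chronological = new ArrayList<>(Arrays.asList(newestFirst));
        Collections.reverse(chronological);

        Set<String> indexed = new LinkedHashSet<>();
        for (String[] event : chronological) {
            String activity = event[0];
            String uri = event[1];
            if (activity.equals("document ingest")) {
                indexed.add(uri);
            } else if (activity.equals("document deletion")) {
                indexed.remove(uri); // no-op if the URI is already gone
            }
        }
        return indexed;
    }

    public static void main(String[] args) {
        // prints [testidentifier1/comp2, testidentifier1/comp3]
        System.out.println(replay(REPORTED_EVENTS));
    }
}
```

Under that ordering assumption, comp1 is ingested first and then deleted before comp2 is ingested, which matches the user's observation that only comp2 and comp3 survive. Had no deletions been issued, all three components would remain.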

This message was sent by Atlassian JIRA
