manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-1110) Component model doesn't seem to work properly - deletes components it shouldn't
Date Fri, 21 Nov 2014 13:06:34 GMT
Karl Wright created CONNECTORS-1110:
---------------------------------------

             Summary: Component model doesn't seem to work properly - deletes components it shouldn't
                 Key: CONNECTORS-1110
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1110
             Project: ManifoldCF
          Issue Type: Bug
          Components: Framework crawler agent
    Affects Versions: ManifoldCF 1.7.2
            Reporter: Karl Wright
            Assignee: Karl Wright
             Fix For: ManifoldCF 1.8, ManifoldCF 2.0


From a user:

"I wrote a simple TestConnector. Its processDocuments method looks like this:

{code}
    public void processDocuments(String[] documentIdentifiers,
            String[] versions, IProcessActivity activities,
            DocumentSpecification spec, boolean[] scanOnly)
            throws ManifoldCFException, ServiceInterruption {

        int i = 0;
        for (String identifier : documentIdentifiers) {

            byte[] content1 = "test content 1".getBytes();
            byte[] content2 = "test content 2".getBytes();
            byte[] content3 = "test content 3".getBytes();


            RepositoryDocument rd1 = new RepositoryDocument();
            rd1.setBinary(new ByteArrayInputStream(content1), content1.length);

            RepositoryDocument rd2 = new RepositoryDocument();
            rd2.setBinary(new ByteArrayInputStream(content2), content2.length);


            RepositoryDocument rd3 = new RepositoryDocument();
            rd3.setBinary(new ByteArrayInputStream(content3), content3.length);


            System.out.println("process " + identifier);

            try {
                activities.ingestDocumentWithException(identifier, "comp1", versions[i],
                        identifier + "/comp1", rd1);
                activities.ingestDocumentWithException(identifier, "comp2", versions[i],
                        identifier + "/comp2", rd2);
                activities.ingestDocumentWithException(identifier, "comp3", versions[i],
                        identifier + "/comp3", rd3);
            } catch (IOException e) {
                e.printStackTrace();
            }

            i++;
        }

    }
{code}

For seeding, the getDocumentIdentifiers() method returns a stream with a single document
identifier, "testidentifier1".
The full code is available at [1].

However, subsequent calls to ingestDocumentWithException result in the deletion of a
previously added component.

{code}
job end 1416573910333(copmtest1) 0      1
document ingest         testidentifier1/comp3 OK 12     8
document deletion       testidentifier1/comp1 OK 0      2
document ingest         testidentifier1/comp2 OK 12     10
document deletion       testidentifier1/comp1 OK 0      3
document ingest         testidentifier1/comp1 OK 12     13
job start       1416573910333(copmtest1) 0      1
{code}

Only testidentifier1/comp2 and testidentifier1/comp3 exist in the output connection after
the job is finished."
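
To check the reported end state against the history listing, the ingest and deletion entries can be replayed in order. The sketch below is a hypothetical standalone helper, not ManifoldCF code: the class and method names are invented for illustration, and it assumes the listing above is shown newest entry first (it starts with "job end" and finishes with "job start"), so the events are entered oldest first.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of ManifoldCF: replays history entries
// and tracks which URIs would remain in the output connection.
class ComponentLogReplay {

    // One history entry: the activity name and the URI it applies to.
    static final class Event {
        final String activity;
        final String uri;

        Event(String activity, String uri) {
            this.activity = activity;
            this.uri = uri;
        }
    }

    // Applies the events in chronological order and returns the surviving URIs.
    static Set<String> replay(List<Event> chronological) {
        Set<String> indexed = new LinkedHashSet<>();
        for (Event e : chronological) {
            if (e.activity.equals("document ingest")) {
                indexed.add(e.uri);
            } else if (e.activity.equals("document deletion")) {
                indexed.remove(e.uri);
            }
        }
        return indexed;
    }

    // The ingest/deletion entries from the report, oldest first.
    // Assumption: the history listing shows the newest entry at the top.
    static List<Event> reportedEvents() {
        List<Event> events = new ArrayList<>();
        events.add(new Event("document ingest", "testidentifier1/comp1"));
        events.add(new Event("document deletion", "testidentifier1/comp1"));
        events.add(new Event("document ingest", "testidentifier1/comp2"));
        events.add(new Event("document deletion", "testidentifier1/comp1"));
        events.add(new Event("document ingest", "testidentifier1/comp3"));
        return events;
    }

    public static void main(String[] args) {
        // Prints the URIs left after replaying the reported history.
        System.out.println(replay(reportedEvents()));
    }
}
```

Under that ordering assumption, the replay leaves only comp2 and comp3, which matches what the user saw in the output connection. The expected result would have been all three components, since each ingest targets a distinct component of the same document.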




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
