manifoldcf-dev mailing list archives

From "Luis Cabaceira (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-1464) Improve S3 Repository Connector and Documentation
Date Wed, 04 Oct 2017 14:04:00 GMT
Luis Cabaceira created CONNECTORS-1464:
------------------------------------------

             Summary: Improve S3 Repository Connector and Documentation
                 Key: CONNECTORS-1464
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1464
             Project: ManifoldCF
          Issue Type: Improvement
          Components: Alfresco BFSI Output Connector
            Reporter: Luis Cabaceira
            Assignee: Luis Cabaceira
             Fix For: ManifoldCF next


The existing Amazon S3 repository connector currently has no documentation, and it has
not been touched by a developer in quite a while.  This issue tracks the work to document,
test, and improve the existing S3 connector.

While developing and testing the Alfresco BFSI output connector, we need to verify which
connectors we can support. Crawling and processing documents from an S3 bucket is a feature
that will be in high demand for several migration use cases.

While testing the existing S3 repository connector, I got the connection working with my AWS
account and key/secret pair. When I crawl my S3 bucket, I get these attributes on each
document object that is crawled:

Accept-Ranges value : [bytes]
ETag value : [32d56f4d707362c60bc18b5c62ac0c7b]
Last-Modified value : [Thu Apr 20 11:44:26 WEST 2017]
Content-Length value : [10955]
Content-Type value : [application/x-zip]
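
For anyone who wants to compare crawl output across runs, here is a minimal sketch that turns an attribute dump like the one above into a dictionary. It assumes every line has exactly the `Name value : [content]` shape shown here; `parse_attributes` is a hypothetical helper for illustration and not part of the connector.

```python
import re
from typing import Dict

# Matches lines such as "ETag value : [32d5...]" (format assumed from the dump above).
_ATTR_LINE = re.compile(r"^(?P<name>[\w-]+) value : \[(?P<value>.*)\]$")


def parse_attributes(dump: str) -> Dict[str, str]:
    """Parse 'Name value : [content]' lines into a name -> content mapping."""
    attrs: Dict[str, str] = {}
    for line in dump.splitlines():
        match = _ATTR_LINE.match(line.strip())
        if match:  # silently skip lines that do not fit the assumed shape
            attrs[match.group("name")] = match.group("value")
    return attrs


SAMPLE = """\
Accept-Ranges value : [bytes]
ETag value : [32d56f4d707362c60bc18b5c62ac0c7b]
Last-Modified value : [Thu Apr 20 11:44:26 WEST 2017]
Content-Length value : [10955]
Content-Type value : [application/x-zip]
"""

attributes = parse_attributes(SAMPLE)
```

All values come back as strings, so a caller that needs the size would convert `attributes["Content-Length"]` with `int(...)`.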

*Description of the attributes :*

ETag (entity tag - Type: String) is a hash of the object. The ETag reflects changes only to
the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the
object data. Whether or not it is depends on how the object was created and how it is encrypted
as described below:

Objects that are created by the PUT Object, POST Object, or Copy operation, or through the
AWS Management Console, and that are either plaintext or encrypted with SSE-S3, have ETags
that are an MD5 digest of their object data.

Objects that are created by the PUT Object, POST Object, or Copy operation, or through the
AWS Management Console, and that are encrypted with SSE-C or SSE-KMS, have ETags that are
not an MD5 digest of their object data.

If an object is created by either the Multipart Upload or Part Copy operation, the ETag is
not an MD5 digest, regardless of the method of encryption.
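
The practical consequence of the rules above is that an ETag can confirm content integrity only when it really is an MD5 digest. A minimal sketch of such a check follows. The function names are hypothetical, and the shape test (32 hexadecimal characters) is an assumption used as a first filter: an ETag with that shape is not guaranteed to be an MD5 digest, for example when the object is encrypted with SSE-C or SSE-KMS.

```python
import hashlib
import re

# Shape of an MD5 hex digest; passing this test does NOT prove the ETag is an MD5.
_MD5_HEX = re.compile(r"^[0-9a-f]{32}$")


def normalize_etag(etag: str) -> str:
    """Strip the surrounding double quotes some clients keep, and lowercase."""
    return etag.strip().strip('"').lower()


def etag_may_be_md5(etag: str) -> bool:
    """Return True if the ETag at least has the shape of an MD5 hex digest."""
    return bool(_MD5_HEX.match(normalize_etag(etag)))


def content_matches_etag(data: bytes, etag: str) -> bool:
    """Return True only if the MD5 of `data` equals the ETag.

    A False result is inconclusive: the object may have been created by a
    multipart upload or encrypted with SSE-C / SSE-KMS, in which case the
    ETag is not an MD5 digest of the object data.
    """
    if not etag_may_be_md5(etag):
        return False
    return hashlib.md5(data).hexdigest() == normalize_etag(etag)
```

A match is strong evidence that the downloaded bytes are the stored object. A mismatch alone should not be treated as corruption, for the reasons listed above.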




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
